
Error Handling

This diagram shows how the device handles errors and recovers from failures.

flowchart TD
    Operation[Normal Operation] --> Error{Error<br/>Detected?}
    Error -->|No| Operation
    Error -->|Yes| Classify{Error Type?}

    Classify -->|Sensor Error| SensorFail[Log Sensor Failure]
    Classify -->|Network Error| NetworkFail[Log Network Failure]
    Classify -->|Config Error| ConfigFail[Log Config Failure]
    Classify -->|Critical Error| CriticalFail[Critical Failure]

    SensorFail --> SkipReading[Skip This Reading]
    SkipReading --> Schedule[Schedule Next Attempt]
    Schedule --> Operation

    NetworkFail --> RetryJoin{Join Retry<br/>Count?}
    RetryJoin -->|< Max| WaitRetry[Wait & Retry Join]
    RetryJoin -->|≥ Max| ContinueLocal[Continue Without Network]
    WaitRetry --> Operation
    ContinueLocal --> Operation

    ConfigFail --> UseDefault[Load Default Config]
    UseDefault --> Operation

    CriticalFail --> EnableWatchdog[Enable Watchdog]
    EnableWatchdog --> InfiniteLoop[Enter Infinite Loop]
    InfiniteLoop --> WatchdogReset([Watchdog Timeout<br/>Device Resets])
    WatchdogReset --> Boot([Boot Sequence])
    Boot --> Operation

    style Error fill:#fff4e1
    style CriticalFail fill:#ffe1e1
    style WatchdogReset fill:#ffe1e1
    style Operation fill:#e1ffe1
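
The classification step in the flowchart above can be sketched as a small dispatch function. This is only an illustration: the enum names, `choose_recovery`, and `MAX_JOIN_RETRIES` are hypothetical and do not come from the device firmware; the limit of 10 is taken from the "max 10 attempts" entry for Join Failure.

```c
/* Hypothetical names: the source shows only the flowchart, not the firmware API. */
typedef enum {
    ERROR_SENSOR,
    ERROR_NETWORK,
    ERROR_CONFIG,
    ERROR_CRITICAL
} error_type_t;

typedef enum {
    ACTION_SKIP_READING,    /* sensor error: skip this reading, schedule next attempt */
    ACTION_RETRY_JOIN,      /* network error with retries left: wait and retry join */
    ACTION_CONTINUE_LOCAL,  /* network error, retries exhausted: continue without network */
    ACTION_LOAD_DEFAULTS,   /* config error: load default config */
    ACTION_WATCHDOG_RESET   /* critical error: enable watchdog and spin */
} recovery_action_t;

#define MAX_JOIN_RETRIES 10u  /* assumed to match "max 10 attempts" */

/* Map an error type (plus the current join retry count) to a recovery action. */
recovery_action_t choose_recovery(error_type_t type, unsigned join_retry_count)
{
    switch (type) {
    case ERROR_SENSOR:
        return ACTION_SKIP_READING;
    case ERROR_NETWORK:
        return (join_retry_count < MAX_JOIN_RETRIES) ? ACTION_RETRY_JOIN
                                                     : ACTION_CONTINUE_LOCAL;
    case ERROR_CONFIG:
        return ACTION_LOAD_DEFAULTS;
    case ERROR_CRITICAL:
    default:
        /* Unknown error types are treated as critical in this sketch. */
        return ACTION_WATCHDOG_RESET;
    }
}
```

Every branch except the critical one leads back to normal operation, which is why only `ACTION_WATCHDOG_RESET` implies a reboot.
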
mindmap
  root((Errors))
    I2C Errors
      Slave NACK
      Arbitration Lost
      No Alert Response
      Bus Error
    Network Errors
      Join Failure
      TX Failure
      Link Dead
    Config Errors
      Invalid Magic
      Checksum Error
      Out of Range
    Critical Errors
      Hardware Fault
      Memory Corruption
Error Type | Detection | Recovery | Impact
-----------|-----------|----------|-------
Sensor NACK | I2C transaction fails | Skip measurement, retry next cycle | One data point lost
Sensor Timeout | No response after trigger | Skip measurement, retry next cycle | One data point lost
Join Failure | OTAA join times out | Retry with backoff, max 10 attempts | Delayed operation
TX Failure | Transmission fails | Retry on next cycle | One transmission lost
Link Dead | No response from network | Continue measuring, attempt rejoin | Data buffering needed
Config Invalid | Magic bytes wrong | Use default settings | Settings lost
Memory Corrupt | Read/write errors | Watchdog reset | Device restarts
Critical Fault | Unrecoverable error | Watchdog reset | Device restarts
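
As an illustration of the Config Invalid recovery path, the sketch below checks a stored configuration for the three config error kinds listed earlier (invalid magic, checksum error, out of range) and falls back to defaults. Everything concrete here is an assumption: the struct layout, the magic value `0xC0F1`, the XOR checksum, the interval limits, and all function names are hypothetical and not taken from the device firmware.

```c
#include <stdint.h>

#define CONFIG_MAGIC        0xC0F1u  /* hypothetical magic value */
#define DEFAULT_INTERVAL_S  600u     /* hypothetical default setting */
#define INTERVAL_MIN_S      10u      /* hypothetical valid range */
#define INTERVAL_MAX_S      86400u

typedef struct {
    uint16_t magic;
    uint32_t interval_s;  /* measurement interval in seconds */
    uint8_t  checksum;    /* XOR of the bytes of magic and interval_s */
} device_config_t;

typedef enum {
    CONFIG_OK,
    CONFIG_BAD_MAGIC,
    CONFIG_BAD_CHECKSUM,
    CONFIG_OUT_OF_RANGE
} config_status_t;

/* XOR checksum computed field by field, so struct padding never matters. */
uint8_t config_checksum(const device_config_t *cfg)
{
    uint8_t sum = (uint8_t)(cfg->magic & 0xFFu);
    sum ^= (uint8_t)(cfg->magic >> 8);
    for (unsigned shift = 0; shift < 32; shift += 8) {
        sum ^= (uint8_t)(cfg->interval_s >> shift);
    }
    return sum;
}

config_status_t config_validate(const device_config_t *cfg)
{
    if (cfg->magic != CONFIG_MAGIC) {
        return CONFIG_BAD_MAGIC;
    }
    if (cfg->checksum != config_checksum(cfg)) {
        return CONFIG_BAD_CHECKSUM;
    }
    if (cfg->interval_s < INTERVAL_MIN_S || cfg->interval_s > INTERVAL_MAX_S) {
        return CONFIG_OUT_OF_RANGE;
    }
    return CONFIG_OK;
}

/* Copy the stored config if valid, otherwise load defaults.
 * The status is returned so the caller can log the failure. */
config_status_t config_load_or_default(const device_config_t *stored,
                                       device_config_t *active)
{
    config_status_t status = config_validate(stored);
    if (status == CONFIG_OK) {
        *active = *stored;
    } else {
        active->magic = CONFIG_MAGIC;
        active->interval_s = DEFAULT_INTERVAL_S;
        active->checksum = config_checksum(active);
    }
    return status;
}
```

The magic check runs first because a wrong magic usually means the storage was never written, in which case the checksum and range checks carry no information.
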
sequenceDiagram
    participant App as Application
    participant WDT as Watchdog Timer
    participant MCU as Microcontroller

    Note over App,MCU: Normal Operation
    App->>WDT: Pet watchdog (reset timer)
    WDT-->>App: OK

    Note over App: Continue working...

    App->>WDT: Pet watchdog (reset timer)
    WDT-->>App: OK

    Note over App,MCU: Critical Error Occurs

    App->>App: Detect critical error
    App->>WDT: Enable watchdog (15ms)
    App->>App: Enter infinite loop

    Note over WDT: 15ms passes...

    WDT->>MCU: Trigger reset
    MCU->>MCU: System reset

    Note over App,MCU: Device reboots

    MCU->>App: Start application
    App->>App: Initialize & resume
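
The timing in the sequence above can be checked on a host machine with a simulated watchdog. This is a sketch, not the firmware: `sim_watchdog_t` and the `wdt_sim_*` functions are hypothetical names, and the 8000 ms timeout used during normal operation is an assumption (the source only gives the 15 ms value used after a critical error).

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side model of a watchdog timer (hypothetical, for illustration). */
typedef struct {
    bool     enabled;
    uint32_t timeout_ms;
    uint32_t elapsed_ms;
} sim_watchdog_t;

/* Arm the watchdog with a new timeout and restart its counter. */
void wdt_sim_enable(sim_watchdog_t *w, uint32_t timeout_ms)
{
    w->enabled = true;
    w->timeout_ms = timeout_ms;
    w->elapsed_ms = 0;
}

/* "Pet" the watchdog: restart the counter without changing the timeout. */
void wdt_sim_pet(sim_watchdog_t *w)
{
    w->elapsed_ms = 0;
}

/* Advance simulated time by ms milliseconds.
 * Returns true once the watchdog would reset the MCU. */
bool wdt_sim_tick(sim_watchdog_t *w, uint32_t ms)
{
    if (!w->enabled) {
        return false;
    }
    w->elapsed_ms += ms;
    return w->elapsed_ms >= w->timeout_ms;
}
```

In normal operation the application pets the watchdog before the timeout elapses, so `wdt_sim_tick` keeps returning false. After a critical error the application re-arms it with 15 ms and stops petting, so the next 15 ms of ticks produce a reset. On an AVR target, for instance, avr-libc's `wdt_enable(WDTO_15MS)` followed by an empty `for (;;)` loop is a common way to do this; the source does not state which MCU or API this device uses.
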

The device keeps a count and a last-occurrence timestamp for each error code, for debugging:

Error Code | Count | Last Occurrence
-----------|-------|----------------
ERR_NONE | 0 | -
SLAVE_NACK | 3 | 2024-10-30 14:23:15
ARB_LOST | 0 | -
NO_ALERT | 1 | 2024-10-30 12:45:30
SMBUS_ERR | 0 | -
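
One way to hold such a table in memory is a fixed array indexed by error code. The sketch below reuses the error code names from the table; the struct, the `err_record` function, and the use of a plain `uint32_t` timestamp (with 0 meaning "never") are assumptions made for illustration.

```c
#include <stdint.h>

/* Error codes as listed in the table; ERR_CODE_COUNT is a helper added here. */
typedef enum {
    ERR_NONE = 0,
    SLAVE_NACK,
    ARB_LOST,
    NO_ALERT,
    SMBUS_ERR,
    ERR_CODE_COUNT
} err_code_t;

typedef struct {
    uint16_t count;      /* number of occurrences, saturating at UINT16_MAX */
    uint32_t last_seen;  /* timestamp of the last occurrence, 0 = never (assumed) */
} err_stat_t;

/* Zero-initialized: every code starts with count 0 and no last occurrence. */
err_stat_t err_stats[ERR_CODE_COUNT];

/* Record one occurrence of code at time now. ERR_NONE and unknown codes are ignored. */
void err_record(err_code_t code, uint32_t now)
{
    if ((int)code <= (int)ERR_NONE || (int)code >= (int)ERR_CODE_COUNT) {
        return;
    }
    if (err_stats[code].count < UINT16_MAX) {
        err_stats[code].count++;
    }
    err_stats[code].last_seen = now;
}
```

Saturating the counter instead of letting it wrap keeps a long-running device from reporting a misleadingly small count.
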

Errors are logged to serial output when DEBUG mode is enabled:

[12345] ERROR: Sensor NACK on address 0x36
[12348] Retrying measurement in next cycle
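
A log line in this shape could be produced as follows. This is a sketch under assumptions: the bracketed number is treated as an uptime counter (its unit is not stated in the source), and `format_nack_log` and `LOG_LINE` are hypothetical names. Only the output format is taken from the example above.

```c
#include <stdint.h>
#include <stdio.h>

/* Emit log lines only when DEBUG is defined at build time (e.g. -DDEBUG). */
#ifdef DEBUG
#define LOG_LINE(line) puts(line)
#else
#define LOG_LINE(line) ((void)0)
#endif

/* Format a sensor NACK message into buf.
 * Returns the number of characters snprintf wanted to write. */
int format_nack_log(char *buf, size_t len, uint32_t uptime, uint8_t i2c_addr)
{
    return snprintf(buf, len, "[%lu] ERROR: Sensor NACK on address 0x%02X",
                    (unsigned long)uptime, (unsigned)i2c_addr);
}
```

Formatting into a caller-supplied buffer keeps the function independent of the output channel, so the same line can go to a UART in firmware or to stdout in a host test.
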

Typical recovery times:

  • Sensor Error: Immediate (skip to next cycle)
  • Network Join Failure: 10-60 seconds per retry
  • Config Error: <1 second (load defaults)
  • Watchdog Reset: 2-5 seconds (full reboot)
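
The join retry timing can be sketched as a capped exponential backoff. The 10 to 60 second bounds and the limit of 10 attempts come from this page; the doubling schedule and the name `join_retry_delay_s` are assumptions, since the source does not say how the delay grows between retries.

```c
#include <stdint.h>

#define JOIN_DELAY_MIN_S   10u  /* shortest retry delay from this page */
#define JOIN_DELAY_MAX_S   60u  /* longest retry delay from this page */
#define JOIN_MAX_ATTEMPTS  10u  /* "max 10 attempts" */

/* Delay in seconds before join retry number attempt (0-based).
 * Doubles from the minimum and is capped at the maximum (assumed schedule).
 * Returns 0 once the attempts are exhausted, meaning "stop retrying". */
uint32_t join_retry_delay_s(unsigned attempt)
{
    if (attempt >= JOIN_MAX_ATTEMPTS) {
        return 0;
    }
    uint32_t delay = JOIN_DELAY_MIN_S;
    for (unsigned i = 0; i < attempt && delay < JOIN_DELAY_MAX_S; i++) {
        delay *= 2;
    }
    return (delay > JOIN_DELAY_MAX_S) ? JOIN_DELAY_MAX_S : delay;
}
```

With these constants the delays run 10, 20, 40, then 60 seconds for every remaining attempt. A return value of 0 corresponds to the "Continue Without Network" branch.
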

Best practices:

  1. Always validate sensor data before transmission
  2. Retry with backoff for transient errors
  3. Use watchdog as last resort for critical failures
  4. Log errors for debugging and maintenance
  5. Fail gracefully: continue operation when possible
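
The first practice, validating sensor data before transmission, might look like the sketch below. The range type, the function names, and the idea of a single-slot transmit buffer are all hypothetical; real limits depend on the sensor, which the source does not specify.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive range of plausible readings (sensor-specific, assumed here). */
typedef struct {
    int32_t min;
    int32_t max;
} valid_range_t;

bool reading_is_valid(int32_t value, valid_range_t range)
{
    return value >= range.min && value <= range.max;
}

/* Store the reading in tx_slot only if it passes validation.
 * On failure tx_slot is left untouched, so the reading is simply skipped
 * and the device carries on with the next cycle. */
bool queue_if_valid(int32_t value, valid_range_t range, int32_t *tx_slot)
{
    if (!reading_is_valid(value, range)) {
        return false;
    }
    *tx_slot = value;
    return true;
}
```

Rejecting an implausible value costs one data point, the same impact as a sensor NACK, but keeps bad data out of the uplink.
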