Device Health Monitoring

The Device Health panel shows how your Raspberry Pi is performing. It reports CPU temperature, CPU usage, load average, memory usage, disk space, network latency, and system uptime, all refreshed automatically every five minutes. If any metric crosses a warning or critical threshold, the panel highlights it so you can act before something goes wrong.

You do not need to configure anything to use this feature. The health publisher runs automatically on every LoopString device and sends data to your dashboard as long as the Pi is online.

How to Find the Health Panel

The Device Health panel appears in the main dashboard, alongside your sensor and actuator cards. It is labeled "Device Health" and shows a small status dot in the top-right corner. When all metrics are within normal ranges, the dot pulses green. It turns amber for warnings and red for critical alerts.

If the panel is not visible, the Pi may be offline or the health publisher may not have sent its first update yet. Wait a few minutes after the device comes online and the panel will appear automatically.

Understanding the Metrics

CPU Temperature

This is the core temperature of the Raspberry Pi's processor, measured in degrees Celsius. The Pi reads this directly from the hardware thermal sensor. Normal operating temperatures for a Raspberry Pi 4 are typically in the 40–65 °C range at moderate load. Running consistently above 70 °C indicates the Pi may need better airflow or a heatsink.

Warning threshold: 70 °C
Critical threshold: 80 °C

At 80 °C the Pi's processor will begin to throttle itself to reduce heat, which can slow down your Node-RED flows and affect sensor reading reliability.

CPU Usage

This shows what percentage of the processor is currently in use, averaged over the time since the last health update. Light IoT workloads with a few sensors and a PID controller typically stay well under 30%. If you are running many Node-RED flows, protocol bridges, or video processing, usage can climb higher.

Warning threshold: 90%
Critical threshold: 95%

Sustained CPU usage above 90% can cause sensor readings to arrive late and control loops to miss their timing windows.
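As an illustration of how a usage percentage averaged between two updates can be derived, the sketch below compares two samples of the aggregate "cpu" line from /proc/stat. It is not taken from the LoopString health publisher; the function names are hypothetical, and treating iowait as idle time is an assumption.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat.

    Field order on Linux: user nice system idle iowait irq softirq steal ...
    """
    fields = [int(x) for x in line.split()[1:]]
    # Assumption: count iowait as idle time when it is present.
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
    # Guest fields are already included in user/nice, so stop at steal.
    total = sum(fields[:8])
    return idle, total


def cpu_usage_percent(previous: str, current: str) -> float:
    """Average CPU usage between two samples of the 'cpu' line."""
    prev_idle, prev_total = parse_cpu_line(previous)
    cur_idle, cur_total = parse_cpu_line(current)
    total_delta = cur_total - prev_total
    if total_delta <= 0:
        return 0.0  # no time has passed, or the counters were reset
    busy_delta = total_delta - (cur_idle - prev_idle)
    return 100.0 * busy_delta / total_delta
```

Because the result is an average over the whole interval, a short burst of heavy work can be diluted by several quiet minutes around it.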

Memory

The panel shows memory usage as a percentage, with the raw used and total figures shown below the gauges (for example, "420 / 1024 MB"). Memory used includes the operating system, Node-RED, all active flows, and any protocol bridge processes you have installed.

Warning threshold: 85% used
Critical threshold: 95% used

If memory climbs above 95%, the Pi's operating system may start terminating processes to free space. Node-RED is typically the first large process affected.
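If you want to compare the panel's memory figure against the Pi itself, the following minimal sketch computes a used percentage from the text of /proc/meminfo. Defining "used" as MemTotal minus MemAvailable is an assumption; the health publisher may use a different formula, and the function name is hypothetical.

```python
def memory_used_percent(meminfo_text: str) -> float:
    """Compute used memory as a percentage from /proc/meminfo contents."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])  # sizes are reported in kB
    total = values["MemTotal"]
    available = values["MemAvailable"]
    # Assumption: "used" means everything that is not available to new processes.
    return 100.0 * (total - available) / total
```

On the Pi you could call it as memory_used_percent(open("/proc/meminfo").read()).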

Disk

Disk usage tracks how full the Pi's root storage partition is. This includes the operating system, Node-RED, log files, and any data stored locally. Disk usage grows slowly under normal operation, but can fill quickly if debug logging is left on, or if a large number of historical readings are cached locally.

Warning threshold: 85% full
Critical threshold: 95% full

At 95% disk usage, Node-RED and other services can fail to write temporary files and may stop working correctly.
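A quick way to check the same figure by hand is sketched below using Python's standard shutil.disk_usage. The function name is hypothetical, and the result can differ by a few points from what the panel or df reports, since tools treat reserved filesystem blocks differently.

```python
import shutil


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0  # guard against pseudo-filesystems that report no size
    return 100.0 * usage.used / usage.total
```

Calling disk_used_percent("/") on the Pi gives the fill level of the root partition, which is the partition the Disk gauge tracks.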

Network Latency

Network latency measures how long it takes for data written by the Pi to reach Firebase and be confirmed. It is measured as the elapsed time across the full health-update cycle, so it reflects not just raw internet speed but also the combined overhead of the Pi's network stack and Firebase response time.

Warning threshold: 2,000 milliseconds (2 seconds)

High latency means commands you send from the dashboard will take longer to reach the Pi, and sensor readings will arrive later than expected. If you see consistently high latency, check the Pi's internet connection or Tailscale tunnel status.

Load Average

The load average shows the average number of processes that were running or waiting to run over the last minute, expressed as a decimal number. On a single-core Pi, a load average above 1.0 means the CPU is fully occupied and tasks are queuing. On a quad-core Pi 4, values below 4.0 are generally healthy.

Load average does not have a color-coded threshold in the current release — it is shown as informational context alongside the CPU usage gauge.
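Since a healthy load average depends on the number of cores, it can help to divide the one-minute figure by the core count. The sketch below does that with os.getloadavg and os.cpu_count from the standard library; the helper names are hypothetical and this normalization is not something the panel itself displays.

```python
import os


def load_per_core(load_1min: float, cores: int) -> float:
    """Normalize a load average so that 1.0 means every core is busy."""
    return load_1min / max(cores, 1)


def current_load_per_core() -> float:
    """Read the live one-minute load average and normalize it (Unix only)."""
    load_1min, _, _ = os.getloadavg()
    return load_per_core(load_1min, os.cpu_count() or 1)
```

A load average of 2.0 gives 0.5 per core on a quad-core Pi 4, but 2.0 per core on a single-core Pi Zero, where tasks would be queuing.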

Uptime

Uptime shows how long the Pi has been running since its last reboot, formatted in days, hours, and minutes. A sudden drop in uptime usually means the Pi restarted unexpectedly, which can be caused by a power interruption, a kernel update, or a system crash.
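For reference, Linux exposes uptime as a number of seconds in /proc/uptime. The following sketch converts that value into days, hours, and minutes; the exact display format used by the panel is not documented here, so the "1d 2h 3m" style and the function names are assumptions.

```python
def format_uptime(seconds: float) -> str:
    """Format an uptime in seconds as days, hours, and minutes."""
    total_minutes = int(seconds) // 60
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d {hours}h {minutes}m"


def read_uptime_seconds(path: str = "/proc/uptime") -> float:
    """Read the first field of /proc/uptime (seconds since boot)."""
    with open(path) as f:
        return float(f.read().split()[0])
```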

Node-RED Status

A status badge near the bottom of the panel shows whether the Node-RED process is running. If Node-RED is down, all sensor publishing, actuator control, and PID logic on that device will stop. This is one of the most critical indicators to watch.

Tailscale Status

A second status badge shows whether the Tailscale secure tunnel is active. If Tailscale goes offline, remote access to the Pi via the dashboard is still possible through Firebase, but direct SSH and local network management will not work. Flow deployments, OTA updates, and some advanced configuration tasks require Tailscale to be online.

How Thresholds and Alerts Work

Threshold values are fixed defaults built into LoopString. When a health metric crosses a warning or critical threshold, a Cloud Function records a health alert event in your device history. These events appear in your notification inbox with a bell icon, like any other alert.

Health alerts have a 15-minute cooldown per metric. If CPU temperature stays above the warning threshold for an extended period, the system will record one alert every 15 minutes rather than flooding your inbox. Health alerts are informational only — they do not trigger SMS notifications, unlike sensor alarms.
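The per-metric cooldown can be pictured with the short sketch below. It is an illustration only, not the Cloud Function's actual code: the function name, the metric keys, and the in-memory dictionary used to remember the last alert time are all assumptions.

```python
from typing import Dict

COOLDOWN_SECONDS = 15 * 60


def should_record_alert(metric: str, now: float,
                        last_alert_at: Dict[str, float],
                        cooldown: float = COOLDOWN_SECONDS) -> bool:
    """Return True if an alert for `metric` may be recorded at time `now`.

    `last_alert_at` maps each metric to the time of its last recorded alert
    and is updated in place whenever an alert is allowed through.
    """
    last = last_alert_at.get(metric)
    if last is not None and now - last < cooldown:
        return False  # still cooling down for this metric
    last_alert_at[metric] = now
    return True
```

With snapshots arriving every 300 seconds, a metric that stays above its threshold is recorded at 0 s, skipped at 300 s and 600 s, and recorded again at 900 s. Other metrics keep their own independent timers.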

The border color at the top of the Device Health card reflects the worst current status across all metrics: green for normal, amber for warning, red for critical.
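To make the "worst current status" rule concrete, here is a minimal sketch that classifies each metric against the thresholds listed in this article and then picks the most severe result. The metric key names are hypothetical, and treating a value exactly at a threshold as a breach is an assumption.

```python
from typing import Dict, Optional, Tuple

# (warning, critical) per metric; network latency has a warning level only.
THRESHOLDS: Dict[str, Tuple[float, Optional[float]]] = {
    "cpu_temp_c": (70.0, 80.0),
    "cpu_percent": (90.0, 95.0),
    "memory_percent": (85.0, 95.0),
    "disk_percent": (85.0, 95.0),
    "latency_ms": (2000.0, None),
}
SEVERITY = {"normal": 0, "warning": 1, "critical": 2}


def classify(metric: str, value: float) -> str:
    """Return 'normal', 'warning', or 'critical' for one metric value."""
    warning, critical = THRESHOLDS[metric]
    if critical is not None and value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "normal"


def worst_status(snapshot: Dict[str, float]) -> str:
    """Return the most severe status across all thresholded metrics."""
    statuses = [classify(m, v) for m, v in snapshot.items() if m in THRESHOLDS]
    return max(statuses, key=SEVERITY.__getitem__, default="normal")
```

For example, a snapshot with a cool CPU, memory at 86%, and disk at 96% would produce a red border, because the single critical disk reading outranks the memory warning.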

Use Cases

Preventing Thermal Shutdown in a Warm Environment

If you are running a LoopString device inside a grow tent, server closet, or any enclosed space that heats up seasonally, the CPU Temperature gauge gives you early warning before the Pi throttles or shuts down. Setting a reminder to check the health panel during summer months — or watching for warning-level amber in your notification inbox — lets you add a small fan or relocate the Pi before it affects your plants, fermentation batch, or environmental control system.

Catching Disk Fill Before It Breaks Node-RED

Over time, log files, cached flow data, and debug output accumulate on the Pi's SD card. On a 16 GB card, this can take months before it becomes a problem, but on an 8 GB card it can happen faster than expected. The Disk gauge shows you exactly how close you are, so you can log in via SSH and clear old logs before disk space causes Node-RED to fail mid-cycle. The warning threshold at 85% gives you a comfortable buffer to act before hitting the critical 95% point where services begin to malfunction.

Troubleshooting

The Health Panel Is Missing

If the Device Health panel does not appear at all on your dashboard, the most likely cause is that the Pi has not yet sent its first health snapshot. The health publisher waits five seconds after Node-RED starts before sending its first reading, then updates every five minutes. If the panel is still absent after ten minutes:

Check that the Pi is online by looking at the device status indicator at the top of the dashboard. If the device shows as offline, the Pi is not connected to Firebase and health data cannot arrive.

If the device shows as online but the health panel is still missing, try deploying your flows again from the Configurator. The health publisher subflow may not have been included in an earlier deployment. After a successful deployment, the panel should appear within five minutes.

CPU Temperature Reads as Zero

A CPU temperature of exactly 0 °C is almost always a reading failure, not a real temperature. This can happen when the thermal sensor file is momentarily unavailable on the Pi. The next health update in five minutes should resolve it. If zero persists across multiple updates, check that the Pi's operating system is not reporting any hardware errors. You can verify the sensor yourself by running the following on the Pi:

cat /sys/class/thermal/thermal_zone0/temp

The output is the temperature in thousandths of a degree Celsius, so a value like 45000 represents 45.0 °C.
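If you prefer to script the check, the sketch below reads the same file and converts the raw value to degrees Celsius. The function names are hypothetical, and treating a reading of exactly zero as a failed read mirrors the guidance in this section rather than any documented publisher behavior.

```python
from pathlib import Path
from typing import Optional

THERMAL_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> Optional[float]:
    """Convert a raw sysfs reading such as '45000' to degrees Celsius.

    Returns None for empty, non-numeric, or exactly-zero readings,
    which this sketch treats as failed reads.
    """
    try:
        millidegrees = int(raw.strip())
    except ValueError:
        return None
    if millidegrees == 0:
        return None
    return millidegrees / 1000.0


def read_cpu_temp(path: Path = THERMAL_PATH) -> Optional[float]:
    """Read the CPU temperature, or None if the sensor file is unavailable."""
    try:
        return parse_millidegrees(path.read_text())
    except OSError:
        return None
```

Returning None instead of 0 keeps a momentary read failure from being mistaken for a real temperature.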

Network Latency Is Very High or Shows an Unexpected Value

Network latency in the health panel is calculated as the time elapsed across one complete health collection run (the run that happens every five minutes), not as a dedicated ping measurement. This means the value reflects the combined time of all the system reads (CPU stats, memory, disk, Tailscale check) plus the Firebase write roundtrip. On a slow Pi under load, this can read higher than actual network latency. If you see consistently high values (above 2,000 ms) alongside high CPU usage, reducing the number of active flows may bring both figures down.

If latency is high but CPU usage is normal, check the Pi's internet connection. A weak Wi-Fi signal or a congested home network can add hundreds of milliseconds to each Firebase write.
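The reason local load inflates the latency figure is easier to see in code. The sketch below times an entire collection run, including the local reads and the final write; it is a simplified model, and the function and parameter names are hypothetical rather than part of the health publisher.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def timed_cycle(collect_steps: Iterable[Callable[[], Dict[str, float]]],
                publish: Callable[[Dict[str, float]], None]
                ) -> Tuple[Dict[str, float], float]:
    """Run every collection step, publish the result, and time the whole run.

    The returned elapsed time (in milliseconds) covers the local reads as
    well as the publish call, so slow local steps increase it even when the
    network itself is fast.
    """
    start = time.monotonic()
    snapshot: Dict[str, float] = {}
    for step in collect_steps:
        snapshot.update(step())
    publish(snapshot)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return snapshot, elapsed_ms
```

A dedicated ping would time only the network roundtrip; a measurement shaped like this one cannot separate the two, which is why it helps to look at CPU usage alongside latency.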

Health Alerts Keep Appearing Even After the Issue Is Resolved

Health alerts have a 15-minute cooldown per metric. If a metric briefly dips below a threshold and then rises back above it, a new alert is recorded when a later health snapshot finds it above the threshold again and the cooldown has passed. This is by design: it ensures you are notified whenever a metric is breaching, not just on the first occurrence. Once the underlying condition is genuinely resolved (for example, the Pi cools down after you improve airflow), alerts will stop appearing naturally.

If you are seeing repeated health alerts for a metric that appears normal on the dashboard, there may be a brief spike happening during the measurement window that is not visible in the five-minute snapshot. This is most common with CPU usage on devices running periodic heavy tasks.

Known Issues and Limitations

Health thresholds are not user-configurable in the current release. The warning and critical values for each metric are fixed defaults. If your deployment environment runs warmer than typical (for example, an industrial enclosure without active cooling), you may receive more frequent CPU temperature warnings than you need. A future release will allow per-device threshold customization.

The health publisher updates every five minutes, which means the panel reflects conditions as they were up to five minutes ago. For rapid diagnostic work — for example, when actively debugging a Node-RED flow that is causing high CPU — the five-minute update interval may not be frequent enough to track changes in real time. You can connect directly to the Pi via SSH for more frequent monitoring in those situations.

On Raspberry Pi Zero and Zero 2 W devices, CPU usage and load average can read higher than on Pi 3 or Pi 4 hardware running the same flows. The original Zero has a single, slower CPU core, and the Zero 2 W, although quad-core, runs at a lower clock speed. The default thresholds were tuned for Pi 3 and Pi 4. You may see more frequent warning-level CPU alerts on Zero hardware, even under light load.
