Channel: Raspberry Pi Forums

Troubleshooting • Weird partial system hang / zombie state - Unresponsive

Hi,

I’ve experienced a strange issue with my setup and I can’t seem to wrap my head around it.

I’m running a Raspberry Pi 5 (8GB) with a Pimoroni NVMe Base and a Lexar NM710 M.2 500GB SSD.

I noticed that my services, including Pi-hole, became unresponsive, causing issues on my local network. When I investigated:
  - The Pi was listening for HTTP requests but returned error responses.
  - WireGuard continued routing traffic without major problems.
  - The Pi responded to pings.
  - SSH connections were completely failing.
  - The only way to regain control was a manual hard physical reset.
Looking back at the logs I could recover, I noticed:
  - Around 2:00 AM, CPU usage, memory usage, and disk I/O spiked significantly for about 4 minutes before returning to normal.
  - About 30 minutes later, the system entered the unresponsive state described above.
  - It may have performed a few internal restarts; about 17 minutes later, I managed to retrieve ~30 seconds of monitoring data before it froze again.
  - The system remained unresponsive until I physically returned 2 days later and performed the hard reset.
Here are some technical details I managed to gather:
  - Noticeable spike of NFS client RPC around 2:00 AM.
  - Very high CPU usage from Plex at the same time.
  - SoftIRQ spikes in NET_RX and RCU, along with disk backlog.
  - Some network traffic (~30 Mbit) was received, but nothing huge.
  - Disk read I/O spike.
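For anyone who wants to watch the same SoftIRQ counters before the next hang, here is a minimal sketch, assuming the standard Linux `/proc/softirqs` layout (a CPU header row followed by `NAME: count count ...` rows). The function names `parse_softirqs`, `softirq_delta`, and `read_softirqs` are made up for illustration; this is not what my monitoring stack actually runs.

```python
from pathlib import Path


def parse_softirqs(text):
    """Return {softirq_name: total_count_across_cpus} from /proc/softirqs text."""
    totals = {}
    for line in text.splitlines():
        name, sep, counts = line.partition(":")
        # The CPU header row has no colon; skip it and any blank lines.
        if not sep or not counts.strip():
            continue
        totals[name.strip()] = sum(int(c) for c in counts.split())
    return totals


def softirq_delta(before, after, names=("NET_RX", "RCU")):
    """Growth of the selected softirq totals between two snapshots."""
    return {n: after.get(n, 0) - before.get(n, 0) for n in names}


def read_softirqs(path="/proc/softirqs"):
    """Take one snapshot from the live system (Linux only)."""
    return parse_softirqs(Path(path).read_text())
```

Taking two `read_softirqs()` snapshots a few seconds apart and passing them to `softirq_delta` gives a rough rate for NET_RX and RCU. Appending those numbers to a file on a different disk or host might help if the NVMe drive itself is what stalls.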
The journal logs are sparse, but relevant entries include:

Code:

Failed to start nvmf-autoconnect.service - Connect NVMe-oF subsystems automatically during boot.
nvme nvme0: missing or invalid SUBNQN field.
nvme nvme0: failed to allocate host memory buffer.
Additional context:

No one was actively using Plex at 2:00 AM. The logs suggest it was performing a scan, which it has done multiple times over months without issues.

The Plex library is located on a Synology DS211j NAS, which isn’t very fast but has been sufficient for normal browsing and playback.

This setup has been running reliably for ~10 months prior to this incident.

Any thoughts or similar experiences would be greatly appreciated.

Thanks.

Statistics: Posted by DepstR — Fri Sep 12, 2025 9:47 pm


