Hi,

On 14 April 2023 04:06:56 CEST, Luna Celeste <luna@unixpoet.dev> wrote:
On Wed, Apr 12, 2023 at 11:02:46 +0200, Shawn Michaels wrote:
On 11 April 2023 17:49:33 CEST, Luna Celeste <luna@unixpoet.dev> wrote:
it's been randomly hanging.
Also, things that may help you track this down: - monitor /proc/interrupts when it freezes
This is a 16 core processor and there's too much output on my 27" display to view it all at once; suggestions?
I would try to run something like this in the background: watch -n 1 "cat /proc/interrupts >> ~/watch.log && sync"
(I did not check that the command works as expected but you get the intention).
Once a crash is caught, analyze the produced logs. Perhaps you can monitor other files from sysfs/debugfs as well.
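For the analysis step, and for the problem of there being too much output to read at once, a rough sketch like the following might help. It assumes the log is a plain concatenation of /proc/interrupts snapshots (as the watch command above would produce), each starting with the usual "CPU0 CPU1 ..." header line. The function name last_changed_irqs is made up for illustration. It prints only the lines of the last snapshot that differ from the snapshot before it, which should be far fewer than the full table.

```sh
#!/bin/sh
# last_changed_irqs LOGFILE  (hypothetical helper, not an existing tool)
# LOGFILE is assumed to hold concatenated /proc/interrupts snapshots.
# Prints the lines of the last snapshot that differ from the one before it.
last_changed_irqs() {
    awk '
        /^ *CPU0/ {                  # header line: a new snapshot begins
            split("", prev)          # portable way to empty an array
            for (k in cur) prev[k] = cur[k]
            split("", cur)
            n = 0
            next
        }
        NF > 0 {                     # IRQ line, keyed by its first field ("42:", "NMI:", ...)
            order[++n] = $1          # remember the original line order
            cur[$1] = $0
        }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                if (prev[k] != cur[k]) print cur[k]
            }
        }
    ' "$1"
}
```

Usage would be something like `last_changed_irqs ~/watch.log` after a reboot. If the log contains only one snapshot, every line is printed, since there is nothing to compare against.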
This is a good strategy, thank you! I'm a little worried about disk wear, though; maybe that's just human bias?
If you're worried about that, you can store the logs on an external USB dongle. Or you could even remotely SSH into the box and use something like "script" in order to log the SSH session into a file on the remote machine. That way, you wouldn't need to sync every second.
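As an untested sketch of the remote variant: the helper below runs an arbitrary command a fixed number of times and appends its timestamped output to a log file on the machine where the helper itself runs. The name poll_to_log, its argument order, and the host name in the usage note are all made up for illustration.

```sh
#!/bin/sh
# poll_to_log COUNT INTERVAL LOGFILE CMD [ARGS...]  (hypothetical helper)
# Runs CMD COUNT times, INTERVAL seconds apart, and appends a timestamp
# line followed by CMD's output (stdout and stderr) to LOGFILE.
poll_to_log() {
    count=$1
    interval=$2
    logfile=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
            "$@" 2>&1
        } >> "$logfile"
        i=$((i + 1))
        # Sleep between runs, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Run from a second machine, this could look like `poll_to_log 3600 1 ~/interrupts.log ssh user@box cat /proc/interrupts`, where user@box stands for the affected host. That opens a new SSH connection for every sample, so a larger interval may be more sensible. The last timestamp in the log then roughly marks when the box stopped responding, and nothing is written to its own disk.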
Another thing that comes to mind: perhaps your system is still running, albeit very slowly. I see that you're running libvirt. I've had a problem like this on my host: for more than a year, it would randomly and rarely "freeze" (become astonishingly slow) when starting a VM (Windows guest with multiple passthroughs). I tried to debug this by increasing journald/kernel log levels, but the issue appears to have vanished lately. I just assumed that it was fixed upstream, but perhaps it's still there.
Most of the time the VMs aren't actually running when the machine freezes / hangs; also, the last time it froze, the display was still active, and the clock hadn't advanced for something like 6-10 hours, matching the time when the mosh session lost its connection. So I don't think this is the cause.
This may still be the case. If you only get, e.g., a couple of system ticks every minute, you may not see a minute pass on the clock for a very long time.
Unrelated, would you please check your mail client? When you reply, I get a copy in my main inbox and in the folder for the mailing list, despite setting both Mail-Followup-To and Reply-To headers. Something seems to be acting strangely.
Sorry about that. I'm not used to mailing lists. I had a quick look through the settings and couldn't find anything related. Maybe it happened because I replied to an "old" mail from the middle of the thread? I'm using K-9 Mail on Android. If somebody has an idea, don't hesitate to chime in.