[arch-general] Linux server crash causing router switch to stop working
Ralph Corderoy
ralph at inputplus.co.uk
Sat Feb 12 10:58:38 UTC 2022
Hi David,
Looking at your http://darose.net/ServerCrash20220209.png, are you aware
of https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt which has
detail on what it means? Though it looks to me like at least one line
of output has been trampled.
Also, one Google'd suggestion was that a real-time program was running
amok, and you did start by saying the crash occurred during a Zoom call,
which sounds real-time-ish. Perhaps keep logging what RT processes
there are to a file, following each write with a sync(1) on the file,
and see if something was loading the machine before the crash.
I *think* ‘rtprio’ shows what I'm after.
    ps axww -o pid,rtprio,pcpu,wchan,comm | awk '$2 != "-"'
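A rough sketch of the periodic logging, in case it helps. The log path, the LOG and INTERVAL variables, and the function name log_rt_once are my own placeholders rather than anything standard, and passing a file to sync assumes a reasonably recent GNU coreutils.

```shell
#!/bin/sh
# Hypothetical log location; any path on a local disk should do.
LOG=${LOG:-/var/tmp/rt-procs.log}
# Hypothetical pause between snapshots, in seconds.
INTERVAL=${INTERVAL:-5}

# Append one timestamped snapshot of the real-time processes to $LOG.
log_rt_once() {
    {
        date '+%F %T'
        # Keep the header plus every process whose RTPRIO column is set.
        ps axww -o pid,rtprio,pcpu,wchan,comm | awk '$2 != "-"'
    } >> "$LOG"
    # Flush the file so the last entry has a chance of surviving a lockup.
    sync "$LOG"
}
```

Source that and leave `while :; do log_rt_once; sleep "$INTERVAL"; done` running in a spare terminal or under nohup. After the next crash, the tail of the log should show which RT processes were busy just beforehand.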
--
Cheers, Ralph.