[arch-general] Linux server crash causing router switch to stop working

Ralph Corderoy ralph at inputplus.co.uk
Sat Feb 12 10:58:38 UTC 2022


Hi David,

Looking at your http://darose.net/ServerCrash20220209.png, are you aware
of https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt, which has
detail on what that output means?  Though it looks to me as if at least
one line of the output has been trampled.

Also, one Google'd suggestion was that a real-time program was running
amok, and you did start by saying the crash occurred during a Zoom call,
which sounds real-time-ish.  Perhaps periodically log what RT processes
there are to a file, running sync(1) on the file after each write, and
see whether something was loading the machine before the crash.  I
*think* ps(1)'s ‘rtprio’ column shows what I'm after.

    ps axww -o pid,rtprio,pcpu,wchan,comm | awk '$2 != "-"'
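
The periodic logging could be sketched roughly as below.  This is only
an illustration, not a tested recipe for this machine: the function
names rt_snapshot and rt_log_once are made up here, and it assumes a
procps-style ps(1) and a sync(1) that accepts a file argument (it falls
back to a plain sync otherwise).

```shell
# Print a UTC timestamp followed by the processes that have a
# real-time priority (the ps header line is kept as well, since its
# second field is "RTPRIO" rather than "-").
rt_snapshot() {
    date -u '+%Y-%m-%dT%H:%M:%SZ'
    ps axww -o pid,rtprio,pcpu,wchan,comm | awk '$2 != "-"'
}

# Append one snapshot to the file named by $1, then ask the kernel to
# flush it to disk so the last entries stand a chance of surviving a crash.
rt_log_once() {
    rt_snapshot >>"$1" || return 1
    # Newer coreutils sync takes file arguments; otherwise sync everything.
    sync "$1" 2>/dev/null || sync
}
```

It might then be left running in a loop, with a log path and interval
of your choosing (both hypothetical here):

    while :; do rt_log_once /var/tmp/rtprio.log; sleep 5; done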

-- 
Cheers, Ralph.

