12 Feb
2022
10:58 a.m.
Hi David,

Looking at your http://darose.net/ServerCrash20220209.png, are you aware of https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt which has detail on what it means? Though it looks to me like at least one line of output has been trampled.

Also, one Google'd suggestion was a real-time program was running amok, and you did start by saying the crash occurred in a Zoom call which sounds real-time-ish. Perhaps keep logging what RT processes there are to a file, followed by a sync(1) on the file, and see if something was loading the machine before the crash. I *think* ‘rtprio’ shows what I'm after.

    ps axww -o pid,rtprio,pcpu,wchan,comm | awk '$2 != "-"'

-- 
Cheers, Ralph.
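
A minimal sketch of the logging loop suggested above, wrapping the same ps/awk pipeline. The function names (snapshot, run_logger), the log path and the interval are all hypothetical, and passing a file name to sync assumes a reasonably recent GNU coreutils; older versions fall back to a plain sync here.

```shell
#!/bin/sh
# Sketch only: snapshot and run_logger are illustrative names, not an existing tool.

# Append one timestamped list of real-time processes to the file given as $1.
snapshot() {
    {
        date '+%Y-%m-%d %H:%M:%S'
        # Same filter as in the message: keep rows whose RTPRIO column is not "-".
        # The ps header row also passes the filter, which keeps the columns labelled.
        ps axww -o pid,rtprio,pcpu,wchan,comm | awk '$2 != "-"'
        echo
    } >>"$1"
    # Flush to disk so the latest snapshot has a chance of surviving a hard crash.
    sync "$1" 2>/dev/null || sync
}

# run_logger FILE [INTERVAL_SECONDS] [COUNT]
# COUNT of 0 (the default) means keep going until interrupted.
run_logger() {
    file=$1
    interval=${2:-10}
    count=${3:-0}
    n=0
    while :; do
        snapshot "$file"
        n=$((n + 1))
        if [ "$count" -gt 0 ] && [ "$n" -ge "$count" ]; then
            break
        fi
        sleep "$interval"
    done
}
```

Sourcing this and running something like `run_logger /var/tmp/rt-procs.log 10` (path and interval are placeholders) would take a snapshot every ten seconds; after a crash, the tail of the log shows which RT processes were present and how much CPU they were using.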