[arch-general] Linux server crash causing router switch to stop working

David Rosenstrauch darose at darose.net
Fri Feb 11 22:08:52 UTC 2022


Thanks much for following up!  Responses inline.


On 2/11/22 4:15 PM, Genes Lists via arch-general wrote:
> 
> I suppose it could also be southbridge being annoying

It's a very new machine (Rocket Lake and PCIe 4.0), so it doesn't 
technically use the traditional northbridge/southbridge model.  But point 
taken that this could be an issue with some component outside of the core 
CPU/RAM/PCIe assembly.


My best guesses so far are that this is either an issue with the memory 
or with the network chip.  I haven't been able to confirm or refute 
either theory yet, though.


> did you check 
> temps on the mobo are reasonable to be sure of adequate cooling? Are you 
> overclocking at all by chance?

Temps don't appear to be an issue.  I have a cron job that monitors 
temps and emails root whenever one goes above around 80C.  (I know it 
works because I see those emails from time to time.)  But there were no 
temp warning emails just before any of the crashes (3 or 4 so far).
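
A cron-driven check along these lines could be sketched as follows.  This 
is not the actual script running on this machine, only an illustration.  
It assumes the sensors are exposed through the kernel's hwmon sysfs files 
(values in millidegrees Celsius); the names THRESHOLD_C, read_temps and 
over_threshold are made up for the example.

```python
#!/usr/bin/env python3
"""Hypothetical temperature check meant to be run from cron."""
import glob

THRESHOLD_C = 80.0  # assumed limit, matching the "around 80C" mentioned above
# hwmon sysfs sensor files; each holds an integer in millidegrees Celsius
HWMON_GLOB = "/sys/class/hwmon/hwmon*/temp*_input"


def read_temps(pattern=HWMON_GLOB):
    """Return {path: degrees C} for every readable sensor file matching pattern."""
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # some sensors fail on read; skip them
    return temps


def over_threshold(temps, limit=THRESHOLD_C):
    """Return only the sensors reading strictly above limit."""
    return {path: t for path, t in temps.items() if t > limit}


def main():
    # Print one line per hot sensor; print nothing when all is well.
    for path, temp in over_threshold(read_temps()).items():
        print(f"WARNING: {path} at {temp:.1f}C (limit {THRESHOLD_C:.0f}C)")


if __name__ == "__main__":
    main()
```

The script is silent unless a sensor is over the limit.  Cron daemons 
typically mail any output a job produces to the job's owner (given a 
working local mail setup), so running this from root's crontab would 
only generate mail when something is actually hot.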


> sorry i know this kind of thing is frustrating to deal with.

Big time.  I've gotten very good over the years at diagnosing and fixing 
issues using log messages.  But sudden catastrophic crashes like this 
that don't leave any trace in the logs/journal are *really* hard to pin 
down.

Thanks,

DR

