Thanks much for following up! Responses inline.

On 2/11/22 4:15 PM, Genes Lists via arch-general wrote:
> I suppose it could also be southbridge being annoying
It's a very new machine (Rocket Lake and PCIe 4.0), so it doesn't technically use the traditional northbridge/southbridge model. But point taken that this could be an issue with some component outside of the core CPU/RAM/PCIe assembly. My best guesses so far are that this is an issue with either the memory or the network chip. I haven't been able to confirm or refute either theory yet, though.
> did you check temps on the mobo are reasonable to be sure of adequate cooling? Are you overclocking at all by chance?
Temps don't appear to be an issue. I have a cron job that monitors temps and sends an email to root when they go above around 80C. (I know it works because I see those emails from time to time.) But there were no temp warning emails just before any of the crashes (3 or 4 so far).
> sorry i know this kind of thing is frustrating to deal with.
Big time. I've gotten very good over the years at diagnosing and fixing issues using log messages. But sudden catastrophic crashes like this that don't leave any trace in the logs/journal are *really* hard to pin down.

Thanks,
DR