On Mon, Apr 10, 2023 at 21:59:24 +0200, Shawn Michaels wrote:
Hi,
On 9 April 2023 01:55:34 CEST, Luna Celeste <luna@unixpoet.dev> wrote:
it's been randomly hanging.
Long shot, but I had random freezes on a machine years ago and it turned out that I forgot to install the intel-ucode package [1]. EDIT: I realize now that you're running an AMD CPU.
Also, things that may help you track this down:
- monitor /proc/interrupts when it freezes
This is a 16-core processor, and there's too much output to view all at once on my 27" display; suggestions?
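One way to make /proc/interrupts manageable on a many-core machine is to sum each IRQ's per-CPU counts and print only the rows that changed between two snapshots. The following is a minimal Python sketch of that idea, not something posted in this thread: the function names (parse_interrupts, changed, watch), the 5-second interval, and the row limit are all illustrative assumptions.

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_label: (total_count, description)}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        # The first columns are per-CPU counters; stop at the first non-number.
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), desc)
    return table


def changed(before, after):
    """Return [(label, delta, desc)] for IRQs whose total grew, largest first."""
    rows = []
    for label, (total, desc) in after.items():
        delta = total - before.get(label, (0, ""))[0]
        if delta > 0:
            rows.append((label, delta, desc))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def watch(interval=5.0, rounds=12, top=15, path="/proc/interrupts"):
    """Print the busiest IRQs every `interval` seconds, `rounds` times."""
    with open(path) as f:
        prev = parse_interrupts(f.read())
    for _ in range(rounds):
        time.sleep(interval)
        with open(path) as f:
            cur = parse_interrupts(f.read())
        for label, delta, desc in changed(prev, cur)[:top]:
            # flush=True so the last lines are not stuck in a buffer if the box hangs
            print(f"{label:>6} +{delta:<10} {desc}", flush=True)
        print("-" * 40, flush=True)
        prev = cur
```

Calling watch() from a script whose output is redirected to a file on disk leaves a record of which interrupt sources were active shortly before a hang, which is usually easier to read afterwards than the full 16-column table.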
- remove some hardware, see if crashes disappear, add back progressively
I tried that a few months ago; as I said in an earlier message, since I don't know what's causing the issue, and since right now it's not predictable, I don't have a timeframe on how much testing is "enough".
- increase kernel verbosity (I forgot how I did this, maybe [2] will help?)
Will look into this once the memtest86+ is finished; see below.
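As a starting point for the verbosity question, the current console log settings can be read from /proc/sys/kernel/printk, which holds four integers: the console log level, the default message log level, the minimum console log level, and the boot-time default console log level. A message reaches the console only if its priority number is lower than the console log level. The sketch below just reads and interprets those values; the helper names (parse_printk, read_printk, console_shows) are made up for illustration.

```python
PRINTK_FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Turn the contents of /proc/sys/kernel/printk into a dict of four ints."""
    values = [int(v) for v in text.split()]
    if len(values) != len(PRINTK_FIELDS):
        raise ValueError(f"expected {len(PRINTK_FIELDS)} values, got {len(values)}")
    return dict(zip(PRINTK_FIELDS, values))


def read_printk(path="/proc/sys/kernel/printk"):
    """Read the live settings (Linux only)."""
    with open(path) as f:
        return parse_printk(f.read())


def console_shows(level, settings):
    """True if a message of priority `level` (0=emerg .. 7=debug) reaches the console."""
    return level < settings["console_loglevel"]
```

If console_shows(7, read_printk()) is False, debug-level messages are not being printed to the console. Raising the level at runtime with `dmesg -n 8` (as root), or booting with the `loglevel=` or `ignore_loglevel` kernel parameters, are the usual ways to change that; which of these suits this machine is left open here.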
- when and how exactly did it start?
At least a few months ago; I'm not sure of the specifics, but I don't think it was after adding any particular bit of hardware. Of course, that doesn't rule out driver bugs.
Definitely worth checking memtest86+
Keep in mind that it may take days of stress testing memory before catching an error. I once caught a faulty RAM slot on a motherboard only after 3 days of running memtest. I suggest running it for an entire week.
I'm currently doing this as it's an easy troubleshooting step. So far I'm at pass 5 and have run for 13.5h with no errors. I'll keep folks updated after the full week has passed.

On Tue, Apr 11, 2023 at 09:52:14 +0200, ihad wrote:
An AMD CPU that hangs when it's mostly idle rings a bell: try disabling the C6 state and see if the problem goes away. Some older AMD CPUs had problems waking up from deep sleep. The script [1] can do it for you:
This is a 16-core AMD Ryzen 9 3950X; does that count as older? The machine is only a few years old. I understand that's quite a while in tech, but I want to be sure.

--
Cheers,
Luna Celeste