[arch-general] High CPU on one core, but unable to find process responsible
Carsten Mattner
carstenmattner at gmail.com
Mon Mar 12 02:59:10 UTC 2018
On 3/12/18, David Rosenstrauch <darose at darose.net> wrote:
> My server's been exhibiting some very strange behavior lately. Every
> couple of days I run into a situation where one core (core #0) on the
> quad core CPU starts continuously using around 34% of CPU, but I'm not
> able to see (using htop) any process that's responsible for using all
> that CPU. Even when I tell htop to show me kernel threads too, I still
> am not able to see the offending process. Every process remains under
> 1% CPU usage (except for occasional, small, short-lived spikes up) yet
> the CPU usage on that core remains permanently hovering at around 34%.
> The problem goes away when I reboot, but then comes back within a day
> or so.
My gut feeling is that one of the kernel worker threads is hung.
On a quad core, that would normally show up as 100% of the
affected core and 25% overall. But you say there's no load to be
found in the kernel threads, which is odd.
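
If no thread accounts for the load, it may be worth checking whether
the time is being spent in interrupt handling, which can be missing
from the per-process numbers. Here is a rough sketch, assuming a Linux
/proc/stat with the usual field order (user, nice, system, idle,
iowait, irq, softirq, steal). The function name cpu_breakdown is made
up for illustration.

```shell
#!/bin/sh
# cpu_breakdown: turn two "cpuN ..." sample lines from /proc/stat into
# per-category percentages for the interval between the samples.
# $1 = earlier sample line, $2 = later sample line
cpu_breakdown() {
    printf '%s\n%s\n' "$1" "$2" | LC_ALL=C awk '
        NR == 1 { for (i = 2; i <= 9; i++) first[i] = $i; next }
        {
            total = 0
            for (i = 2; i <= 9; i++) {
                delta[i] = $i - first[i]
                total += delta[i]
            }
            if (total == 0) total = 1   # avoid dividing by zero
            split("user nice system idle iowait irq softirq steal", name, " ")
            for (i = 2; i <= 9; i++)
                printf "%s=%.1f%%\n", name[i - 1], 100 * delta[i] / total
        }'
}

# Usage on a live system (sample core 0 over five seconds):
#   s1=$(grep '^cpu0 ' /proc/stat); sleep 5; s2=$(grep '^cpu0 ' /proc/stat)
#   cpu_breakdown "$s1" "$s2"
```

If irq or softirq turns out to be large on cpu0, /proc/interrupts
would be the next place to look. If it is mostly user or system time,
then something really is running there.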
Or if the server is accessible from the Internet, is it possible
it's rooted and someone's running a hidden process? To confirm
this isn't the case, cut off Internet access and let it run for
two days.
I don't think there are any official hidden processes that do not
show up in htop or top since that would make them seem like rootkits.
That means if the guilty process is really invisible, then it's
definitely unusual.
It's scary to consider a rootkit, but if that's the case, then
it's best to be aware as soon as possible. I hope this is not the
case for you; I wouldn't wish it on my worst enemy.
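
A crude first check for a hidden process is to compare the PIDs
listed under /proc with the PIDs that ps reports. This is only a
sketch: it would catch a tampered ps at best, and a kernel-level
rootkit can hide from both views. The helper name pids_only_in_first
is made up for illustration.

```shell
#!/bin/sh
# pids_only_in_first: print the PIDs that appear in list $1 but not in
# list $2. Each argument is a whitespace-separated list of PIDs.
pids_only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    # $1 and $2 are left unquoted on purpose so the shell splits them
    # into one PID per line.
    printf '%s\n' $1 | sort -u > "$a"
    printf '%s\n' $2 | sort -u > "$b"
    comm -23 "$a" "$b"      # lines unique to the first list
    rm -f "$a" "$b"
}

# Usage on a live system:
#   proc_pids=$(ls /proc | grep -E '^[0-9]+$')
#   ps_pids=$(ps -e -o pid=)
#   pids_only_in_first "$proc_pids" "$ps_pids"
```

Expect a few false positives from short-lived processes, including
the ls and grep used to take the snapshot. Run it a few times and
only worry about PIDs that keep showing up.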
Another idea: can you limit the system to one or maybe two cores
and see if the culprit becomes easier to pinpoint?
This might work in the booted system (as root):
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online
Failing that, booting with maxcpus=1 on the kernel command line
should work.
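
The same runtime toggle can be wrapped in a loop so the cores are
easy to bring back afterwards. This is a minimal sketch. The
CPU_SYSFS variable and the set_secondary_cpus name are made up for
illustration, and the default path is the standard sysfs location.
It needs root on a real system.

```shell
#!/bin/sh
# Base directory of the CPU hotplug files; overridable for a dry run.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# set_secondary_cpus STATE: write STATE (0 = offline, 1 = online) to
# every cpuN/online file except cpu0's. cpu0 often has no such file
# and stays up either way.
set_secondary_cpus() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue                     # glob matched nothing
        case $f in */cpu0/online) continue ;; esac  # never touch cpu0
        echo "$1" > "$f"
    done
}

# Usage (as root):
#   set_secondary_cpus 0    # leave only cpu0 running
#   set_secondary_cpus 1    # bring the other cores back
```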