[arch-general] Help diagnosing kworker 'bug'

Anatol Pomozov anatol.pomozov at gmail.com
Fri Aug 8 23:18:40 EDT 2014


Hi

On Fri, Aug 8, 2014 at 6:58 PM, Oon-Ee Ng <ngoonee.talk at gmail.com> wrote:
> On Wed, Aug 6, 2014 at 3:43 PM, Oon-Ee Ng <ngoonee.talk at gmail.com> wrote:
>> On Wed, Aug 6, 2014 at 10:40 AM, Anatol Pomozov
>> <anatol.pomozov at gmail.com> wrote:
>>> 'perf' is a great and very powerful tool that allow to debug problems
>>> like this. Run '# perf top -g -p $PID' and it will show where the
>>> process spends *cpu cycles*. It should be enough to understand what
>>> kworker thread does. For all curious minds I highly recommend to read
>>> this tutorial https://perf.wiki.kernel.org/index.php/Tutorial
>>
>> Thanks, if my boy gets to sleep early tonight I'll do that.
>
> Having tried that out, I don't really understand the output. It seems
> the first column is CPU usage and the second is...? IO?
>
> Anyway these are the top 3 things in my output after a short amount of
> time. Other things which are low in CPU usage and high in the second
> column are find_next_zero_bit and _raw_spin_lock. Not sure what I
> should glean from this.
> +   17.74%     0.10%  [kernel]          [k] __filemap_fdatawrite_range
> +   15.04%     0.02%  [kernel]          [k] filemap_fdatawrite_range
> +    9.93%     9.93%  [kernel]          [k] find_next_bit


The first column is CPU usage; I am not sure about the second column.
Click on the [+] symbol and it will show you the full call graph for
that function, which lets you see which subsystem calls it. If it is
btrfs, then please contact upstream:
http://vger.kernel.org/vger-lists.html#linux-btrfs

Another way to get more information about this problem is to use
kernel traces. Let's enable block and writeback events:

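sudo su                             # the tracing files need root
cd /sys/kernel/debug/tracing
echo 1 > events/writeback/enable    # trace writeback events
echo 1 > events/block/enable        # trace block-layer events
echo 1 > tracing_on                 # make sure tracing is switched on
cat trace_pipe                      # stream events; Ctrl-C to stop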

There will be some information such as process ID, inode... Maybe
you'll see some pattern in the writes, etc.
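
To make such patterns easier to spot, the trace output can be tallied
per task and event. This is only a minimal sketch: summarize_trace is a
hypothetical helper name, and it assumes the default ftrace line layout
"TASK-PID [CPU] FLAGS TIMESTAMP: EVENT: ...", which may differ on your
kernel (check the header printed by 'cat trace').

```shell
# Count trace events per task and event name, reading trace lines on stdin.
# Assumption: default ftrace layout "TASK-PID [CPU] FLAGS TIMESTAMP: EVENT: ...".
# Limitation: task names containing spaces are cut at the first space.
summarize_trace() {
    awk '
        /^#/ { next }                      # skip the header comment lines
        {
            event = ""
            for (i = 2; i < NF; i++) {
                # the timestamp field looks like 1234.567890:
                if ($i ~ /^[0-9]+\.[0-9]+:$/) { event = $(i + 1); break }
            }
            if (event == "") next          # line did not match the layout
            sub(/:$/, "", event)           # drop the trailing colon
            count[$1 " " event]++          # key: "TASK-PID EVENT"
        }
        END { for (k in count) print count[k], k }
    ' | sort -rn
}
```

For example, 'timeout 10 cat trace_pipe | summarize_trace' collects ten
seconds of events and prints the busiest task/event pairs first. When
you are done, 'echo 0 > events/writeback/enable' and
'echo 0 > events/block/enable' switch the events off again.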

In any case this problem sounds like an upstream issue and it is
better to contact them.

