[arch-general] Dovecot imap processes pinning CPU
In recent days, dovecot's "imap" processes keep getting stuck. Each time I check my server there's a bunch of "imap" processes (sometimes 2 of them, sometimes 4, sometimes 6) that are using all of the box's CPU.
And worse, there's no way to kill the processes either (neither kill -15 nor kill -9 works), which means that I wind up having to reboot the box every time this happens. REALLY irritating.
I don't normally even *see* the imap processes in htop, as I think they're pretty short lived. And I'm not sure what they're looping trying to do. My debugging skills on Linux are a bit weak, and so I don't know how to look at the process and see what it's doing.
Also, not sure what's changed on my system to cause this, as this is definitely a recent problem. Maybe the upgrade to Thunderbird 2.0.0.18 (since I usually use T'Bird to access my email). I don't seem to run into this problem when I use Squirrelmail.
Anyone have any idea what the problem might be? Or, if not, then suggestions on how I might be able to debug the situation myself?
TIA,
DR
David Rosenstrauch wrote:
In recent days, dovecot's "imap" processes keep getting stuck. Each time I check my server there's a bunch of "imap" processes (sometimes 2 of them, sometimes 4, sometimes 6) that are using all of the box's CPU.
Interesting ... seems like the processes start to hang when I *close* Thunderbird. WTF?!?!?
On Fri, 12 Dec 2008 21:34:56 -0500, David Rosenstrauch <darose@darose.net> wrote:
David Rosenstrauch wrote:
In recent days, dovecot's "imap" processes keep getting stuck. Each time I check my server there's a bunch of "imap" processes (sometimes 2 of them, sometimes 4, sometimes 6) that are using all of the box's CPU.
Interesting ... seems like the processes start to hang when I *close* Thunderbird. WTF?!?!?
[root@server64 andyrtr]# ps aux | grep dovecot
dovecot    826  0.0  0.0  12132  2220 ?      S    01:33  0:00 imap-login
root      3349  0.0  0.0   7116   688 ?      Ss   Dec01  0:16 /usr/sbin/dovecot
root      3355  0.0  0.0  53924  2560 ?      S    Dec01  0:05 dovecot-auth
dovecot   3393  0.0  0.0  12120  1920 ?      S    Dec01  0:00 pop3-login
dovecot   3394  0.0  0.0  12120  1916 ?      S    Dec01  0:00 pop3-login
dovecot   3395  0.0  0.0  12120  1916 ?      S    Dec01  0:00 pop3-login
dovecot   8767  0.0  0.0  12132  1928 ?      S    10:33  0:00 imap-login
dovecot   8780  0.0  0.0  12132  1924 ?      S    10:34  0:00 imap-login
dovecot  10870  0.0  0.0  12132  1924 ?      S    12:57  0:00 imap-login
root     10871  0.0  0.0  53924  2520 ?      S    12:57  0:00 dovecot-auth -w
root     10952  0.0  0.0   9564  1004 pts/0  S+   13:02  0:00 grep dovecot
[root@server64 andyrtr]# ps aux | grep imap
dovecot    826  0.0  0.0  12132  2220 ?      S    01:33  0:00 imap-login
dovecot   8767  0.0  0.0  12132  1928 ?      S    10:33  0:00 imap-login
dovecot   8780  0.0  0.0  12132  1924 ?      S    10:34  0:00 imap-login
dovecot  10870  0.0  0.0  12132  1924 ?      S    12:57  0:00 imap-login
andyrtr  10872  0.0  0.0   9232  1800 ?      S    12:57  0:00 imap
Looks OK here using 2 claws-mail clients. Maybe it's indeed Thunderbird.
-Andy
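For anyone who wants to watch for the runaway processes without eyeballing the listing, the `ps aux | grep imap` check above can be scripted. The following is only a rough sketch: `busy_processes` is a hypothetical helper name, the 50% threshold is an arbitrary assumption, and it assumes the usual 11-column `ps aux` layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND). It matches the executable name exactly, so "imap" does not also pick up "imap-login".

```python
def busy_processes(ps_output, name="imap", min_cpu=50.0):
    """Return (pid, cpu_percent, command) for busy processes called `name`.

    `ps_output` is the text printed by `ps aux`. Lines that do not look
    like process rows (header, shell prompts) are skipped.
    """
    hits = []
    for line in ps_output.splitlines():
        # COMMAND is the 11th column and may contain spaces, so split
        # at most 10 times and keep the remainder together.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        try:
            pid = int(fields[1])
            cpu = float(fields[2])
        except ValueError:
            continue  # header row or other non-process line
        command = fields[10]
        # Compare only the executable's base name, e.g. /usr/sbin/dovecot -> dovecot.
        executable = command.split()[0].rsplit("/", 1)[-1]
        if executable == name and cpu >= min_cpu:
            hits.append((pid, cpu, command))
    return hits
```

To run it against the live system, one option is to feed it `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`. On a healthy server like the one above it should return an empty list.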
Andreas Radke wrote:
looks ok here using 2 claws-mail clients. maybe it's indeed Thunderbird.
Not sure what's going on, but apparently I'm not the only person seeing this. A bunch of other people have seen the same thing. The dovecot guys are speculating that it's something in the kernel. There's a thread going on this, if anyone's interested: http://dovecot.org/list/dovecot/2008-December/035662.html
DR
On Saturday 13 December 2008 15:58:16, David Rosenstrauch wrote:
Andreas Radke wrote:
looks ok here using 2 claws-mail clients. maybe it's indeed Thunderbird.
Not sure what's going on, but apparently I'm not the only person seeing this. A bunch of other people have seen the same thing. The dovecot guys are speculating that it's something in the kernel. There's a thread going on this, if anyone's interested: http://dovecot.org/list/dovecot/2008-December/035662.html
DR
Are you running Thunderbird on the same machine that runs dovecot? I tried to reproduce the bug with my remote server using dovecot. But when I close Thunderbird, the problem occurs on my local machine (kaddressbook takes 100% of the CPU and is not killed by kill -9), and everything is OK on the server. Very strange, isn't it?
Paul Ezvan wrote:
Are you running Thunderbird on the same machine that runs dovecot? I tried to reproduce the bug with my remote server using dovecot. But when I close Thunderbird, the problem occurs on my local machine (kaddressbook takes 100% of the CPU and is not killed by kill -9), and everything is OK on the server. Very strange, isn't it?
Yup, very strange indeed. I've not tried this with T'bird running on the server. Both times I saw this was with it running on remote machines (one Linux, one Windows). I'm pretty sure T'bird is somehow a factor (either directly or indirectly) as I'm not seeing this problem when I use Squirrelmail for the MUA. Would love to get this worked out, as I've had to downgrade my kernel (and, as a result, heimdal ... and, as a result, openssh) to avoid this problem. DR
On Saturday 13 December 2008 18:55:14, David Rosenstrauch wrote:
Paul Ezvan wrote:
Are you running Thunderbird on the same machine that runs dovecot? I tried to reproduce the bug with my remote server using dovecot. But when I close Thunderbird, the problem occurs on my local machine (kaddressbook takes 100% of the CPU and is not killed by kill -9), and everything is OK on the server. Very strange, isn't it?
Yup, very strange indeed.
I've not tried this with T'bird running on the server. Both times I saw this was with it running on remote machines (one Linux, one Windows). I'm pretty sure T'bird is somehow a factor (either directly or indirectly) as I'm not seeing this problem when I use Squirrelmail for the MUA.
Would love to get this worked out, as I've had to downgrade my kernel (and, as a result, heimdal ... and, as a result, openssh) to avoid this problem.
DR
I have just seen this message on the LKML: http://lkml.org/lkml/2008/12/13/95
It is the same behaviour, so maybe it is related to our bug?
Paul Ezvan
Paul Ezvan wrote:
I have just seen this message on the LKML: http://lkml.org/lkml/2008/12/13/95
It is the same behaviour, so maybe it is related to our bug?
Sounds quite likely: a) it's the same kernel version as us, and b) it seems to involve inotify, which is where the dovecot people suspect the problem lies. DR
Hello,
David Rosenstrauch wrote:
In recent days, dovecot's "imap" processes keep getting stuck. Each time I check my server there's a bunch of "imap" processes (sometimes 2 of them, sometimes 4, sometimes 6) that are using all of the box's CPU.
And worse, there's no way to kill the processes either (neither kill -15 nor kill -9 works), which means that I wind up having to reboot the box every time this happens. REALLY irritating.
I don't normally even *see* the imap processes in htop, as I think they're pretty short lived. And I'm not sure what they're looping trying to do. My debugging skills on Linux are a bit weak, and so I don't know how to look at the process and see what it's doing.
I see quite a few imap (and imap-login) processes hanging around, always have. But they are generally idling, not using 100% CPU like in your case.
Also, not sure what's changed on my system to cause this, as this is definitely a recent problem. Maybe the upgrade to Thunderbird 2.0.0.18 (since I usually use T'Bird to access my email). I don't seem to run into this problem when I use Squirrelmail.
Interesting. If this really is a fault, then a bug report must be sent to the dovecot developer with high priority (as this could be exploited for a DoS attack) and to the Mozilla team, indicating that something's wrong with how Thunderbird talks to certain IMAP servers.
Anyone have any idea what the problem might be? Or, if not, then suggestions on how I might be able to debug the situation myself?
Start here: http://www.dovecot.org/bugreport.html Glenn
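On the question of how to look at a stuck process and see what it's doing: a process that shrugs off kill -9 is usually running or blocked inside the kernel, because a signal is only acted on when the process gets back to user mode. One way to check is to read the per-process files under /proc on Linux. The sketch below is only a starting point, not a dovecot-specific tool. The function names are made up for illustration, and /proc/<pid>/wchan and /proc/<pid>/stack may be missing or unreadable depending on the kernel build and on permissions, in which case the sketch just reports None for them.

```python
import time


def parse_stat(stat_text):
    """Parse the contents of /proc/<pid>/stat into a small dict.

    The command name sits in parentheses and may itself contain spaces
    or ')', so split on the LAST ')' rather than on whitespace.
    """
    head, _, tail = stat_text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = tail.split()  # fields[0] is field 3 (state) of the stat file
    return {
        "pid": int(pid_str),
        "comm": comm,
        "state": fields[0],              # R running, S sleeping, D uninterruptible
        "utime_ticks": int(fields[11]),  # field 14: CPU time in user mode
        "stime_ticks": int(fields[12]),  # field 15: CPU time in kernel mode
    }


def cpu_split(before, after):
    """Return (user_ticks, kernel_ticks) consumed between two samples."""
    return (after["utime_ticks"] - before["utime_ticks"],
            after["stime_ticks"] - before["stime_ticks"])


def read_optional(path):
    """Return a file's text, or None if it is missing or unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


def sample(pid, interval=1.0):
    """Sample a process twice and summarise where its CPU time went."""
    path = f"/proc/{pid}/stat"
    first = read_optional(path)
    time.sleep(interval)
    second = read_optional(path)
    if first is None or second is None:
        return None  # process is gone, or this system has no Linux /proc
    before, after = parse_stat(first), parse_stat(second)
    user, kernel = cpu_split(before, after)
    return {
        "comm": after["comm"],
        "state": after["state"],
        "user_ticks": user,
        "kernel_ticks": kernel,
        "wchan": read_optional(f"/proc/{pid}/wchan"),  # kernel function it sleeps in
        "stack": read_optional(f"/proc/{pid}/stack"),  # kernel stack, usually root only
    }
```

Calling `sample(pid)` with the PID of one of the stuck imap processes returns a dict. If `kernel_ticks` keeps growing while `user_ticks` stays near zero, the process is spinning in kernel code rather than in dovecot itself, which would fit the kernel theory discussed in this thread. That kind of output is also worth attaching to a bug report.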
participants (4)
- Andreas Radke
- David Rosenstrauch
- Paul Ezvan
- RedShift