[arch-general] i3wm (randomly?) freezes; SIGCONT seems to fix it
Hi all,
I've had a very elusive and frustrating problem this week and don't know where to look anymore. Maybe one of you has an idea? :)
Since 2016-09-04, i3wm freezes every once in a while; often after waking the screen (even if it wasn't locked), a few times also directly or a few minutes after logging in. I have not yet found a conclusive pattern.
The freeze is complete; no reaction to mouse or keyboard activity, not even in the debug logs. However, I can interact with the focused window just fine (I obviously cannot change focus). i3 also doesn't react to IPC calls (tried with i3-msg).
Up to now, I had sent SIGTERM to important programs that had been running and KILLed i3 (it doesn't react to SIGTERM, either). Today, in my frustration, I tried sending SIGCONT instead. To my surprise, i3 immediately sprang back to life.
What I do not understand is that i3 hasn't been updated in a while, let alone since Sunday. FWIW, I am running the fork i3-gaps instead of i3, but that is very close to upstream and has been stable forever.
I posted /var/log/pacman.log from the day when it started on my server [0].
Does anyone have any clue what could be causing this?
Thanks, Bennett
[0]: https://vps1.piater.name/commie/#dvm1K6lg
-- GPG fingerprint: 871F 1047 7DB3 DDED 5FC4 47B2 26C7 E577 EF96 7808
On Fri, Sep 9, 2016 at 4:11 PM, Bennett Piater <bennett@piater.name> wrote:
Does anyone have any clue what could be causing this?
I have had two complete i3 freezes recently (I think I managed to trace them back to Sept 1st and Sept 5th). Neither happened after waking from hibernation or anything similar. I was just using Chromium as usual when the mouse suddenly slowed down and everything froze, leaving me with no control. At the time I thought Chromium was the culprit. I had to reboot the machine through the power switch. I did not try to investigate further.
After a quick look at my journalctl, I can't really see anything suspicious.
FYI, I am using vanilla i3 from community with i3pystatus.
(I think I managed to trace them back to Sept 1st and Sept 5th).
My last update before the apparently breaking one was 2016-08-24. So, maybe some upgrade between 2016-08-24 and 2016-09-04 doesn't play well with i3... Would you mind cross-checking your upgrades between 2016-08-24 and 2016-09-01 with my log [0] when you have the time?
After a quick look at my journalctl, I can't really see anything suspicious.
i3wm doesn't log to systemd. I produce my logs by starting i3 with
    exec i3 --shmlog-size=26214400 -V -d all > ~/.i3/logs/$(date -Iseconds).log
in my ~/.xinitrc. However, they don't produce any useful information AFAICT, even with debug verbosity.
Is it possible that i3 receives SIGSTOP for some reason?
Thank you very much, Bennett
[0]: https://vps1.piater.name/commie/#dvm1K6lg
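Regarding the SIGSTOP question: on Linux, a process that has been stopped by SIGSTOP (or SIGTSTP) is reported with state `T` in /proc/<pid>/status, so this can be checked during the next freeze before sending SIGCONT. A minimal Python sketch follows; the helper names `parse_state`, `is_stopped` and `resume_if_stopped` are made up for illustration, and the PID is assumed to come from something like `pgrep -x i3`.

```python
import os
import signal


def parse_state(status_text):
    """Return the one-letter process state from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # The line looks like "State:\tT (stopped)"; the letter is field 2.
            return line.split()[1]
    raise ValueError("no State: line found")


def is_stopped(pid):
    """True if the process is in state 'T' (stopped by a job-control signal)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_state(f.read()) == "T"


def resume_if_stopped(pid):
    """Send SIGCONT only when the process is actually stopped."""
    if is_stopped(pid):
        os.kill(pid, signal.SIGCONT)
        return True
    return False
```

If `is_stopped` returns True while i3 is frozen, something really did stop the process, and the remaining question is which program sent the signal. If it returns False, the freeze has another cause and SIGCONT helping would be harder to explain.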
On Fri, 9 Sep 2016 16:11:48 +0200, Bennett Piater wrote:
I've had a very elusive and frustrating problem this week and don't know where to look anymore.
Where did you already look?
$ less $HOME/.xsession-errors
?
Regards, Ralf
I do not have that file.
Cheers, Bennett
On 09/09/2016 05:17 PM, Ralf Mardorf wrote:
$ less $HOME/.xsession-errors
On Fri, 9 Sep 2016 17:19:33 +0200, Bennett Piater wrote:
I do not have that file.
Regarding
https://www.google.de/?gws_rd=ssl#q=+linux+no+.xsession-errors
https://bbs.archlinux.org/viewtopic.php?id=143068
http://www.linuxquestions.org/questions/slackware-14/where-is-~-xsession-err...
$ startx 2> ~/.xsession-errors
should do the job.
Thank you, I will change that in my .profile and see if that contains useful information.
However, if sending SIGCONT to i3 works the next time this happens, I don't think that X has anything to do with it...
Cheers, Bennett
On Fri, 9 Sep 2016 17:32:40 +0200, Bennett Piater wrote:
However, if sending SIGCONT to i3 works the next time this happens, I don't think that X has anything to do with it...
.xsession-errors contains the output of every GUI app you are running, as if you had launched all those apps from terminals.
On Fri, Sep 9, 2016 at 5:38 PM, Ralf Mardorf <silver.bullet@zoho.com> wrote:
.xsession-errors contains the output of every GUI app you are running, as if you would launch all those apps in terminals.
Comparing your pacman.log with mine (mine taken from 2016-08-23 to 2016-09-05), here is the list of common packages we both have either installed or upgraded:
- man-db
- mariadb
- mariadb-clients
- mediainfo
- nano
- networkmanager
- openvpn
- pacman-mirrorlist
- python2-appdirs
- python2-setuptools
- python-appdirs
- python-setuptools
- webkit2gtk
- xdotool
Guillaume
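A comparison like this can be automated. The following is a rough Python sketch, assuming the `[YYYY-MM-DD HH:MM] [ALPM] upgraded name (old -> new)` line format that pacman writes to /var/log/pacman.log. The function names are invented for illustration, and nothing here is taken from either of the real logs.

```python
import re
from datetime import date

# Matches "[2016-09-04 10:12] [ALPM] upgraded name (..." as well as the
# newer ISO-8601 timestamps ("[2019-10-24T10:12:03+0200] ...").
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2})[ T][^\]]*\] \[ALPM\] (installed|upgraded) (\S+) \("
)


def changed_packages(log_text, start, end):
    """Names of packages installed or upgraded between start and end (inclusive)."""
    names = set()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # not an install/upgrade record
        day = date.fromisoformat(match.group(1))
        if start <= day <= end:
            names.add(match.group(3))
    return names


def common_changes(log_a, log_b, start, end):
    """Sorted list of packages that changed in both logs within the window."""
    both = changed_packages(log_a, start, end) & changed_packages(log_b, start, end)
    return sorted(both)
```

Calling `common_changes` with the contents of two pacman.log files and the window 2016-08-24 to 2016-09-05 would produce a list of the kind shown above. Packages that only one of the two systems has installed drop out automatically, which is also why the result says nothing about upgrades unique to one machine.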
Thank you very much! :)
Interestingly, none of these packages looks like it could cause this - I could maybe imagine xdotool, but anything else...
Did the freeze happen again for you, or has it stopped?
Thanks, Bennett
On Fri, Sep 9, 2016 at 10:16 PM, Bennett Piater <bennett@piater.name> wrote:
Interestingly, none of these packages look like they could cause this - I could maybe imagine xdotool, but anything else... Did the freeze happen again for you, or has it stopped?
No freeze since the one I told you about! I will definitely investigate further if/when I get a new one.
Very weird... I had another one yesterday, but again, no clue what caused it. `killall -CONT i3` fixed it again, so be sure to try that if you get another freeze :)
Sadly, the logs contained zero indication, so I don't know how to proceed.
Thanks, Bennett
Just as an aside from a technical perspective: Keep SIG_IGN vs SIG_DFL in mind. I guess signals are passed from the child process to the parent? Then again, I don't know where setsid() comes into play, whether i3 does it / does it correctly, and whether that has an effect on the signal processing queue.
SIGCHEERS! mar77i
[0] http://man7.org/linux/man-pages/man2/signal.2.html
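One way to see how a running process treats its signals is to read the `SigBlk`, `SigIgn` and `SigCgt` bitmasks in /proc/<pid>/status, where bit n-1 stands for signal number n. Note that SIGSTOP and SIGKILL cannot be caught, blocked or ignored, so none of these masks can protect i3 from being stopped. A Python sketch, with `decode_sigmask` and `signal_dispositions` as purely illustrative names:

```python
import signal
from typing import Dict, List


def decode_sigmask(hex_mask: str) -> List[str]:
    """Translate a hex signal mask from /proc/<pid>/status into signal names."""
    mask = int(hex_mask, 16)
    names = []
    for signum in range(1, 65):
        if mask & (1 << (signum - 1)):
            try:
                names.append(signal.Signals(signum).name)
            except ValueError:
                # Signals without a symbolic name (most real-time signals).
                names.append(f"SIG{signum}")
    return names


def signal_dispositions(status_text: str) -> Dict[str, List[str]]:
    """Decode the blocked, ignored and caught masks found in the status text."""
    result = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("SigBlk", "SigIgn", "SigCgt"):
            result[key] = decode_sigmask(value.strip())
    return result
```

Feeding it the contents of /proc/<pid>/status for the i3 process lists which signals i3 ignores (SIG_IGN) and which it handles itself; anything not listed is at its default disposition (SIG_DFL) unless blocked.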
On Sun, 11 Sep 2016 20:51:03 +0200, Bennett Piater wrote:
Very weird... I had another one yesterday, but again, no clue what caused it. $(killall -CONT i3) fixed it again, so be sure to try that if you get another freeze :)
Sadly, the logs contained zero indication, so I don't know how to proceed.
It seems to be a software issue, since somebody else mentioned experiencing the same after upgrading, but you never know. Just in case, you could check the battery and run memtest.
On Sun, Sep 11, 2016 at 8:57 PM, Ralf Mardorf <silver.bullet@zoho.com> wrote:
Seemingly it is a software issue, since somebody else mentioned to experience the same after upgrading, but you never know. Just in case you could check the battery and run memtest.
I experienced yet another freeze this afternoon. Nothing in any log helps me diagnose it. I could barely run htop, which seemed to show memory exhaustion. I am keeping an eye on memory consumption from now on.
Oh, this stuff might all be related to the issue: [0], [1], [2]
cheers! mar77i
[0] https://faq.i3wm.org/question/4631/dont-sigstop-when-in-hide-mode/here.html
[1] https://github.com/ultrabug/py3status/issues/253
[2] https://github.com/i3/i3/issues/2280
Oh, that looks promising. I'll read through those issues tomorrow. Thank you! :)
Cheers, Bennett
On 09/11/2016 09:08 PM, Martin Kühne via arch-general wrote:
Oh, this stuff might all be related to the issue: [0], [1], [2]
On Sun, Sep 11, 2016 at 9:13 PM, Bennett Piater <bennett@piater.name> wrote:
Oh, that looks promising. I'll read through those issues tomorrow. Thank you! :)
A new freeze happened to me yesterday. Before the computer froze completely, I managed to have a look at `htop`, `vmsize` and `free`. All 8 GB of my RAM were full. 84 GB of shared memory were shown as used by `/usr/lib/chromium/nacl_helper`. This looks like a known issue with Chromium [0]. I did what seems to be generally advised: use a fresh ~/.config/chromium directory.
If this fixes it for me, it would imply that I have a different issue from the OP, and I would then not follow up on this thread. Otherwise I will have to dig into these i3 links and let you know.
Thanks for the i3 issue links :)
[0] https://www.reddit.com/r/chrome/comments/498br6/help_why_is_nacl_helper_84g_...
Guillaume
.xsession-errors contains the output of every GUI app you are running, as if you would launch all those apps in terminals.
That's good to know, thank you :)
I'm very curious as to what I will (or won't) find next time the freeze happens. It hasn't happened this boot, ever since I sent SIGCONT. I will reboot tomorrow to find out if that "resets" the problem.
Cheers, Bennett
On 09.09.2016 at 22:00, Bennett Piater wrote:
I'm very curious as to what I will (or won't) find next time the freeze happens. It hasn't happened this boot, ever since I sent SIGCONT. I will reboot tomorrow to find out if that "resets" the problem.
Hi all,
I had two random freezes too: one after a screen lock and the second while using Chromium. The second freeze wasn't a complete one. Mouse movements were delayed by approx. 30 s and stuttered badly. I could only do a hard reset.
I noticed that although my memory isn't completely used (3 GB of 4 GB), 50% of my swap space is in use. I also see that Chromium uses a lot of swap space (approx. 450 MB). I don't know if this is an indication of the problem. I never checked swap usage before.
My setup:
- i3 4.12
- py3status 3.1rc0
- Chromium 53
- xautolock 2.2
Regards, Ludwig
-- GPG fingerprint: 988C DE90 4149 C4A8 AAF1 996A 0450 F6B3 A9B2 6230
On 09/17/2016 07:33 AM, Ludwig Zins wrote:
I noticed that although my memory isn't completely used (3 GB of 4 GB), 50% of my swap space is in use. I also see that Chromium uses a lot of swap space (approx. 450 MB).
I've had 2 lockups during the last week, too, but I wasn't on the machine when they happened. Not sure if it's related, but in my case, it was a desktop machine and the whole system froze (sshd wasn't answering, either). I'm not seeing any indication of what's happening in journalctl, other than that journalctl stops recording anything at a certain time.
I've also seen increased swap usage over the last few months. I went through the swap page at https://wiki.archlinux.org/index.php/Swap and https://rudd-o.com/linux-and-free-software/tales-from-responsivenessland-why... and saw no change. Every morning I come down to something like this:
$ free -m
              total        used        free      shared  buff/cache   available
Mem:          15998        3345         421         240       12230       12080
Swap:         16383        1374       15009
And all my apps appear to be swapped out (Chromium, Thunderbird, even a locked KeePassX). I do run backups overnight and imagine that the buff/cache is filling up with directories and pushing out the actual apps. I'm thinking about disabling swap at this point, since I don't really do anything with the box that requires swap space or approaches 16G of RAM.
Dave
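To see which programs are actually sitting in swap, rather than only the total reported by `free`, the `VmSwap` field of /proc/<pid>/status can be summed per program name. This is a Python sketch for Linux only; the function names are illustrative, and processes that exit mid-scan or cannot be read are simply skipped.

```python
import os


def parse_name_and_swap(status_text):
    """Return (name, swap_kib) from the content of /proc/<pid>/status."""
    name, swap_kib = "?", 0
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmSwap:"):
            # The line looks like "VmSwap:\t   12345 kB".
            swap_kib = int(line.split()[1])
    # Kernel threads have no VmSwap line and are reported with 0.
    return name, swap_kib


def swap_by_program(proc_root="/proc"):
    """List of (program name, total swap in KiB), largest consumer first."""
    totals = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, swap_kib = parse_name_and_swap(f.read())
        except OSError:
            continue  # process exited or is not readable
        totals[name] = totals.get(name, 0) + swap_kib
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Printing the first few entries of `swap_by_program()` in the morning would show whether Chromium, Thunderbird and the like really account for the swapped-out memory, or whether something else does.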
(Sorry if I sent this mail twice, I'm having trouble with my mail client.)
I've experienced lockups too, with nothing in the logs that I could find, but after swapping back to the main kernel from linux-ck, everything was fine. Maybe this is related for you too?
The update that caused the lockups seems to match http://ck-hack.blogspot.fr/2016/09/bfs-497-linux-47-ck4.html
Alas I still have no explanation for the random lockups some people are seeing, but I have seen reports of it happening on mainline kernels as well now, so while I'm always suspicious of my own code, there is also the chance that BFS exacerbates an issue in mainline. Something that appears common is onboard Intel graphics with the Haswell chipset.
I have been able to solve the problem so far: I switched to google-chrome-beta. Since then my swap usage has been 0% and I haven't had any freezes. It seems the problem was/is caused by Chromium.
Ludwig
participants (7)
- Bennett Piater
- cyelae
- David N Murray
- Guillaume ALAUX
- Ludwig Zins
- Martin Kühne
- Ralf Mardorf