[arch-general] Troubleshooting random crash

Andre Goree andre at drenet.info
Wed Feb 6 19:06:45 EST 2013

On 02/06/13 16:29, Gaetan Bisson wrote:
> There are obvious gaps in your report; fixing them would be a good first
> step towards better understanding the problem. For instance:
> [2013-02-06 10:57:59 -0500] Andre Goree:
>> I believe this started happening after a recent update
>> but I can't know for sure and I can't really reproduce it...
> Give a window for when you started noticing the symptoms.
> See in /var/log/pacman.log what packages were upgraded then.
> Downgrade them and see if the issue persists.

As I said in the original mail:
"Also, again, I didn't start having issues until maybe 2 weeks ago"

Here is my pacman.log file from that time forward:

Not really too keen on downgrading a bunch of packages that might break
dependencies and create a REAL mess.  If I have to go through that long
process, I'd rather just reinstall -- which at this point I'm planning
to do anyways.
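
For narrowing the window down without downgrading anything, something like
the following rough sketch could list what was upgraded since a given date.
It assumes the usual "[date time] upgraded pkg (old -> new)" layout of
/var/log/pacman.log; the function name upgrades_since and the cutoff date in
the usage comment are only illustrative, not anything pacman ships.

```python
import re
from datetime import datetime

# Matches "[2013-01-25 10:57] upgraded pkg (old -> new)" and also newer
# layouts that carry an ISO timestamp and an "[ALPM]" tag.
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2})[ T]\d{2}:\d{2}[^\]]*\]\s+(?:\[ALPM\]\s+)?"
    r"upgraded\s+(\S+)\s+\((\S+)\s+->\s+(\S+)\)"
)


def upgrades_since(lines, since):
    """Return (date, package, old_version, new_version) tuples for every
    upgrade logged on or after `since` (a "YYYY-MM-DD" string)."""
    cutoff = datetime.strptime(since, "%Y-%m-%d").date()
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # installs, removals, and other log lines are skipped
        day = datetime.strptime(match.group(1), "%Y-%m-%d").date()
        if day >= cutoff:
            found.append(
                (day.isoformat(), match.group(2), match.group(3), match.group(4))
            )
    return found


# Usage (the cutoff date is a guess at "maybe 2 weeks ago"):
#   with open("/var/log/pacman.log") as log:
#       for entry in upgrades_since(log, "2013-01-23"):
#           print(*entry)
```

The old versions in that list are the candidates to look at first if it ever
comes to downgrading one package at a time.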

>> Using another system, I'm able to
>> telnet to port 22 on the "frozen" box (I run ssh on this box) but cannot
>> get connected via ssh.
> What does "able to telnet to port 22" means? Do you get the SSH banner?
> If yes, when is the SSH connection hanged/interrupted (ssh -vvv)?
> What do the SSHD logs show on the server side?

That means, from another box on the network (my laptop in this
instance), I'm able to telnet to the hung/frozen desktop.  Yes, I got the
SSH banner.  I tried 'ssh -v' when this happened earlier today, and it
hung after "Connecting to sideswipe-DT".  Next time I shall try -vvv.
Nothing is produced in the SSH logs on the desktop.  In fact, it seems
all system processes hang, because no logs are produced after the issue
rears its ugly head.
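
To make the "telnet to port 22" check repeatable, a minimal sketch like the
one below could be run from the laptop the next time the desktop freezes.  It
only opens a TCP connection and reads the first line the server sends, which
for an SSH server is normally its identification string.  The function name
read_ssh_banner and the 5-second timeout are my own choices here.

```python
import socket


def read_ssh_banner(host, port=22, timeout=5.0):
    """Connect to host:port and return the first line the server sends.

    Returns None if nothing arrives before the timeout.  Connection
    errors (refused, unreachable, connect timeout) are left to propagate.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        data = b""
        try:
            # Read until the first newline, capped at a small size.
            while b"\n" not in data and len(data) < 255:
                chunk = sock.recv(255)
                if not chunk:
                    break  # server closed the connection
                data += chunk
        except socket.timeout:
            return None
    line = data.split(b"\n", 1)[0].strip()
    return line.decode("ascii", errors="replace") or None


# Usage, with the hostname from the ssh -v output above:
#   print(read_ssh_banner("sideswipe-DT"))
```

If a banner still comes back while the box is frozen, the network stack and
the listening sshd are at least partly alive, so the hang would be happening
somewhere after that point.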

>> I'm not actually using this box as
>> a production server, just as my main work desktop
> So you produce nothing at work?

Not sure if you're just being an ass or not, however if you aren't:
that has nothing at all to do with the issue and I merely wanted to
establish _why_ I was using btrfs on a machine that I have running at my
job -- which is _also_ inconsequential in the context of my email.  If
you indeed were being an ass, congrats, you succeeded.

>> Any tips on things I could set up to try to capture some sort of output
>> or perhaps a kernel dump (if it's the kernel crashing)?
> How about looking at the system logs to see what your system was up to
> just before a crash? 

I've done that, with no real hints.  That's the first thing any Linux
admin does when confronted with an issue such as this, no?  Is there
perhaps a way to build Thunderbird with debug symbols or some kind of
logging?  I seem to recall opening Thunderbird each time this issue has
shown up.

I love Arch for what it is and I actually run it on the aforementioned
laptop (an Asus Zenbook) that I used to telnet.  It's a great OS and if
you know what you're doing you can minimize the hazards of running it on
a production machine.  It's been for the most part rock-solid in my
experience, which is why I'm perplexed by this current issue.  I'm ready
to blame btrfs because that's the only issue I see with my setup -- I also
have a tough time running a virtual machine on this box which I believe
is also due to btrfs.

Anyways, thanks for what help and guidance you did provide, it's

Andre Goree
andre at drenet.info

