[arch-general] Troubleshooting random crash
andre at drenet.info
Wed Feb 6 22:49:19 EST 2013
On 02/06/13 22:45, Curtis Shimamoto wrote:
> Andre Goree <andre at drenet.info> wrote:
>> On 02/06/13 20:14, Gaetan Bisson wrote:
>>> [2013-02-06 19:06:45 -0500] Andre Goree:
>>>> Not really too keen on downgrading a bunch of packages that might
>>>> dependencies and provide a REAL mess. If I have to go through that
>>>> process, I'd rather just reinstall -- which at this point I'm
>>>> to do anyways.
>>> Well, there is little point in posting to this list if you have no
>>> motivation to actually investigate the problem.
>>> For starters, you've upgraded Linux from 3.6.11 to 3.7.4 in the
>>> when you report the issue appeared; from the symptoms you described,
>>> it's a likely suspect. Downgrading it is far from being a "REAL mess":
>>> you only need to downgrade/rebuild the external modules you really
>>> (probably none).
>> Indeed there isn't, and surely even less point in replying to said post
>> if in fact I had no motivation. Given that I'm replying, I'd probably
>> like to avoid reinstalling if at all possible. I like the idea of
>> downgrading just the kernel -- obviously I meant that downgrading every
>> package I've upgraded since 1/21 was not something I wanted to
>> undertake. I'll try this tomorrow.
>>>> In fact it seems
>>>> all system processes hang because no logs are produced after the
>>>> rears its ugly head.
>>> Ah. So that would mean your issue is I/O related, then?
>> It would seem so, yes. I hinted at this at the end of my last reply as
>> well.
>>>>> So you produce nothing at work?
>>>> Not sure if you're just being an ass or not, however if you aren't:
>>>> that has nothing at all to do with the issue and I merely wanted to
>>>> establish _why_ I was using btrfs on a machine that I have running at my
>>>> job -- which is _also_ inconsequential in the context of my email.
>>>> If you indeed were being an ass, congrats, you succeeded.
>>> Once you were done being offended, you could have looked for the
>>> behind the words I used: that your "main work desktop" really
>>> as "a production server".
>>> But, of course, as you have so unequivocally declared, btrfs has
>>> absolutely "nothing at all to do with the issue". And your statement
>>> above implying that the problem is I/O related is just a coincidence.
>> I think you mis-comprehended my reply. Following the context, I merely
>> meant that distinguishing my system from a production server and
>> explaining why I was running btrfs on this system was inconsequential
>> to the issue at hand. Which is still true. I never said nor meant it to
>> be understood that I believed btrfs not to be the problem. In fact, the
>> opposite is true.
>> So, for the sake of clarity, I never declared (and certainly not
>> unequivocally) "btrfs has absolutely nothing at all to do with the
>> issue", but rather, my distinctions and reasons for running btrfs have
>> nothing to do with the issue. Not sure how you got that mixed up,
>> especially given the later part of my reply.
>>> Reporting issues is worthless when speculation is substituted for
>>> data. For example, a good report would have gone: "I believe this
>>> is unrelated to btrfs being my root filesystem since, on another Arch
>>> machine running ext3, I observe the following identical symptoms:
>>> `ssh -vvv` hangs at exactly the same point; second..."
>> I'll be sure to raise my reporting standards the next time I'd like help
>> from an Arch list, my apologies.
>>>>> How about looking at the system logs to see what your system was up to
>>>>> just before a crash?
>>>> I've done that, with no real hints. That's the first thing any
>>>> admin does when confronted with an issue such as this, no?
>>> Sure. But your first post gave no indication that you did that.
>> Indeed, I need to raise my reporting standards. I figured a lot of it
>> was implied but I now know I must be much clearer. Again, my apologies.
>>>> Is there
>>>> perhaps a way to build Thunderbird with debug symbols or some kind of
>>>> logging? I seem to recall opening Thunderbird each time this issue
>>>> showed up.
>>> Well, it would be nice to confirm that it is indeed at fault; downgrading
>>> it is certainly not a "REAL mess" either. You can certainly also build
>>> it with debug symbols: in the PKGBUILD (or makepkg.conf), set
>>> CXXFLAGS='' LDFLAGS='' CFLAGS='-g' and remove the strip option.
>> Given that thunderbird wasn't upgraded in the time that this issue
>> began, not sure a downgrade would help but it may be worth a shot.
>> Thanks for the pointers on building with debug symbols.
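
The debug-build settings suggested above could look roughly like the fragment below. This is only a sketch: it repeats the flag values quoted in the thread, and the `options=('!strip')` spelling is an assumption about how "remove the strip option" would be written in this particular PKGBUILD. Since much of Thunderbird is C++, adding `-g` to CXXFLAGS as well may be needed, which goes beyond what was suggested.

```shell
# Flags as quoted in the thread; set them in makepkg.conf or near the top
# of the PKGBUILD so they are in effect before the build step runs.
CFLAGS='-g'
CXXFLAGS=''
LDFLAGS=''

# "Remove the strip option": in a PKGBUILD this is usually written as
#   options=('!strip')
# (assumption: the package does not already define its own options array;
# if it does, add '!strip' to that array instead).
```

After rebuilding with makepkg, the resulting package keeps its symbols and can be installed with `pacman -U` like any other local package.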
>>>> I'm ready
>>>> to blame btrfs b/c that's the only issue I see with my setup -- I
>>>> have a tough time running a virtual machine on this box which I
>>>> is also due to btrfs.
>>> Didn't you write just a few lines ago that btrfs "has nothing at all
>>> to do with the issue"?
>> I most certainly did not, there's an obvious misunderstanding here.
>>> Wild guess: your thunderbird mail database is huge (just like the
>>> image of your virtual machine - although I cannot really know what you
>>> mean by "tough time") and your btrfs has problems dealing with such
>>> files (for instance, because your filesystem is nearly full). To
>>> start thunderbird with an empty profile (such as by renaming
>>> into ~/.mozilla.old) and see what happens.
>> The thing is, this doesn't happen every time I start thunderbird --
>> rather, seemingly, after the system has been up for a long period (>20
>> hrs or so). The filesystem is not nearly full either, though it does
>> contain a lot of data. I'm thinking downgrading to 3.6.x will help a
>> bit. I'm going to look for btrfs bugs in 3.7.x and see if anyone else
>> has been having a similar issue as well. Thanks for the assistance.
> For downgrading, I have found the Arch Rollback Machine quite handy. You can choose a date to roll back to, sync, and then have pacman reinstall all packages that it finds are newer than its database. This assumes, of course, that there was no significant change in between, such as the recent filesystem update.
> I would certainly try doing just the kernel first though, as that is even easier. Just thought I would mention that amazing tool we have in our debugging arsenal.
Awesome, thanks for the suggestion!
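
For the kernel-only downgrade discussed above, a minimal sketch follows. It assumes the 3.6.11 package is still in pacman's default cache directory; the exact filename (pkgrel and architecture suffix) is not given in the thread, so the script only lists candidates instead of guessing a name.

```shell
# Assumption: pacman's default cache directory; adjust if CacheDir in
# /etc/pacman.conf points somewhere else.
cache=/var/cache/pacman/pkg

# List any cached 3.6.11 kernel packages (pkgrel/arch suffix varies).
ls "$cache"/linux-3.6.11-* 2>/dev/null || echo "no cached linux 3.6.11 package found"

# If one is listed, install it as root using its full filename:
#   pacman -U "$cache"/<filename printed above>
```

If nothing is listed, the package has to come from elsewhere, for example the rollback approach mentioned above.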