[arch-general] mce after linux-3.11.5-1 on NP900X3C
vixsomnis at fastmail.com
Fri Nov 28 15:51:45 UTC 2014
On November 28, 2014 9:08:16 AM EST, Rasmus Liland <jensrasmus at gmail.com> wrote:
>On 2014-11-19 18:16, Rasmus Liland wrote:
>> On 2014-11-17 00:19, Rasmus Liland wrote:
>> > On 2014-11-15 18:28, Mark Lee wrote:
>> > > On 11/15/2014 12:20 PM, Rasmus Liland wrote:
>> > > > On 2014-11-15 15:21, LoneVVolf wrote:
>> > > > > On 15-11-14 06:57, Rasmus Liland wrote:
>> > > > > > On 2014-11-15 06:10, Mark Lee wrote:
>> > > > > > > On 11/14/2014 10:29 PM, Rasmus Liland wrote:
>> > > > > > > > On 2014-11-15 04:01, Mark Lee wrote:
>> > > > > > > > > Are you booting with the new intel u-code?
>> > > > > > > > Are you fairly sure this is an Intel microcode issue?
>> > > > > > > I'm not completely certain; but it would make sense. I'd
>> > > > > > > it out.
>> > > > > > Thank you for your help thus far. I'll examine this further
>> > > > > > tomorrow, g'night.
>> > > > > From rasmus first post:
>> > > > > > I'm experiencing machine check exceptions since every
>> > > > > > after package linux-3.11.5-1 (Oct 14 2013)
>> > > > > New intel microcode was only introduced with kernel 3.17 ...
>> > > > > unlikely to have to do with this issue.
>> > > > >
>> > > > > install mcelog, run it as the log tells you and post the
>> > > > [ ... output, see previous messages ... ]
>> > > > I never did use the mcelog tool before, but to me it looks like
>> > > > much of an analysis, perhaps I'm doing it wrong.
>> > > Looks like a microcode error, please try to add the intel-ucode to
>> > > your kernel cmdline.
>> > Bah, just as I was finished enabling syslinux using
>> > and rebooted, the system did not respond, just a blank screen and
>> > shutting off, then rebooting again.
>> > Thus, this system needs an overhaul -- apparently some difficulty
>> > bootcode or the MBR, though I am able to mount the old partitions
>> > into them using arch-chroot.
>> > I tried installing grub using the standard method grub-install
>> > the wiki, with little success -- some good news at least relevant
>> > topic in this thread is that grub recognized and added the intel-ucode
>> > file I had copied to the /boot directory, when running grub-mkconfig.
>> > The plan forward is to forget about generating new mbr using gpart
>> > install Debian at the end of the disk to, hopefully, restore some
>> > related stuff that might have come crashing down after meddling
>> > syslinux.
>> A breakthrough in this thread has happened.
>> I ended up taking a backup of the disk to an external hdd using
>> > # dd if=/dev/sda of=/mnt/angrist-sda-18nov14.img
>> then I booted FreeBSD 10.1 memstick, entered shell and entered some
>> > # gpart delete -i 1 ada0
>> > # gpart delete -i 2 ada0
>> > # gpart delete -i 3 ada0
>> > # gpart destroy ada0
>> > # gpart create -s mbr ada0
>> > # gpart add -s 20g -t linux-data ada0
>> > # gpart add -t linux-data ada0
>> Then I rebooted into ArchLinux iso memstick to install Arch on the
>> partition and using the other one as /home. So now Syslinux works,
>> unfortunately I don't know why. And I was able to install all new
>> including linux 3.17.3-1 and intel-ucode 20140913-1, loading it in
>> according to the wiki.
>> I got a new mce after exactly three hours:
>> > [ snip ]
>> I am also making this output an attachment. There is a lot more
>> information in this new mce compared to the other one I sent.
>> Perhaps some of you got some new suggestions.
>> Meanwhile, I am downgrading back to 3.11.5-1.
>It is dead.
>Yesterday, as I tried to suspend to ram using systemd on old working
>suspending did not work completely.
>So I tried moving up to new kernel 3.17.something to see if things
>better there; as now I was more optimistic, since e.g. chrony were
>the rtc based on statistical methods and not only NTP protocol.
>Suspend to ram was able to complete with new kernel, and everything was
>for a while -- Until yesterday when I suspended on very low battery and
>that I think the battery went flat during suspend. This has not been a
>problem in the past, but when I tried to charge the laptop afterwards,
>charge LED did not light up even though the light on the charger said
>So, no power connection there, thus I guess most parts of the system
>still working as before, something related to the delivery of power is
>-- probably a capacitor of some sort or other things that wear out over
>I have little knowledge on this, but I guess if this was a desktop I
>probably swap the power supply unit for a fresh one.
>Honestly, I was hoping this laptop would last me at least four years of
>intensive everyday use, as the price tag was quite high.
>I am going to try to email the vendor to try to get a decent refund, as
>think Norwegian law permits a three-year-warranty on consumer
>matter what the Samsung company says.
>Rasmus Liland, jrl at jrl.dyndns.dk, jens.rasmus.liland at nmbu.no
Have you done memtests? It could also be a failing drive.
You should probably make a bootable memtest86 drive and run a full test.
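
To rule out the drive, something along these lines might help. This is only a
rough sketch: it assumes the smartmontools package is installed, that it is
run as root, and that the disk is /dev/sda (the device name from the dd
command quoted above). The function name check_drive is made up for
illustration.

```shell
#!/bin/sh
# Sketch only: check_drive is a hypothetical helper, not part of any package.
# Assumes smartmontools (smartctl) is installed and the script runs as root.
check_drive() {
    dev="$1"
    # Refuse anything that is not a block device (typo guard).
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 2
    fi
    # Bail out early if smartctl is missing.
    if ! command -v smartctl >/dev/null 2>&1; then
        echo "smartctl not found (package smartmontools)" >&2
        return 3
    fi
    # -H prints the drive's overall SMART health self-assessment.
    smartctl -H "$dev" || return 1
    # -t long starts the extended self-test; it runs in the background
    # on the drive itself and can take a long time.
    smartctl -t long "$dev"
}
```

Call it as "check_drive /dev/sda", and once the self-test has had time to
finish, "smartctl -l selftest /dev/sda" should list the result.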