[arch-general] mce after linux-3.11.5-1 on NP900X3C

vixsomnis vixsomnis at fastmail.com
Fri Nov 28 15:51:45 UTC 2014


On November 28, 2014 9:08:16 AM EST, Rasmus Liland <jensrasmus at gmail.com> wrote:
>On 2014-11-19 18:16, Rasmus Liland wrote:
>> On 2014-11-17 00:19, Rasmus Liland wrote:
>> > On 2014-11-15 18:28, Mark Lee wrote:
>> > > On 11/15/2014 12:20 PM, Rasmus Liland wrote:
>> > > > On 2014-11-15 15:21, LoneVVolf wrote:
>> > > > > On 15-11-14 06:57, Rasmus Liland wrote:
>> > > > > > On 2014-11-15 06:10, Mark Lee wrote:
>> > > > > > > On 11/14/2014 10:29 PM, Rasmus Liland wrote:
>> > > > > > > > On 2014-11-15 04:01, Mark Lee wrote:
>> > > > > > > > > Are you booting with the new intel u-code?
>> > > > > > > > Are you fairly sure this is an Intel microcode issue?
>> > > > > > > I'm not completely certain, but it would make sense. I'd test
>> > > > > > > it out.
>> > > > > > Thank you for your help thus far. I'll examine this further
>> > > > > > tomorrow, g'night.
>> > > > > From rasmus first post:
>> > > > > > I'm experiencing machine check exceptions since every kernel
>> > > > > > after package linux-3.11.5-1 (Oct 14 2013)
>> > > > > New intel microcode was only introduced with kernel 3.17 ... It's
>> > > > > unlikely to have to do with this issue.
>> > > > > 
>> > > > > install mcelog, run it as the log tells you and post the result.
>> > > > [ ... output, see previous messages ... ]
>> > > > I never did use the mcelog tool before, but to me it looks like not
>> > > > much of an analysis, perhaps I'm doing it wrong.
>> > >Looks like a microcode error, please try to add the intel-ucode to
>> > >your kernel cmdline.
>> > Bah, just as I was finished enabling syslinux using
>> > syslinux-install_update and rebooted, the system did not respond, just
>> > a blank screen and lighting shutting off, then rebooting again.
>> > 
>> > Thus, this system needs an overhaul -- apparently some difficulty with
>> > the bootcode or the MBR, though I am able to mount the old partitions
>> > and chroot into them using arch-chroot.
>> > 
>> > I tried installing grub using the standard method grub-install
>> > according to the wiki, with little success -- some good news at least
>> > relevant to previous topic in this thread is that grub recognized and
>> > added the intel-ucode file I had copied to the /boot directory, when
>> > running grub-mkconfig.
>> > 
>> > The plan forward is to forget about generating new mbr using gpart and
>> > install Debian at the end of the disk to, hopefully, restore some boot
>> > related stuff that might have come crashing down after meddling with
>> > syslinux.
>> 
>> A breakthrough in this thread has happened. 
>> 
>> I ended up taking a backup of the disk to an external hdd using
>> 
>> > # dd if=/dev/sda of=/mnt/angrist-sda-18nov14.img
>> 
>> then I booted FreeBSD 10.1 memstick, entered shell and entered some
>> commands:
>> 
>> > # gpart delete -i 1 ada0
>> > # gpart delete -i 2 ada0
>> > # gpart delete -i 3 ada0
>> > # gpart destroy ada0
>> > # gpart create -s mbr ada0
>> > # gpart add -s 20g -t linux-data ada0
>> > # gpart add -t linux-data ada0
>> 
>> Then I rebooted into ArchLinux iso memstick to install Arch on the 20G
>> partition and using the other one as /home. So now Syslinux works,
>> unfortunately I don't know why. And I was able to install all new
>> packages including linux 3.17.3-1 and intel-ucode 20140913-1, loading it
>> in Syslinux according to the wiki.
>> 
>> I got a new mce after exactly three hours: 
>> 
>> > [ snip ]
>> 
>> I am also making this output an attachment. There is a lot more
>> information in this new mce compared to the other one I sent.
>> 
>> Perhaps some of you got some new suggestions.
>> 
>> Meanwhile, I am downgrading back to 3.11.5-1.
>
>It is dead.
>
>Yesterday, as I tried to suspend to ram using systemd on old working
>kernel, suspending did not work completely.
>
>So I tried moving up to new kernel 3.17.something to see if things worked
>out better there; as now I was more optimistic, since e.g. chrony was
>syncing the rtc based on statistical methods and not only NTP protocol.
>
>Suspend to ram was able to complete with new kernel, and everything was
>good for a while -- until yesterday when I suspended on very low battery
>and after that I think the battery went flat during suspend. This has not
>been a problem in the past, but when I tried to charge the laptop
>afterwards, the charge LED did not light up even though the light on the
>charger said it was active.
>
>So, no power connection there, thus I guess most parts of the system are
>still working as before, something related to the delivery of power is
>broken -- probably a capacitor of some sort or other things that wear out
>over time, I have little knowledge on this, but I guess if this was a
>desktop I would probably swap the power supply unit for a fresh one.
>
>Honestly, I was hoping this laptop would last me at least four years of
>intensive everyday use, as the price tag was quite high.
>
>I am going to try to email the vendor to try to get a decent refund, as I
>think Norwegian law permits a three-year-warranty on consumer electronics,
>no matter what the Samsung company says.
>
>-- 
>Rasmus Liland, jrl at jrl.dyndns.dk, jens.rasmus.liland at nmbu.no 

Have you done memtests? It could also be a failing drive.

You should probably make a bootable memtest86 drive and run a full test.
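
For the failing-drive possibility, a minimal sketch of a SMART health check, assuming the smartmontools package is installed and that the disk is /dev/sda as in the dd command quoted above. The function name drive_health is a hypothetical helper made up for illustration, not something from this thread:

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): print the SMART overall
# health verdict for one drive, or explain why it could not be checked.
drive_health() {
    # Default to /dev/sda, the device named in the quoted dd backup command.
    dev="${1:-/dev/sda}"

    # smartctl comes from smartmontools; bail out if it is not installed.
    if ! command -v smartctl >/dev/null 2>&1; then
        echo "smartctl not found (install smartmontools)" >&2
        return 2
    fi

    # -H asks the drive for its overall health self-assessment.
    # The exit status is nonzero if the device cannot be opened
    # or if the drive reports a failing state.
    smartctl -H "$dev"
}
```

Run as root, for example `drive_health /dev/sda`. A passing verdict does not rule out a bad drive, so this would complement the memory test rather than replace it.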
--
vixsomnis

