[arch-general] journalctl and I/O errors

Leonid Isaev lisaev at umail.iu.edu
Thu Jan 30 12:35:33 EST 2014


On Thu, 30 Jan 2014 01:11:54 -0500
Janna Martl <janna.martl109 at gmail.com> wrote:

> A couple months ago, I started getting I/O errors (see below) whenever
> I tried to do journalctl -n X for sufficiently large X (and journalctl
> would segfault). I assumed my hard drive was going to die, but in the
> meantime this was annoying so I did:
> 
> # cp -rp /var/log/journal{,-clone}
> # rm -rf /var/log/journal
> # mv /var/log/journal{-clone,}
> 
> There were a few errors after step 1 but this made the segfaulting
> stop. A few weeks later, same problem, same "fix". A few weeks later,
> ditto. In all this time, the only I/O errors I've ever seen have been
> related to accessing the journal. The SMART data for the drive says
> that the current pending sector count is 57, but the reallocated
> sector count is 0.
> 
> So if I'm reading this correctly, there have been a bunch of instances
> of unreadable sectors, somehow only pertaining to one directory
> despite efforts to make an independent copy of the data (which should
> put it in random other sectors), but none of them have been confirmed
> bad. So at this point I'm guessing it's a systemd bug (and there are
> no actual bad sectors), but it would be great if someone could confirm
> that any of this actually makes sense before I file a bug report based
> on ignorance and speculation.

The errors are pretty low-level, so this is either the hard drive failing in
an obscure fashion (I/O errors do not necessarily imply "bad blocks") or a
strange filesystem problem.
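
As a rough illustration of how low-level the quoted log is: the last four
bytes of the sense data (2e 92 ca cb) and the LBA field of the CDB
(2e 92 ca c8, length 00 08) can be converted by hand to see which sector the
drive gave up on. The helper name hex_to_dec is made up for this sketch; the
byte values are copied from the log quoted below, and the field layout assumed
is the standard READ(10) CDB and descriptor-format sense data.

```shell
#!/bin/sh
# Hypothetical helper: convert a hex string (no 0x prefix) to decimal.
hex_to_dec() { printf '%d\n' "$((0x$1))"; }

# Information field of the sense data: "... 2e 92 ca cb"
fail_lba=$(hex_to_dec 2e92cacb)

# READ(10) CDB "28 00 2e 92 ca c8 00 00 08 00":
# bytes 2-5 are the starting LBA, bytes 7-8 the transfer length in sectors.
start_lba=$(hex_to_dec 2e92cac8)
count=$(hex_to_dec 0008)

echo "failing LBA:   $fail_lba"
echo "request start: $start_lba ($count sectors)"
echo "bad sector is number $((fail_lba - start_lba)) within the request"
```

The first value should agree with the sector number on the end_request line,
and 8 sectors of 512 bytes agree with the "dma 4096 in" in the failed command.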

Try running a SMART long self-test and a thorough fsck. If those come up
clean, try mounting another partition (or bind-mounting a directory from
another partition) at /var/log/journal and see if the errors persist. Finally,
if /var is a separate partition, I'd back it up, fill it with zeroes
(cat /dev/zero > /dev/sdXY), and if no errors are encountered, restore it.
Hopefully, the drive firmware will reallocate all bad sectors (if any).
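
The non-destructive steps above could be sketched roughly as follows. This is
only a dry-run outline: by default it prints the commands instead of running
them. /dev/sda is taken from the quoted log, /dev/sdXY is the same placeholder
as above, /mnt/spare/journal is a made-up directory on another partition, and
passing -f to fsck assumes an ext2/3/4 filesystem. The zero-fill step is left
out on purpose, since it destroys data.

```shell
#!/bin/sh
# Assumed names; adjust before use.
DISK=${DISK:-/dev/sda}               # whole disk, as seen in the log
PART=${PART:-/dev/sdXY}              # placeholder partition holding /var
SPARE=${SPARE:-/mnt/spare/journal}   # hypothetical dir on another partition
DRY_RUN=${DRY_RUN:-1}                # 1 = only print, 0 = really execute

# Print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

check_drive() {
    run smartctl -t long "$DISK"       # start the long self-test
    run smartctl -l selftest "$DISK"   # read the result once it has finished
    run smartctl -A "$DISK"            # attributes, e.g. pending sector count
}

check_fs() {
    run fsck -f "$PART"                # forced check; filesystem must be unmounted
}

relocate_journal() {
    run mount --bind "$SPARE" /var/log/journal
}

check_drive
check_fs
relocate_journal
```

Running it with DRY_RUN=0 as root would execute the commands for real; the
long self-test takes a while, so the self-test log is only meaningful after it
completes.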

Cheers,
L.
 
> 
> Thanks!
> 
> - J.M.
> 
> 
> 
> -------------------------------------------
> Example I/O error:
> 
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> ata1.00: BMDMA stat 0x5
> ata1.00: failed command: READ DMA EXT
> ata1.00: cmd 25/00:08:c8:ca:92/00:00:2e:00:00/e0 tag 0 dma 4096 in
>          res 51/40:05:cb:ca:92/40:00:2e:00:00/ee Emask 0x9 (media error)
> ata1.00: status: { DRDY ERR }
> ata1.00: error: { UNC }
> ata1.00: configured for UDMA/100
> sd 0:0:0:0: [sda] Unhandled sense code
> sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
> sd 0:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]
> Descriptor sense data with sense descriptors (in hex):
>         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
>         2e 92 ca cb
> sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x4
> sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 2e 92 ca c8 00 00 08 00
> end_request: I/O error, dev sda, sector 781372107
> ata1: EH complete



-- 
Leonid Isaev
GnuPG key: 0x164B5A6D
Fingerprint: C0DF 20D0 C075 C3F1 E1BE  775A A7AE F6CB 164B 5A6D