[arch-general] LVM + mdadm no longer works on boot
Hi, all.

I'm running a small fileserver that has three SATA drives set up in RAID5 via mdadm. That RAID holds one LVM PV which is split up into several logical volumes. This setup has worked fine in the past, but with the latest system update my LVM partitions are not getting discovered correctly, leading to the boot hanging until I manually run "vgchange -ay". After that, the boot proceeds as normal.

It would appear that the latest lvm2 package is what causes the issue. Downgrading to 2.02.100-1 boots fine, whereas 2.02.103-1 hangs.

So how exactly should I proceed from here? I'm trying to understand how systemd makes it all work together, but I'm rather confused by it all.

Thanks,

--Sean Greenslade
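(For anyone hitting the same hang: the workaround and the downgrade are plain commands. The cached package filename below is only an example and depends on what is actually still sitting in /var/cache/pacman/pkg.)

    # bring up all volume groups by hand once the boot has stalled
    vgchange -ay

    # roll lvm2 back to the last known-good version from the local package cache
    # (adjust the filename to whatever version is actually cached)
    pacman -U /var/cache/pacman/pkg/lvm2-2.02.100-1-x86_64.pkg.tar.xz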
On 15.10.2013 21:37, Sean Greenslade wrote:
Hi, all. I'm running a small fileserver that has three SATA drives set up in RAID5 via mdadm. That RAID holds one LVM PV which is split up into several logical volumes. This setup has worked fine in the past, but with the latest system update my LVM partitions are not getting discovered correctly, leading to the boot hanging until I manually run "vgchange -ay". After that, the boot proceeds as normal.
It would appear that the latest lvm2 package is what causes the issue. Downgrading to 2.02.100-1 boots fine, whereas 2.02.103-1 hangs.
So how exactly should I proceed from here? I'm trying to understand how systemd makes it all work together, but I'm rather confused by it all.
Are you assembling RAID and LVM in initrd? If so, what's your HOOKS line in mkinitcpio.conf?

I've seen reports like this before (although most people said they were fixed with updates and had problems before 2.02.100). Sadly, I could never reproduce it, so I don't know how to debug it.
On Wed, Oct 16, 2013 at 10:55:43AM +0200, Thomas Bächler wrote:
On 15.10.2013 21:37, Sean Greenslade wrote:
Hi, all. I'm running a small fileserver that has three SATA drives set up in RAID5 via mdadm. That RAID holds one LVM PV which is split up into several logical volumes. This setup has worked fine in the past, but with the latest system update my LVM partitions are not getting discovered correctly, leading to the boot hanging until I manually run "vgchange -ay". After that, the boot proceeds as normal.
It would appear that the latest lvm2 package is what causes the issue. Downgrading to 2.02.100-1 boots fine, whereas 2.02.103-1 hangs.
So how exactly should I proceed from here? I'm trying to understand how systemd makes it all work together, but I'm rather confused by it all.
Are you assembling RAID and LVM in initrd? If so, what's your HOOKS line in mkinitcpio.conf?
I've seen reports like this before (although most people said they were fixed with updates and had problems before 2.02.100). Sadly, I could never reproduce it, so I don't know how to debug it.
I can say with certainty that the mdadm assembly happens in the initrd, but I can't find any log messages pertaining to the LVM scan, even on a successful boot. There is the following line that occurs before the root pivot, and which is the line that breaks the boot with the latest lvm2:

Oct 15 17:01:14 rat systemd[1]: Expecting device dev-mapper-raidgroup\x2ddata.device...

Here are my mkinitcpio.conf lines (works with the downgrade, not with current):

MODULES="dm_mod"
HOOKS="base udev mdadm_udev autodetect modconf block lvm2 filesystems keyboard fsck"

--Sean
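(A quick way to confirm that the lvm2 and mdadm pieces actually ended up in the generated image, and to see what systemd was waiting on, is something like the following; the image path assumes the stock 'linux' preset and may differ on other setups.)

    # list the initramfs contents and grep for the relevant hooks/binaries
    lsinitcpio /boot/initramfs-linux.img | grep -E 'lvm|mdadm'

    # after a hung boot, check the journal for the device systemd was expecting
    journalctl -b | grep -i 'dev-mapper'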
On Wed, Oct 16, 2013 at 09:55:43PM -0400, Sean Greenslade wrote:
I can say with certainty that the mdadm assembly happens in the initrd, but I can't find any log messages pertaining to the LVM scan, even on a successful boot. There is the following line that occurs before the root pivot, and which is the line that breaks the boot with the latest lvm2:
Oct 15 17:01:14 rat systemd[1]: Expecting device dev-mapper-raidgroup\x2ddata.device...
Here are my mkinitcpio.conf lines (works with the downgrade, not with current):
MODULES="dm_mod"
HOOKS="base udev mdadm_udev autodetect modconf block lvm2 filesystems keyboard fsck"
--Sean
And now, after another system update, the problem has vanished. There was a kernel update, so I'm willing to believe that it was just some strange transient interaction between LVM, mdadm and the initramfs kernel. Just another day in the life of Arch, I suppose.

Thanks for the attention.

--Sean
On 21.10.2013 03:34, Sean Greenslade wrote:
And now, after another system update, the problem has vanished. There was a kernel update, so I'm willing to believe that it was just some strange transient interaction between LVM, mdadm and the initramfs kernel. Just another day in the life of Arch, I suppose.
This is confusing.

A common problem is that the mkinitcpio image is generated mid-upgrade. If some component of lvm is updated before the kernel and another is upgraded after, this may lead to broken images. We can't prevent this from happening right now, but OTOH I haven't seen it in a while.

To be on the safe side, run 'mkinitcpio -P' after kernel updates.
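(For context, 'mkinitcpio -P' rebuilds the images for every preset found in /etc/mkinitcpio.d, so a manual post-update check could look like this; the 'linux' preset name is the stock one and may differ.)

    # rebuild the initramfs images for all installed presets
    mkinitcpio -P

    # or rebuild only the stock kernel's preset
    mkinitcpio -p linux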
On Mon, Oct 21, 2013 at 02:55:20PM +0200, Thomas Bächler wrote:
On 21.10.2013 03:34, Sean Greenslade wrote:
And now, after another system update, the problem has vanished. There was a kernel update, so I'm willing to believe that it was just some strange transient interaction between LVM, mdadm and the initramfs kernel. Just another day in the life of Arch, I suppose.
This is confusing.
A common problem is that the mkinitcpio image is generated mid-upgrade. If some component of lvm is updated before the kernel and another is upgraded after, this may lead to broken images. We can't prevent this from happening right now, but OTOH I haven't seen it in a while.
To be on the safe side, run 'mkinitcpio -P' after kernel updates.
That would make sense. I didn't try enough downgrade-upgrade cycles to test that theory. I'll add that to the list of things to do when updating.

--Sean
On 22.10.2013 05:46, Sean Greenslade wrote:
On Mon, Oct 21, 2013 at 02:55:20PM +0200, Thomas Bächler wrote:
On 21.10.2013 03:34, Sean Greenslade wrote:
And now, after another system update, the problem has vanished. There was a kernel update, so I'm willing to believe that it was just some strange transient interaction between LVM, mdadm and the initramfs kernel. Just another day in the life of Arch, I suppose.
This is confusing.
A common problem is that the mkinitcpio image is generated mid-upgrade. If some component of lvm is updated before the kernel and another is upgraded after, this may lead to broken images. We can't prevent this from happening right now, but OTOH I haven't seen it in a while.
To be on the safe side, run 'mkinitcpio -P' after kernel updates.
That would make sense. I didn't try enough downgrade-upgrade cycles to test that theory. I'll add that to the list of things to do when updating.
For the future, pacman hooks ([1]) may solve that problem without user interaction, but as far as I know nothing has been implemented yet.

[1] https://wiki.archlinux.org/index.php?title=User:Allan/Pacman_Hooks
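(As a sketch only: the hook format pacman eventually gained would allow something like the file below. The path, targets, and hook name are chosen here purely for illustration; nothing like this shipped at the time of this thread.)

    # write a hypothetical post-transaction hook that regenerates the initramfs
    # whenever the kernel, lvm2 or mdadm packages are touched
    cat > /etc/pacman.d/hooks/rebuild-initramfs.hook <<'EOF'
    [Trigger]
    Operation = Install
    Operation = Upgrade
    Type = Package
    Target = linux
    Target = lvm2
    Target = mdadm

    [Action]
    Description = Rebuilding initramfs images...
    When = PostTransaction
    Exec = /usr/bin/mkinitcpio -P
    EOF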