[arch-projects] [mkinitcpio] [PATCH 2/2] resume: generate a udev rule for specifying device

Tue Nov 26 10:22:26 EST 2013

On Tue, Nov 26, 2013 at 10:44:48AM +0100, Thomas Bächler wrote:
> On 24.11.2013 03:01, Dave Reisner wrote:
> > This solves the problem of the major:minor of devices changing between
> > hibernation. It also makes initialization lazy, as we no longer need to
> > explicitly wait on the device to show up (this will be taken care of by
> > udevadm settle).
> > 
> > Note that this patch drops support for tux-on-ice. As upstream
> > development of TOI seems stalled and forever relegated to being
> > maintained out of tree, I'm okay with this.
> > 
> > Signed-off-by: Dave Reisner <dreisner at archlinux.org>
> > ---
> > So, I tested this as far as ensuring that the udev rule is generated correctly
> > and triggers, writing the correct values into /sys/power/resume, but I can't
> > actually hibernate/resume my desktop to test this (mobo firmware bug).
> 
> Your firmware should not be involved at all in hibernation. As far as
> your firmware is concerned, you shut down and reboot.
> 

Whatever it is, the shutdown never completes, and the reboot isn't a
resume.

> > If anyone who uses this hook wants to test this, please do and report back.
> 
> I feel this approach is extremely dangerous due to the fragile state of
> file systems during hibernation.
> 
> Note that the file system must not change at all while the system is
> hibernated. The kernel's internal state is saved in the hibernation
> image - if the file system changed while the system was hibernated, the
> on-disk and in-memory state will be inconsistent, almost guaranteeing
> corruption (I have experienced this first-hand).
> 
> Now, your new approach does not preserve the ordering: Trying to resume
> from a hibernation image MUST happen BEFORE any of the hibernated
> system's mounted file systems are touched.
> 
> Now imagine the following situation: You have two hard drives, one
> holding your root file system, the other holding your swap (and thus
> hibernation image). Now imagine the following order of events:
> 
> * Linux loads, starts /init
> * Udev is started
> * Hard drive A is detected.
> * fsck is started, repairs the "dirty" root file system (and changes
> on-disk structures, clears the journal, ...)

Who/what triggered fsck here? Why is fsck being run before the udev
event queue is flushed?

> * Hard drive B is detected -> udev starts the resume procedure.

...

> Now, you have a casual file system corruption. This should easily be
> reproducible by putting your swap on a hard drive and the root file
> system on SSD. To reproduce this more reliably, put the swap on USB instead.
> 
> This patch gets as many -1's from me as I can find. Hibernation is
> dangerous for your data as it is; this patch plays Russian data roulette.

I can understand the USB case and how it might be bad news if you
plugged in a matching resume device at some arbitrary point during
normal operation, but I don't really think your earlier posted order of
events paints a realistic picture of what happens in early userspace.
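
For reference, the ordering constraint under dispute (a resume attempt must come before anything modifies the hibernated system's file systems) can be written down as a small check. This is only an illustrative sketch: the event labels and the function name are hypothetical, and it models the quoted sequence of events rather than anything mkinitcpio actually does.

```python
from typing import Optional, Sequence, Tuple

# (action, target) pairs; the labels are hypothetical, for illustration only.
Event = Tuple[str, str]

# Actions assumed to modify on-disk file system state.
MODIFYING_ACTIONS = {"fsck", "mount_rw"}


def first_unsafe_event(events: Sequence[Event]) -> Optional[Event]:
    """Return the first event that modifies a file system before any
    resume attempt has been made, or None if the ordering is safe."""
    resume_attempted = False
    for action, target in events:
        if action == "resume":
            resume_attempted = True
        elif action in MODIFYING_ACTIONS and not resume_attempted:
            return (action, target)
    return None


# The order of events from the quoted scenario.
quoted_order = [
    ("detect", "drive A"),
    ("fsck", "root"),
    ("detect", "drive B"),
    ("resume", "swap"),
]

# The same events with the resume attempt ahead of fsck.
safe_order = [
    ("detect", "drive A"),
    ("detect", "drive B"),
    ("resume", "swap"),
    ("fsck", "root"),
]

print(first_unsafe_event(quoted_order))
print(first_unsafe_event(safe_order))
```

The check says nothing about whether the first sequence can actually occur in early userspace; it only states the invariant that both sides of the thread agree must hold.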

I'll ask Harald about this -- Dracut uses a very similar mechanism (but
a bit more complex/convoluted).
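
As a rough illustration of the mechanism discussed in this thread, the sketch below builds a udev rule that writes a matching block device's major:minor into /sys/power/resume. It is not the rule from the patch, whose text is not reproduced here. The function name `resume_rule`, the `UUID=`/`LABEL=` input format, and the exact match keys are assumptions; `%M` and `%m` are udev's substitutions for the device's major and minor numbers.

```python
def resume_rule(tag: str) -> str:
    """Build an illustrative udev rule for a resume device given as
    'UUID=...' or 'LABEL=...'. Hypothetical helper, not mkinitcpio code."""
    key, sep, value = tag.partition("=")
    env_keys = {"UUID": "ID_FS_UUID", "LABEL": "ID_FS_LABEL"}
    if not sep or key not in env_keys or not value:
        raise ValueError("expected UUID=<value> or LABEL=<value>: %r" % tag)
    # %%M and %%m become the literal udev substitutions %M and %m.
    return (
        'ACTION=="add", SUBSYSTEM=="block", ENV{%s}=="%s", '
        'RUN+="/bin/sh -c \'echo %%M:%%m > /sys/power/resume\'"'
        % (env_keys[key], value)
    )


# The UUID below is a made-up placeholder.
print(resume_rule("UUID=0123-abcd"))
```

Because the rule matches on a file system property instead of a fixed device number, it does not depend on the major:minor staying the same across boots, which is the problem the patch description sets out to solve.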