[arch-releng] regression with race condition on pxe boot with nbd
Hello everybody,

with my latest builds I see a regression on pxe boot with nbd. About 50% of boots fail. The nbd module is loaded, nbd-client attaches the device, but mount fails:

  mount: you must specify the filesystem type
  ERROR; Failed to mount '/dev/nbd0'
  Falling back to interactive prompt
  You can try to fix the problem manually, log out when you are finished

A simple mount lets the boot continue:

  mount /dev/nbd0 /run/archiso/bootmnt/
  <Ctrl>-d

My guess is that Linux 4.6 introduced a race condition. Any idea how to fix or handle this?
-- 
main(a){char*c=/* Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/* Best regards my address: */=0;b=c[a++];)
putchar(b-1/(/* Chris cc -ox -xc - && ./x */b/42*2-3)*42);}
Christian Hesse on Tue, 2016/05/24 14:35:
> My guess is that linux 4.6 introduced a race condition. Any idea how to fix or handle this?

Looks like adding a boot parameter nbd.nbds_max=2 fixes this (or makes it a lot less likely to happen). However this is still more of a workaround...
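The error suggests that mount could not detect a filesystem on /dev/nbd0 at the moment it ran, although the same mount succeeded moments later. If the cause is indeed a timing issue, one possible way for a hook to tolerate it is to poll the device before mounting. The following is only a sketch under that assumption and is not the change that went into archiso; `wait_for_fs`, its arguments and the use of util-linux `blkid` as the probe are illustrative choices.

```shell
#!/bin/bash
# Hypothetical helper, not part of archiso: poll until a probe command
# succeeds for the device, or give up after a number of attempts.
# Usage: wait_for_fs <device> [attempts] [delay-seconds] [probe-command]
wait_for_fs () {
    local dev="$1" attempts="${2:-10}" delay="${3:-1}" probe="${4:-blkid}"
    local n=0
    while (( n < attempts )); do
        # Assumption: the probe (util-linux blkid by default) exits non-zero
        # as long as it cannot identify anything on the device.
        if "${probe}" "${dev}" > /dev/null 2>&1; then
            return 0
        fi
        n=$(( n + 1 ))
        sleep "${delay}"
    done
    return 1
}

# In a hook this might be used as (paths taken from the report above):
#   wait_for_fs /dev/nbd0 && mount /dev/nbd0 /run/archiso/bootmnt/
```

Like nbd.nbds_max=2, this would only paper over the timing problem; it does not explain why the device is not ready.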
On 05/24/16 09:54, Christian Hesse wrote:
> Looks like adding a boot parameter nbd.nbds_max=2 fixes this (or makes it a lot less likely to happen). However this is still more of a workaround...

Hi Christian,

Did you try booting with earlymodules=nbd to see if it goes better?

Thanks for doing a good job here.
Gerardo Exequiel Pozzi wrote:
> Did you try booting with earlymodules=nbd to see if it goes better?

Yes, looks like earlymodules=nbd works as well. What's the best way to get this into the scripts? Or should we just move modprobe to run_earlyhook()?
On 05/25/16 16:35, Christian Hesse wrote:
> From: Christian Hesse
>
> Signed-off-by: Christian Hesse
> ---
>  archiso/initcpio/hooks/archiso_pxe_nbd | 10 +++++++---

Hola!

I am thinking of releasing another archiso version with these changes before the next ISO. Do you have more patches pending?

Thanks.
On 05/26/16 18:53, Christian Hesse wrote:
> From: Christian Hesse
>
> Booting from iPXE we can set bootif_mac without having BOOTIF around.
>
> Signed-off-by: Christian Hesse
> ---
>  archiso/initcpio/hooks/archiso_pxe_common | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/archiso/initcpio/hooks/archiso_pxe_common b/archiso/initcpio/hooks/archiso_pxe_common
> index 66eecfa..cedf585 100644
> --- a/archiso/initcpio/hooks/archiso_pxe_common
> +++ b/archiso/initcpio/hooks/archiso_pxe_common
> @@ -10,9 +10,12 @@ run_hook () {
>      # /tmp/net-*.conf
>
>      if [[ -n "${ip}" ]]; then
> -        if [[ -n "${BOOTIF}" ]]; then
> +        if [[ -z "${bootif_mac}" && -n "${BOOTIF}" ]]; then
>              bootif_mac=${BOOTIF#01-}
>              bootif_mac=${bootif_mac//-/:}
> +        fi
> +
> +        if [[ -n "${bootif_mac}" ]]; then
>              for i in /sys/class/net/*/address; do
>                  read net_mac < ${i}
>                  if [[ "${bootif_mac}" == "${net_mac}" ]]; then

If bootif_mac becomes a new cmdline parameter, please add it to the docs ;)

Isn't it a bit redundant? The user can set BOOTIF= at the syslinux prompt. What is the advantage here?
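The loop visible in the patch context walks /sys/class/net/*/address and compares each entry with the MAC taken from the boot parameter. A minimal standalone sketch of that lookup follows; `find_iface_by_mac` and its optional sysfs root argument are hypothetical names, and the root is only a parameter so the function can be tried against a fake directory tree. How the real hook turns the match into an interface name is not shown in the quoted hunk.

```shell
#!/bin/bash
# Sketch: print the name of the interface whose 'address' file holds the
# given MAC. Not the archiso implementation.
# Usage: find_iface_by_mac <mac> [sysfs-net-root]
find_iface_by_mac () {
    local mac="$1" root="${2:-/sys/class/net}"
    local i net_mac
    for i in "${root}"/*/address; do
        # Skip unreadable entries (also covers an unmatched glob).
        [[ -r "${i}" ]] || continue
        read -r net_mac < "${i}"
        if [[ "${mac}" == "${net_mac}" ]]; then
            # .../<iface>/address -> <iface>
            i="${i%/address}"
            echo "${i##*/}"
            return 0
        fi
    done
    return 1
}

# Example call on a running system:
#   find_iface_by_mac 88:99:aa:bb:cc:dd
```

The comparison is a plain string match, so the MAC has to be given in the same case and with the same separators as the kernel prints it.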
Gerardo Exequiel Pozzi on Thu, 2016/05/26 21:09:
> If bootif_mac becomes a new cmdline parameter, please add it to the docs ;)
>
> Isn't it a bit redundant? The user can set BOOTIF= at the syslinux prompt. What is the advantage here?

Thinking about this... Just drop the patch. It does not matter what format I give to BOOTIF. So I can use the pxelinux version with hardware type prefix and MAC address including dashes:

  BOOTIF=01-88-99-aa-bb-cc-dd

Or give the MAC address directly:

  BOOTIF=88:99:aa:bb:cc:dd

Right? So I will adjust my boot parameters to always use BOOTIF=.
On 05/27/16 03:44, Christian Hesse wrote:
> Gerardo Exequiel Pozzi on Thu, 2016/05/26 21:09:
>> Is not a bit redundant? User can set BOOTIF= at syslinux prompt. what is the advantage here?
>
> Thinking about this... Just drop the patch. It does not matter what format I give to BOOTIF. So I can use the pxelinux version with hardware type prefix and mac address including dashes:
>
>   BOOTIF=01-88-99-aa-bb-cc-dd
>
> Or give the mac address directly:
>
>   BOOTIF=88:99:aa:bb:cc:dd
>
> Right? So I will adjust my boot parameters to always use BOOTIF=.

Yes, both forms are valid :)

The only warning here concerns the dash form: always prepend 01- if your MAC starts with 01, otherwise its first octet will be taken as the hardware type [HTYPE] (01 for Ethernet).

This is also valid, but not recommended:

  BOOTIF=88-99-aa-bb-cc-dd
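The behaviour described here follows from the two parameter expansions in the hook (`${BOOTIF#01-}` and `${bootif_mac//-/:}`). A small sketch that wraps them in a function makes the cases easy to try; `normalize_bootif` is a hypothetical name, not something archiso defines.

```shell
#!/bin/bash
# The two expansions from the hook, wrapped in a hypothetical function.
normalize_bootif () {
    local mac="${1#01-}"    # drop a leading hardware-type prefix, if present
    echo "${mac//-/:}"      # turn dashes into colons
}

normalize_bootif 01-88-99-aa-bb-cc-dd   # prints 88:99:aa:bb:cc:dd
normalize_bootif 88:99:aa:bb:cc:dd      # prints 88:99:aa:bb:cc:dd
normalize_bootif 88-99-aa-bb-cc-dd      # prints 88:99:aa:bb:cc:dd

# Dash form for a MAC that itself starts with 01:
normalize_bootif 01-02-03-04-05-06      # prints 02:03:04:05:06 (first octet lost)
normalize_bootif 01-01-02-03-04-05-06   # prints 01:02:03:04:05:06
```

The fourth call shows the pitfall: without the extra 01- prefix, the first octet of the MAC is stripped as if it were the hardware type.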
On 05/26/16 18:53, Christian Hesse wrote:
> From: Christian Hesse
>
> Signed-off-by: Christian Hesse
> ---
>  archiso/initcpio/hooks/archiso_pxe_common | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/archiso/initcpio/hooks/archiso_pxe_common b/archiso/initcpio/hooks/archiso_pxe_common
> index cedf585..adadefc 100644
> --- a/archiso/initcpio/hooks/archiso_pxe_common
> +++ b/archiso/initcpio/hooks/archiso_pxe_common
> @@ -39,6 +39,12 @@ run_hook () {
>
>          pxeserver=${ROOTSERVER}
>
> +        # If neither BOOTIF nor bootif_mac have been set from bootloader we do
> +        # not know the boot interface, yet. Get it from ipconfig output now.
> +        if [[ -z "${bootif_dev}" ]]; then
> +            bootif_dev="${DEVICE}"
> +        fi
> +
>          # setup DNS resolver
>          if [[ "${IPV4DNS0}" != "0.0.0.0" ]]; then
>              echo "nameserver ${IPV4DNS0}" > /etc/resolv.conf

I guess this is not needed (now that you know about BOOTIF=), right?
Gerardo Exequiel Pozzi wrote:
> I guess this is not needed (now that you know about BOOTIF=), right?

My setup works without it now, and users of pxelinux and iPXE are fine. Are there any other PXE boot loaders that do not support giving the MAC address via boot parameter?
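For a boot loader that passes no MAC address at all, the fallback in the patch above takes the interface name from the variables that ipconfig writes to /tmp/net-*.conf. A rough sketch of that logic on a stand-in file follows; `pick_bootif_dev` is a hypothetical name, and the file content is an assumed example, not captured ipconfig output.

```shell
#!/bin/bash
# Sketch of the fallback: prefer an interface that is already known,
# otherwise use DEVICE from the ipconfig result file.
# Usage: pick_bootif_dev <known-device-or-empty> <conf-file>
pick_bootif_dev () {
    local bootif_dev="$1" conf="$2"
    # Keep the sourced variables out of the global scope.
    local DEVICE IPV4ADDR
    . "${conf}"
    if [[ -z "${bootif_dev}" ]]; then
        bootif_dev="${DEVICE}"
    fi
    echo "${bootif_dev}"
}

# Stand-in for /tmp/net-eth0.conf (assumed format and values).
conf=$(mktemp)
cat > "${conf}" <<'EOF'
DEVICE='eth0'
IPV4ADDR='192.0.2.10'
EOF

pick_bootif_dev ""     "${conf}"   # prints eth0
pick_bootif_dev "eth1" "${conf}"   # prints eth1
rm -f "${conf}"
```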
Christian Hesse on Thu, 2016/05/26 23:53:
> From: Christian Hesse
>
> Signed-off-by: Christian Hesse
> ---
>  archiso/initcpio/hooks/archiso_pxe_common | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/archiso/initcpio/hooks/archiso_pxe_common b/archiso/initcpio/hooks/archiso_pxe_common
> index adadefc..1a9fe9d 100644
> --- a/archiso/initcpio/hooks/archiso_pxe_common
> +++ b/archiso/initcpio/hooks/archiso_pxe_common
> @@ -1,7 +1,8 @@
>  # vim: set ft=sh:
>
>  run_hook () {
> -    local i net_mac bootif_mac bootif_dev
> +    # Do *not* declare 'bootif_dev' local! We need it in run_latehook().
> +    local i net_mac bootif_mac
>      # These variables will be parsed from /tmp/net-*.conf generated by ipconfig
>      local DEVICE
>      local IPV4ADDR IPV4BROADCAST IPV4NETMASK IPV4GATEWAY IPV4DNS0 IPV4DNS1

This one is most important. :D Did you miss it?
On 05/27/16 15:01, Christian Hesse wrote:
> This one is most important. :D Did you miss it?

Whoops, I confused it with bootif_mac! Pushing... ¡Gracias!
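Why the `local` declaration matters can be shown with two stand-in functions. This is a sketch that assumes, as the patch comment implies, that run_hook() and run_latehook() are called in the same shell; the function names below are illustrative and not the archiso ones.

```shell
#!/bin/bash
# A variable declared 'local' disappears when the function returns, so a
# function called later cannot see the value.
hook_with_local () {
    local bootif_dev
    bootif_dev="eth0"
}

hook_without_local () {
    bootif_dev="eth0"
}

late_hook () {
    echo "bootif_dev='${bootif_dev}'"
}

unset bootif_dev
hook_with_local
late_hook        # prints bootif_dev=''

hook_without_local
late_hook        # prints bootif_dev='eth0'
```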
Gerardo Exequiel Pozzi wrote:
> I am thinking of releasing another archiso version with these changes before the next ISO. Do you have more patches pending?

Bringing down a network interface in copy-to-ram mode has been broken since... ever. (And flushing broke with e018653a.) I investigated and prepared four more patches. That's it for now, I think.
participants (2)
- Christian Hesse
- Gerardo Exequiel Pozzi