[arch-general] A question about handling a system with two wired network interfaces?
I have a system with two ethernet sockets on the motherboard, and until very recently I have been finding that my network failed to come up at random during the boot process. I have the ethernet cable plugged into only one of the two sockets, and assign the names eno1 and eno2 to the interfaces - because left to its own devices it named one of them "eno1" and the other "eth0" or "eth1" at random between boots! I was surprised that it chose ethX at all for the second name, given this is running with the new naming scheme!
However, following the hints at: https://wiki.archlinux.org/index.php/Udev#Network_device I specified that udev fix the two interface names as eno1 and eno2 - but still the network sometimes failed to come up at boot. It seemed to be pure chance whether it came up on any particular boot. I tried both NetworkManager and ifplugd/netcfg and both seemed to behave the same way - at any one time I only had one or the other scheme for getting the network up.
What I then presumed was that although the same two names are always assigned, the hardware MAC address of each of the two physical network ports was being matched differently to the two names - i.e. a given port was sometimes eno1 and sometimes eno2. Why this matters is that I assign a static IP address to eno1 only and don't use eno2. This would only be an issue if there was more than one NIC in the system.
The system boots really fast as it has SSD drives, which possibly was not allowing enough time for the name assignment, so I then found the advice at: https://wiki.archlinux.org/index.php/Rename_network_interfaces where it suggests editing the systemd unit file for, in my case, the NetworkManager service - which I then did, and this finally seems to have fixed the problem, though I will still need to boot over a period of days to check if the network comes up "every" time now.
The wiki page refers to the network service file, but I applied this to the NetworkManager service file. So the question that I hope someone can answer is whether in the case of multiple NICs in a system the new network names, although the same "set" of names is used for all the cards in the system, will still need this systemd unit file amendment - i.e. adding the two lines:
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
to the [Unit] section in order to get the correct name associated with the same hardware MAC address every boot?
If someone who understands these things in detail could confirm if my understanding is correct I would appreciate it. Many thanks.
-- mike c
Hi Mike,
On Tue, Feb 19, 2013 at 3:32 PM, Mike Cloaked
I have a system with two ethernet sockets on the motherboard, and I have until very recently been finding that my network at random failed to come up during the boot process.
Just to be clear: the problem is still occurring (I am confused by "until recently")? Was it brought on by the recent udev naming change, or is it a long-standing problem?
I have the ethernet cable plugged into only one of the two sockets, and assign the names to the interfaces as eno1 and eno2 - because left to its own it assigned one of them as name "eno1" and the other to "eth0" or "eth1" at random between boots! I was surprised that it chose ethX at all for the second name, given this is running with the new naming scheme!
What you are seeing is that udev is able to reliably name one of your devices (eno1), which (unless you add any custom rules) will always refer to the same network port between reboots (in particular, it refers to the first on-board network device). However, the other device udev is not able to reliably name, so it leaves it alone. This explains why it still has a name in the kernel namespace (ethX), and why it alternates between eth0 and eth1.
However following the hints at:
https://wiki.archlinux.org/index.php/Udev#Network_device
I specify that udev fix the two interface names as eno1 and eno2
It is probably better to use different names than the enoX ones (though not strictly speaking necessary) if you specify them using your custom scheme (as outlined in the wiki article you link to). The reason is that the enoX names have a specific meaning (they are the on-board devices in the order given by the hardware).
Currently, udev will not give out stable interface names based on MAC addresses even though it can. To overcome this, you could put this into /etc/udev/rules.d/80-net-name-slot.rules:
ACTION=="remove", GOTO="net_name_slot_end"
SUBSYSTEM!="net", GOTO="net_name_slot_end"
NAME!="", GOTO="net_name_slot_end"
NAME=="", ENV{ID_NET_NAME_ONBOARD}!="", NAME="$env{ID_NET_NAME_ONBOARD}"
NAME=="", ENV{ID_NET_NAME_SLOT}!="", NAME="$env{ID_NET_NAME_SLOT}"
NAME=="", ENV{ID_NET_NAME_PATH}!="", NAME="$env{ID_NET_NAME_PATH}"
NAME=="", ENV{ID_NET_NAME_MAC}!="", NAME="$env{ID_NET_NAME_MAC}"
LABEL="net_name_slot_end"
That should, if the MAC address is reliable, name your ethX device as something like "enx98fe023fa538". Note, however, that udev will first try to ascertain whether the MAC address can be trusted (for instance, some machines will randomly assign the MAC at every boot), so if this does not work for you it (probably) means that udev decided not to trust the MAC address and so did not export the name.
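To illustrate where a name like "enx98fe023fa538" comes from: it is the prefix "en" (Ethernet), then "x", then the MAC address written as lowercase hex without separators. The helper below is a hypothetical sketch for checking what name to expect; mac_to_enx is not a udev tool, and udev may still decline to export the name if it does not trust the MAC.

```shell
# Hypothetical helper, not part of udev: derive the MAC-based interface
# name (the ID_NET_NAME_MAC style) for an Ethernet device.
mac_to_enx() {
    # Strip the colons, lowercase the hex digits, then prepend "enx".
    printf 'enx%s\n' "$(printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f')"
}

mac_to_enx 98:FE:02:3F:A5:38   # prints enx98fe023fa538
```

The MAC address of an interface can be read from /sys/class/net/<interface>/address to feed into the helper.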
What I then presumed was that although the same two names are always assigned, the hardware MAC address of each of the two physical network ports was being matched differently to the two names - i.e. a given port was sometimes eno1 and sometimes eno2. Why this matters is that I assign a static IP address to eno1 only and don't use eno2. This would only be an issue if there was more than one NIC in the system.
The system boots really fast as it has SSD drives, which possibly was not allowing enough time for the name assignment, so I then found the advice at:
https://wiki.archlinux.org/index.php/Rename_network_interfaces
where it suggests editing the systemd unit file for in my case NetworkManager service - which I then did and this finally seems to have fixed the problem though I will still need to boot over a period of days to check if the network comes up "every" time now. The wiki page refers to the network service file, but I applied this to the NetworkManager service file.
So the question that I hope someone can answer is whether in the case of multiple NICs in a system the new network names, although the same "set" of names is used for all the cards in the system, will still need this systemd unit file amendment - i.e. adding the two lines:
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
to the [Unit] section in order to get the correct name associated with the same hardware MAC address every boot?
If someone who understands these things in detail could confirm if my understanding is correct I would appreciate it?
There have been some bugs where network software grabs the device before udev can rename it, which could be solved with the workaround you reference above. However, I doubt this is the case with NetworkManager; as far as I know it is only a problem with dhcpcd.service (but dhcpcd@<your network device>.service works correctly).
HTH,
Tom
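As a sketch of the workaround discussed above, the two lines can be placed in a drop-in file instead of editing the packaged unit, so that a package upgrade does not overwrite them. This assumes a systemd version that supports <unit>.d/*.conf drop-ins; the function name and the file name wait-for-udev.conf are made up for illustration.

```shell
# Hypothetical helper: write a drop-in that makes a unit wait for
# systemd-udev-settle.service. The second argument is the unit directory;
# it defaults to /etc/systemd/system, but any directory can be passed to
# try the function out without touching the system.
write_settle_dropin() {
    unit="$1"
    unit_dir="${2:-/etc/systemd/system}"
    mkdir -p "$unit_dir/$unit.d" || return 1
    cat > "$unit_dir/$unit.d/wait-for-udev.conf" <<'EOF'
[Unit]
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
EOF
}

# Example (as root): write_settle_dropin NetworkManager.service
# then run "systemctl daemon-reload" so systemd picks up the change.
```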
On Tue, Feb 19, 2013 at 4:33 PM, Tom Gundersen
Just to be clear: the problem is still occurring (I am confused by "until recently")? Was it brought on by the recent udev naming change, or is it a long-standing problem?
This is a new system that was installed very recently using the February archiso. So I have no past history of the problem - only current experience.
What you are seeing is that udev is able to reliably name one of your devices (eno1), which (unless you add any custom rules) will always refer to the same network port between reboots (in particular, it refers to the first on-board network device). However, the other device udev is not able to reliably name, so it leaves it alone. This explains why it still has a name in the kernel namespace (ethX), and why it alternates between eth0 and eth1.
Thank you for that explanation - it certainly helps to understand what is going on.
I specify that udev fix the two interface names as eno1 and eno2
It is probably better to use different names than the enoX ones (though not strictly speaking necessary) if you specify them using your custom scheme (as outlined in the wiki article you link to). The reason is that the enoX names have a specific meaning (they are the on-board devices in the order given by the hardware).
Currently, udev will not give out stable interface names based on MAC addresses even though it can. To overcome this, you could put this into /etc/udev/rules.d/80-net-name-slot.rules:
ACTION=="remove", GOTO="net_name_slot_end"
SUBSYSTEM!="net", GOTO="net_name_slot_end"
NAME!="", GOTO="net_name_slot_end"
NAME=="", ENV{ID_NET_NAME_ONBOARD}!="", NAME="$env{ID_NET_NAME_ONBOARD}"
NAME=="", ENV{ID_NET_NAME_SLOT}!="", NAME="$env{ID_NET_NAME_SLOT}"
NAME=="", ENV{ID_NET_NAME_PATH}!="", NAME="$env{ID_NET_NAME_PATH}"
NAME=="", ENV{ID_NET_NAME_MAC}!="", NAME="$env{ID_NET_NAME_MAC}"
LABEL="net_name_slot_end"
That should, if the MAC address is reliable, name your ethX device as something like "enx98fe023fa538". Note, however, that udev will try to first ascertain if the MAC address can be trusted (for instance, some machines will randomly assign the MAC at every boot), so if this does not work for you it (probably) means that udev decided not to trust the MAC address so did not export the name.
same hardware mac address every boot?
If someone who understands these things in detail could confirm if my understanding is correct I would appreciate it?
There have been some bugs where network software grabs the device before udev can rename it, which could be solved with the workaround you reference above. However, I doubt this is the case with NetworkManager; as far as I know it is only a problem with dhcpcd.service (but dhcpcd@<your network device>.service works correctly).
I will check which of the two is enabled - but I have since my last post now switched on dhcpd (server) and named services - after boot the dhcpd service had entered a failed state and I am now wondering if I need to add the systemd-udev-settle.service lines in the service file for dhcpd as well?
The MAC addresses seem to be consistent - and the udev rule does seem to make it boot consistently once I had done that in combination with the systemd-udev-settle.service lines in the service file.
Thanks
-- mike c
On Tue, Feb 19, 2013 at 9:03 PM, Mike Cloaked
I will check which of the two is enabled - but I have since my last post now switched on dhcpd (server) and named services - after boot the dhcpd service had entered a failed state and I am now wondering if I need to add the systemd-udev-settle.service lines in the service file for dhcpd as well?
To add to the information about the named and dhcpd services - I just double checked after posting the previous reply - and although the named service "appears" to be running normally, the dhcpd4 service is in a failed state immediately after boot and the DNS lookups don't work. However, once the system is booted, doing the following in the order shown gets the system working fine:
systemctl restart dhcpd4
systemctl restart named
Once that is done then both the DNS and DHCP server services are operating correctly - however I then have to log out of KDE and back in again, as for example my weather applet within the KDE desktop won't restart until I have got dhcpd4 and named restarted correctly, and then logged out and back in. So I guess there are timing dependency issues for these two services as well. I would value any advice on a workaround for this so that everything is working without manual intervention once the boot is complete.
If it is any help the boot analysis gives:
[root@home1 ~]# systemd-analyze blame
  3023ms systemd-udev-settle.service
   576ms postfix.service
   139ms systemd-remount-fs.service
   138ms NetworkManager.service
   126ms tmp.mount
   113ms sys-kernel-debug.mount
   100ms dev-hugepages.mount
    90ms systemd-udevd.service
    87ms sys-kernel-config.mount
    79ms systemd-udev-trigger.service
    77ms dev-mqueue.mount
    66ms boot-efi.mount
    63ms iptables.service
    57ms systemd-logind.service
    45ms var-spool-mail.mount
    36ms systemd-vconsole-setup.service
    35ms udisks2.service
    35ms polkit.service
    31ms opt.mount
    27ms chrony.service
    24ms systemd-tmpfiles-setup.service
    23ms dhcpd4.service
    23ms systemd-sysctl.service
    13ms rtkit-daemon.service
     9ms systemd-user-sessions.service
     8ms upower.service
     1ms home.mount
So the dhcpd4 service is fast compared to the NetworkManager service.
Thanks in advance.
Mike
-- mike c
On Tue, Feb 19, 2013 at 9:27 PM, Mike Cloaked
So the dhcpd4 service is fast compared to the NetworkManager service.
I know this discussion is now about the incorrect start of dhcpd4 - I can move this to a new topic title if necessary. However I looked at journalctl for the lines when dhcpd4 fails at boot and they are:
Feb 19 21:37:01 home1 dhcpd[327]:
Feb 19 21:37:01 home1 dhcpd[327]: Not configured to listen on any interfaces!
Feb 19 21:37:01 home1 dhcpd[327]:
Feb 19 21:37:01 home1 dhcpd[327]: If you did not get this software from ftp.isc.org, please
Feb 19 21:37:01 home1 dhcpd[327]: get the latest from ftp.isc.org and install that before
Feb 19 21:37:01 home1 dhcpd[327]: requesting help.
Feb 19 21:37:01 home1 dhcpd[327]:
Feb 19 21:37:01 home1 dhcpd[327]: If you did get this software from ftp.isc.org and have not
Feb 19 21:37:01 home1 dhcpd[327]: yet read the README, please read it before requesting help.
Feb 19 21:37:01 home1 dhcpd[327]: If you intend to request help from the dhcp-server@isc.org
Feb 19 21:37:01 home1 dhcpd[327]: mailing list, please read the section on the README about
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno1): preparing device.
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno1): deactivating device (reason 'managed') [2]
Feb 19 21:37:01 home1 NetworkManager[318]: <warn> failed to allocate link cache: (-10) Operation not supported
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno2): carrier is OFF
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno2): new Ethernet device (driver: 'e1000e' ifindex: 3)
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno2): exported as /org/freedesktop/NetworkManager/Devices/1
Feb 19 21:37:01 home1 dhcpd[327]: submitting bug reports and requests for help.
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno2): now managed
Feb 19 21:37:01 home1 dhcpd[327]:
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno2): device state change: unmanaged -> unavailable (reason 'managed') [10 20 2]
Feb 19 21:37:01 home1 dhcpd[327]: Please do not under any circumstances send requests for
Feb 19 21:37:01 home1 NetworkManager[318]: <info> (eno2): bringing up device.
Feb 19 21:37:01 home1 dhcpd[327]: help directly to the authors of this software - please
Feb 19 21:37:01 home1 dhcpd[327]: send them to the appropriate mailing list as described in
Feb 19 21:37:01 home1 dhcpd[327]: the README file.
Feb 19 21:37:01 home1 dhcpd[327]:
Feb 19 21:37:01 home1 dhcpd[327]: exiting.
Feb 19 21:37:01 home1 kernel: e1000e 0000:00:19.0: irq 42 for MSI/MSI-X
Feb 19 21:37:01 home1 systemd[1]: dhcpd4.service: control process exited, code=exited status=1
Feb 19 21:37:01 home1 kernel: IPv6: ADDRCONF(NETDEV_UP): eno1: link is not ready
Feb 19 21:37:01 home1 systemd[1]: Failed to start IPv4 DHCP server.
Feb 19 21:37:01 home1 systemd[1]: Unit dhcpd4.service entered failed state
Then when I restart it manually after boot it gives the lines:
Feb 19 21:38:59 home1 systemd[1]: Starting IPv4 DHCP server...
Feb 19 21:38:59 home1 dhcpd[776]: Wrote 0 deleted host decls to leases file.
Feb 19 21:38:59 home1 dhcpd[776]: Wrote 0 new dynamic host decls to leases file.
Feb 19 21:38:59 home1 dhcpd[776]: Wrote 1 leases to leases file.
Feb 19 21:38:59 home1 systemd[1]: Started IPv4 DHCP server.
-- mike c
Hi Mike,
Lots of stuff going on, so sorry for not answering inline.
* It looks like NetworkManager and dhcpd are stepping on each other's toes. Maybe you want to disable dhcpd and only use NM?
* Any service that cannot deal with network devices appearing or being renamed after it is started (is broken and) should have the After/Wants=systemd-udev-settle.service lines you pasted above. Note that this is not a perfect solution, as udev never knows exactly how long to wait for all the network devices, but it is the best we can do and gives the same behavior as we had pre-systemd.
* I never used dhcpd4 (only ever used dhcpcd), but in principle the correct solution should be to make it one instance per network device (i.e., create a dhcpd4@.service similar to dhcpcd@.service from the dhcpcd package). This would make sure it is only started after the relevant network device appears (and has been given its final name), and it should also make dhcpd itself happy (as it seems to not like being called without a specific interface, judging from your logs).
HTH,
Tom
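A rough sketch of the per-device dhcpd4@.service idea follows. It is untested against the real dhcp package: the binary path /usr/bin/dhcpd and the options chosen (-4 for IPv4, -f to stay in the foreground, -q for quiet startup) are assumptions, and the function name is made up. The device dependency only guarantees that the interface exists under its final name, not that it already has an address, so this alone may not be enough.

```shell
# Sketch only: print a hypothetical dhcpd4@.service template to stdout.
# The binary path and dhcpd options are assumptions; compare with the
# dhcpd4.service shipped by the dhcp package before using any of this.
print_dhcpd4_template() {
    cat <<'EOF'
[Unit]
Description=IPv4 DHCP server on %I
# Start only once the named device exists, and stop if it disappears.
BindsTo=sys-subsystem-net-devices-%i.device
After=sys-subsystem-net-devices-%i.device

[Service]
ExecStart=/usr/bin/dhcpd -4 -f -q %I

[Install]
WantedBy=multi-user.target
EOF
}

# Review the output before saving it as /etc/systemd/system/dhcpd4@.service
print_dhcpd4_template
```

With such a template in place, an instance for one interface would be enabled with something like "systemctl enable dhcpd4@eno1.service".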
On Tue, Feb 19, 2013 at 10:00 PM, Tom Gundersen
Hi Mike,
Lots of stuff going on, so sorry for not answering inline.
* It looks like NetworkManager and dhcpd are stepping on each other's toes. Maybe you want to disable dhcpd and only use NM?
I am confused here - I am running dhcpd as a server not a client - I didn't realise NM could act as a DHCP server?
* Any service that cannot deal with network devices appearing or being renamed after it is started (is broken and) should have the After/Wants=systemd-udev-settle.service lines you pasted above. Note that this is not a perfect solution, as udev never knows exactly how long to wait for all the network devices, but it is the best we can do and gives the same behavior as we had pre-systemd.
Perhaps I should file a bug about this - particularly with dhcpd?
* I never used dhcpd4 (only ever used dhcpcd), but in principle the correct solution should be to make it one instance per network device (i.e., create a dhcpd4@.service similar to dhcpcd@.service from the dhcpcd package). This would make sure it is only started after the relevant network device appears (and has been given its final name), and it should also make dhcpd itself happy (as it seems to not like being called without a specific interface, judging from your logs).
I will try to fiddle with the service files and get it to work - but clearly at the moment it is non-ideal as this is a server and a remote reboot would currently leave it without a working network connection! -- mike c
On Wed, Feb 20, 2013 at 9:05 AM, Mike Cloaked
I will try to fiddle with the service files and get it to work - but clearly at the moment it is non-ideal as this is a server and a remote reboot would currently leave it without a working network connection!
I have finally found a solution that works well for me - I simply did "systemctl enable NetworkManager-wait-online" and rebooted - the dhcpd4 and named services are now running correctly after the system boots - so the dhcpd4 service does indeed get held until the network is properly up so I now have a system working as it should and therefore this particular issue is resolved. systemd-analyze shows a boot time of 8.7 seconds so there is no significant delay by enabling NetworkManager-wait-online service. -- mike c
On Wed, Feb 20, 2013 at 5:37 PM, Mike Cloaked
I will try to fiddle with the service files and get it to work - but clearly at the moment it is non-ideal as this is a server and a remote reboot would currently leave it without a working network connection!
I have finally found a solution that works well for me - I simply did "systemctl enable NetworkManager-wait-online" and rebooted - the dhcpd4 and named services are now running correctly after the system boots - so the dhcpd4 service does indeed get held until the network is properly up so I now have a system working as it should and therefore this particular issue is resolved.
systemd-analyze shows a boot time of 8.7 seconds so there is no significant delay by enabling NetworkManager-wait-online service.
I mentioned your problems to the systemd guys, and they would like to know more. If you don't mind, could you post "lspci -vvv"? Cheers, Tom
On Thu, Feb 21, 2013 at 12:46 AM, Tom Gundersen
I mentioned your problems to the systemd guys, and they would like to know more. If you don't mind, could you post "lspci -vvv"?
I wasn't sure if I could attach a file so here is the output inline:
[mike@home1 dual-nic]$ cat lspci.txt
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
        Subsystem: Intel Corporation Device 2036
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-