[arch-projects] [netctl] netctl, cloud-init, and systemd

Tue Jun 18 08:06:47 UTC 2019

Hi again,

thank you both for your input! See comments inline:

On 6/17/19 10:20 PM, Jouke Witteveen via arch-projects wrote:
> On Mon, Jun 17, 2019 at 9:45 PM Erich Eckner via arch-projects
> <arch-projects at archlinux.org> wrote:
>>> In case you are not familiar with cloud-init, the idea is that you can
>>> build a single OS image that runs cloud-init on boot, and cloud-init
>>> will take care of such things as network configuration, so that the same
>>> image will work regardless of the network setup you choose for the cloud
>>> instance.
>>
>> Does cloud-init run before or after systemd? In other words: is it a
>> systemd unit of some kind or is it rather an init daemon itself which
>> chain-loads systemd?

Cloud-init comes with multiple systemd units and as such is run by
systemd multiple times at different stages during the boot process. The
cloud-init wiki page has a rough overview:
https://wiki.archlinux.org/index.php/Cloud-init#Systemd_integration

>>> The current cloud-init implementation for Arch uses netctl [3]. The
>>> implementation is correct in such a way that it does indeed render the
>>> right netctl profile(s) and enables them. However there is a problem:
>>> they are not being started. AFAICT this is because cloud-init does this
>>> while the systemd boot is already in process, and changing the
>>> dependency graph (by adding new units) does not have any effect until
>>> the next run (everything works right on second boot). Note that I even
>>> tried having cloud-init run `systemctl daemon-reload` after enabling the
>>> units, but it didn't help either.
>>
>> Did you try cloud-init to issue "systemctl start $unitname.service"
>> additionally to "systemctl enable $unitname.service"? This seems to me to
>> be the right way.

It might be worth taking another look at that, but let me quickly lay
out why I didn't try this yet: when cloud-init runs for the first time,
it goes through a bunch of plugins called "data sources", which will
probe different aspects of the environment to determine the cloud
provider it is running in, use that knowledge to retrieve
vendor-specific configuration details, and use that to write e.g.
network config, hostname, etc. The tricky part is that for example the
EC2 data source uses a magic IP to retrieve this config (see
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html#instancedata-data-retrieval),
and other data sources might do similar things. Hence, I was worried
that prematurely re-configuring the network might interfere with such
actions (the unit running this has "Before=network-pre.target").

However, if cloud-init first fetches all data and then configures things
it might not be a problem. I'll take a closer look at what is happening
there and maybe try to get a statement from the cloud-init folks.

>>> The reason I am posting this here is that this seems to be an issue due
>>> to the particular way netctl uses systemd units. Since you don't know the
>>> names or the number of profiles (units) that will be generated during
>>> image creation, you cannot enable them at that time. But doing so during
>>> first boot does not seem to work.
>>
>> I would rather say it's due to the way cloud-init uses systemd units: it
>> enables them, but that's only relevant for successive boots, so it should
>> rather enable and start them (systemd should still honor the dependencies
>> of the units and postpone the start to the point where all of the
>> dependencies are loaded, too).
>>
>>>
>>> Just for comparison, if one were to use e.g. systemd-networkd instead,
>>> you would just enable the systemd-networkd unit during image creation,
>>> cloud-init could generate the appropriate config for any number of
>>> devices, and when the unit starts it will do the right thing. Likewise
>>> on other distros, e.g. Debian with /etc/network/interfaces or such.
>>>
>>> Now, from my point of view, there could be several approaches to solve
>>> this:
>>>
>>> 1. systemd supports updates of the dep graph during boot
>>> 2. support such a use case in netctl
>>> 3. change cloud-init to use systemd-networkd for Arch
>>>
>>> Let me quickly elaborate:
>>>
>>> 1. is intentionally not phrased as something to be done. It might
>>> already be a thing, I just couldn't figure out how to do it. If someone
>>> knows more about this, I would love to hear about it. If this works, it
>>> would be the easiest solution. However, if it doesn't, I don't have my
>>> hopes up high for this being added to systemd anytime soon.
>>
>> This would mean, if I "systemctl enable $some.service", it will be started
>> right away, too - probably not what systemd devs want (at least it's
>> not what systemd currently does).
> 
> `systemctl enable --now <SERVICE>` starts a service in addition to enabling it.

Might be an option, see above.
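
A minimal sketch of what "enable and start" could look like from
cloud-init's side, under stated assumptions: the helper names are
hypothetical and not part of cloud-init, and only the `netctl enable`
and `netctl start` subcommands are used. By default it only builds the
commands; whether actually starting profiles this early in boot is safe
is exactly the open question discussed above.

```python
import subprocess
from typing import List


def netctl_activation_commands(profiles: List[str]) -> List[List[str]]:
    """Build the argv lists to enable and then start each netctl profile.

    Hypothetical helper for illustration only.
    """
    commands = []
    for profile in profiles:
        # "enable" makes the profile come up on subsequent boots.
        commands.append(["netctl", "enable", profile])
        # "start" brings it up during the current boot as well.
        commands.append(["netctl", "start", profile])
    return commands


def activate(profiles: List[str], dry_run: bool = True) -> List[List[str]]:
    """Return the commands; run them only when dry_run is False."""
    commands = netctl_activation_commands(profiles)
    if not dry_run:
        for argv in commands:
            # check=True raises CalledProcessError if a command fails.
            subprocess.run(argv, check=True)
    return commands
```

Calling `activate(["eth0"])` returns the two commands for the profile
"eth0" without touching the system; passing `dry_run=False` would
execute them.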

>>> 2. is the main reason I am writing this. Things that came to mind were
>>> another special unit (netctl-all?), or even just a well-defined
>>> interface to write devices into the state file, so that the plain netctl
>>> unit would work. I would be very interested to hear how such a thing
>>> sounds to you, the developers?
>>
>> There is currently netctl-auto@.service, but this requires knowing the
>> interfaces in advance. Maybe the netctl devs can consider adding another
>> unit which is interface agnostic? "netctl-auto.service" maybe? (I'm not
>> familiar with netctl's internals - maybe this is not possible at all)
> 
> Indeed, there are two more options to achieve what I think you want.
> 1. Use "netctl-ifplugd@<INTERFACE>", see also: netctl.special(7). This
> requires ifplugd to be installed and takes all profiles for an
> interface into consideration, so you don't need to know the name of
> the profile in advance. Of course, you do need to know the name of the
> interface.

Same problem, the interface names are also unknown, but could this maybe
be combined with the "Using any interface" method?
https://wiki.archlinux.org/index.php/Netctl#Using_any_interface
I'll take a closer look at that as well.

> 2. Use "netctl(.service)", see also: netctl.special(7), and write the
> profile name to "/var/lib/netctl/netctl.state". This only works if
> cloud-init runs before systemd, or at least finishes before the netctl
> service is started.

That's pretty much what I was referring to with "the plain netctl unit",
except of course it's a service :/ This was a likely candidate for me.
The only catch I noticed is that the location of the file can be
overridden via an environment variable, but I suppose handling that in
code would be easy. I just thought it would be even cleaner if there
were a netctl command for that (like `netctl store PROFILE` to
force-store a profile?). Is this something you would consider? :)
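
A minimal sketch of the state-file approach, under stated assumptions:
the helper name is hypothetical, the file is assumed to hold one profile
name per line, and the overriding environment variable is assumed to be
called NETCTL_STATE_FILE. None of this is confirmed in this thread, so
it should be checked against the netctl sources before relying on it.

```python
import os
from typing import Iterable, Optional

# Default location as quoted in the thread.
DEFAULT_STATE_FILE = "/var/lib/netctl/netctl.state"


def write_netctl_state(profiles: Iterable[str],
                       state_file: Optional[str] = None) -> str:
    """Write profile names to the netctl state file and return its path.

    Hypothetical helper. Assumes one profile name per line and that the
    NETCTL_STATE_FILE environment variable (assumed name) overrides the
    default location.
    """
    if state_file is None:
        state_file = os.environ.get("NETCTL_STATE_FILE", DEFAULT_STATE_FILE)
    directory = os.path.dirname(state_file) or "."
    os.makedirs(directory, exist_ok=True)
    # Write to a temporary file first, then rename it into place, so a
    # reader never sees a half-written state file.
    tmp_path = state_file + ".tmp"
    with open(tmp_path, "w") as handle:
        for profile in profiles:
            handle.write(profile + "\n")
    os.replace(tmp_path, state_file)
    return state_file
```

As noted in the quoted reply, this only helps if the file is written
before the netctl service is started.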

>>> 3. Is of course an option, but would require quite a bit of work in
>>> cloud-init. That work, if done right, might however at some point
>>> benefit other distros, should they be using systemd-networkd as well.
>>> The main reason I am also bringing this up is that I was wondering if
>>> there are possibly any plans to abandon netctl anyways at some point in
>>> favor of distro-agnostic solutions (be it systemd-networkd or any other).
> 
> netctl is stable and I intend to keep maintaining it. It should work
> without adaptations on any linux distribution that uses systemd.

Ok, that's great to know, I'd prefer sticking to that then!

Thanks a lot again, this gives me a bit more stuff to work with, I'll
keep you posted, and hopefully we have a fully supported Arch in
cloud-init soon :)

Cheers,
Conrad