[arch-general] Can I prevent Pacman from running hooks?
Hi Arch Community,
I have scripts that will install a set of Arch Linux machines for me with all the tweaks I want. These scripts run pacman a lot to install bits and pieces. Usually the script will run pacman to install one package and then configure that package and then proceed to install the next (set of) packages. This works great, but a noticeable part of the run time is spent on running pacman hooks (e.g. to update the man-db and similar things).
So I am wondering: Is it possible to stop pacman from running hooks during package installation? I do know when it is "safe" to not run hooks (when the system is not complete yet) and when I need to run all of them (right after the system has all packages installed that it will have).
So far I run pacman with a --hookdir that contains symlinks to /dev/null named like some of the more expensive hooks that tend to take long to complete. But a --no-hooks option to pacman would be great for my use case.
Is it possible to run pacman without it triggering hooks?
Best Regards, Tobias
On Sat, 13 Oct 2018 10:11:10 +0200 Tobias Hunger via arch-general <arch-general@archlinux.org> wrote:
Hi Arch Community,
I have scripts that will install a set of Arch Linux machines for me with all the tweaks I want. These scripts run pacman a lot to install bits and pieces. Usually the script will run pacman to install one package and then configure that package and then proceed to install the next (set of) packages. This works great, but a noticeable part of the run time is spent on running pacman hooks (e.g. to update the man-db and similar things).
So I am wondering: Is it possible to stop pacman from running hooks during package installation? I do know when it is "safe" to not run hooks (when the system is not complete yet) and when I need to run all of them (right after the system has all packages installed that it will have).
So far I run pacman with a --hookdir that contains symlinks to /dev/null named like some of the more expensive hooks that tend to take long to complete. But a --no-hooks option to pacman would be great for my use case.
Is it possible to run pacman without it triggering hooks?
Best Regards, Tobias
Some hooks take into account the specific files that were installed, so you cannot run them later. Why not just install everything at once?
On Sat, Oct 13, 2018, 10:15 Doug Newgard via arch-general < arch-general@archlinux.org> wrote:
Some hooks take into account the specific files that were installed, so you cannot run them later.
I am aware of that. This is an optimization that avoids running hooks needlessly. All the hooks I read so far are safe to run at any time.
Why not just install everything at once?
I run an immutable and stateless setup. So I cannot actually update systems (they are immutable after all). So I end up having my CI generate images for my systems every couple of hours. Those will then replace the images I run eventually, getting me to a new updated system state.
The CI builds a very basic system and configures that. It then creates more specialized systems based on that base image (e.g. one for VMs, one for containers and another one for bare metal). It then continues to branch out from those till all the actual systems I want to install are reached.
Overall this approach (somewhat inspired by Dockerfiles) saves a lot of time over just creating each system from scratch, and it also makes sure all systems have the same basic features: they all inherit them from a common base, after all.
It would suffice to run all hooks in the leaves of the tree of systems (just before writing the actual HDD image files) and skip them for all others.
Best Regards, Tobias
On 13/10/2018 13:12, Tobias Hunger via arch-general wrote:
I run an immutable and stateless setup. So I cannot actually update systems (they are immutable after all). So I end up having my CI generate images for my systems every couple of hours. Those will then replace the images I run eventually, getting me to a new updated system state.
As a slight side-step, might it be possible to generate the images in a ramdisk/tmpfs? That should remove disk I/O as a bottleneck. Another option, given you know the expected end state, could be to bypass pacman and extract the package file content into place directly (e.g. with tar), then run whatever hooks you want afterwards. Yes, you'd lose package management within the images but that doesn't matter with an immutable image. (Though, there may be side-effects I haven't considered here.)
On Sat, Oct 13, 2018 at 2:25 PM Jonathon Fernyhough <jonathon@manjaro.org> wrote:
As a slight side-step, might it be possible to generate the images in a ramdisk/tmpfs? That should remove disk I/O as a bottleneck.
That is possible.
Another option, given you know the expected end state, could be to bypass pacman and extract the package file content into place directly (e.g. with tar), then run whatever hooks you want afterwards. Yes, you'd lose package management within the images but that doesn't matter with an immutable image. (Though, there may be side-effects I haven't considered here.)
The price is losing package management. I do want to keep that to drag in dependencies. Best Regards, Tobias
On Sat, 13 Oct 2018 14:12:11 +0200 Tobias Hunger via arch-general <arch-general@archlinux.org> wrote:
It would suffice to run all hooks in the leaves of the tree of systems (just before writing the actual HDD image files) and skip them for all others.
Again, no, it wouldn't. The hooks would not run correctly.
On Sat, Oct 13, 2018 at 5:15 PM Doug Newgard via arch-general <arch-general@archlinux.org> wrote:
Again, no, it wouldn't. The hooks would not run correctly.
What makes you say so? I see nothing in the alpm-hooks man page that implies that this would not work. Best Regards, Tobias
On Sat, 13 Oct 2018 18:41:45 +0200 Tobias Hunger via arch-general <arch-general@archlinux.org> wrote:
What makes you say so?
I see nothing in the alpm-hooks man page that implies that this would not work.
Best Regards, Tobias
Because, as I said earlier, hooks can and do take into account the specific files being installed. If you install one package that needs a specific hook, running that hook later will have the correct file list and will not run correctly.
On Sat, 13 Oct 2018 11:44:37 -0500 Doug Newgard via arch-general <arch-general@archlinux.org> wrote:
Because, as I said earlier, hooks can and do take into account the specific files being installed. If you install one package that needs a specific hook, running that hook later will have the correct file list and will not run correctly.
That should read "will not have the correct file list"
On Sat, Oct 13, 2018 at 6:44 PM Doug Newgard via arch-general <arch-general@archlinux.org> wrote:
Because, as I said earlier, hooks can and do take into account the specific files being installed. If you install one package that needs a specific hook, running that hook later will have the correct file list and will not run correctly.
Most hooks are just run and do not care about any input. Some of the hook scripts take a list from stdin. By the way: It would be nice if that was documented in the alpm-hooks man page.
I see no reason why I cannot generate this file list right when I want to run the hooks. In my setup I can ignore anything but the Install hooks. For those I just need to apply the glob patterns in the Target fields. That should not be too hard.
Best Regards, Tobias
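The idea of regenerating a target list from the Target fields could look roughly like the sketch below. It is only an illustration under stated assumptions, not how pacman does it: `parse_hook` and `matched_targets` are hypothetical helper names, only path-based triggers are considered (Type `Path`, or `File`, which I believe older pacman releases used), the Operation field is ignored, and Python's `fnmatch` is assumed to be close enough to libalpm's glob matching. The example hook is made up, loosely modelled on the texinfo case.

```python
from fnmatch import fnmatchcase


def parse_hook(text):
    """Collect the Type and Target patterns of each [Trigger] section."""
    triggers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            # Start a new record for [Trigger]; skip keys of other sections.
            current = {"type": None, "targets": []} if line == "[Trigger]" else None
            if current is not None:
                triggers.append(current)
            continue
        if current is None or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "Type":
            current["type"] = value
        elif key == "Target":
            current["targets"].append(value)
    return triggers


def matched_targets(hook_text, installed_paths):
    """Return the installed paths that match any Target of a path trigger."""
    patterns = [
        pattern
        for trigger in parse_hook(hook_text)
        if trigger["type"] in ("Path", "File")  # assumption: both names occur
        for pattern in trigger["targets"]
    ]
    return sorted(
        path
        for path in installed_paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    )


# Hypothetical hook file content, for illustration only.
EXAMPLE_HOOK = """
[Trigger]
Type = Path
Operation = Install
Target = usr/share/info/*

[Action]
When = PostTransaction
Exec = /bin/true
NeedsTargets
"""

paths = ["usr/share/info/foo.info.gz", "usr/bin/foo", "usr/share/man/man1/foo.1.gz"]
print(matched_targets(EXAMPLE_HOOK, paths))
```

The list of installed paths would have to come from somewhere else (for instance the package file lists), which this sketch leaves open.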
On 10/13/18 at 06:58pm, Tobias Hunger via arch-general wrote:
Some of the hook scripts take a list from stdin. By the way: It would be nice if that was documented in the alpm-hooks man page.
"NeedsTargets Causes the list of matched trigger targets to be passed to the running hook on stdin."
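For illustration, the receiving side of NeedsTargets might look like the sketch below. It assumes the matched targets arrive one per line on standard input; `read_targets` is a hypothetical helper name, and a `StringIO` stands in for stdin so the sketch runs without pacman.

```python
import io


def read_targets(stream):
    """Return the non-empty lines of the stream, one matched target per line."""
    return [line.rstrip("\n") for line in stream if line.strip()]


# Stand-in for what a hook's Exec program would see on stdin.
# In a real hook, pass sys.stdin instead of this StringIO.
demo_input = io.StringIO("usr/share/info/a.info.gz\nusr/share/info/b.info.gz\n")
targets = read_targets(demo_input)
for target in targets:
    print("would process:", target)
```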
On 10/13/18 4:11 AM, Tobias Hunger via arch-general wrote:
I have scripts that will install a set of Arch Linux machines for me with all the tweaks I want. These scripts run pacman a lot to install bits and pieces. Usually the script will run pacman to install one package and then configure that package and then proceed to install the next (set of) packages. This works great, but a noticeable part of the run time is spent on running pacman hooks (e.g. to update the man-db and similar things).
If there are specific hooks you don't want to use, you can use the default HookDir (as documented in alpm-hooks(5) *and* in pacman.conf(5), this is "/etc/pacman.d/hooks/") to mask them with symlinks to /dev/null.
You've responded that you need to batch each image installation process using incremental runs inspired by Docker, which I sort of understand, but I don't really see how delaying execution until the end is a good idea here.
Consider things like the texinfo package, which installs a hook to run many install-info processes, once for each file in usr/share/info/ that the hook detects. I guess you could write your own custom handling for this and just try to install the whole directory at the end, but you'd need to judge based on the hook, and adapt to a unique situation.
And it's irrelevant, since delaying execution will not provide benefits over doing it every time a file is installed -- on the contrary, delaying execution means you repeat some of the base image work separately for each child image.
You should override hooks by hand, when you know what they do, know that they don't require targets, and know that you're handling it yourself at the end.
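The masking approach described above (a same-named symlink to /dev/null in a hook directory) can be sketched as follows. This is a minimal illustration, not an official tool: `mask_hooks` is a hypothetical helper, the hook file name `man-db.hook` is an assumption, and the demonstration writes to a throwaway directory rather than to /etc/pacman.d/hooks/.

```python
import os
import tempfile


def mask_hooks(hook_dir, hook_names):
    """Create /dev/null symlinks in hook_dir, named after the hooks to mask."""
    os.makedirs(hook_dir, exist_ok=True)
    masked = []
    for name in hook_names:
        link = os.path.join(hook_dir, name)
        if os.path.lexists(link):
            # Leave anything that already exists untouched.
            continue
        os.symlink("/dev/null", link)
        masked.append(link)
    return masked


# Demonstration in a temporary directory; a real run would target
# /etc/pacman.d/hooks/ (or a directory passed via --hookdir) as root.
demo_dir = os.path.join(tempfile.mkdtemp(), "hooks")
created = mask_hooks(demo_dir, ["man-db.hook"])  # file name is an assumption
print(created)
```

Calling the helper a second time with the same names creates nothing new, so it can be run on every image build.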
So I am wondering: Is it possible to stop pacman from running hooks during package installation? I do know when it is "safe" to not run hooks (when the system is not complete yet) and when I need to run all of them (right after the system has all packages installed that it will have).
So far I run pacman with a --hookdir that contains symlinks to /dev/null named like some of the more expensive hooks that tend to take long to complete. But a --no-hooks option to pacman would be great for my use case.
Expensive hooks, like the man-db hook? I find that annoying on my standard system as well, which is why I use https://github.com/graysky2/mandb-ondemand
I'm extremely skeptical that we'll add a --no-hooks option. We have a --noscriptlet option, but that's because there's no other way to stop a scriptlet from doing specific things you don't want done. Hooks were designed to be configurable and able to be masked on an individual level via symlinks. An option to prevent hooks from running would therefore serve no purpose except to say "running hooks at all is undesirable to me", which I don't think is the statement we want to make...
Is it possible to run pacman without it triggering hooks?
You can add a NoExtract directive to pacman.conf, which prevents hooks from being installed to the system in the first place. Although I don't know how you'd determine what hooks should exist, in order to handle their actions yourself.
You can also compile your own pacman package using:
    CPPFLAGS='-DSYSHOOKDIR=\"/i/dont/want/to/run/any/hooks/\"' ./configure
But I don't see any point to this.
--
Eli Schwartz
Bug Wrangler and Trusted User
participants (5)
- Andrew Gregory
- Doug Newgard
- Eli Schwartz
- Jonathon Fernyhough
- Tobias Hunger