During a full system update of a Manjaro ARM installation, on an
ARM-based computer, of course, I spotted a highly conspicuous error
message in the output of pacman:

    ( 3/18) Creating temporary files...
    Assertion 'fd' failed at src/tmpfiles/tmpfiles.c:843, function fd_set_perms(). Aborting.
    /usr/share/libalpm/scripts/systemd-hook: line 28: 1735 Aborted (core dumped) /usr/bin/systemd-tmpfiles --create
    error: command failed to execute correctly

Here's also an excerpt from /var/log/pacman.log (the stack trace was
rather unusable, so I'll skip it for the sake of brevity):

    [ALPM] running '30-systemd-tmpfiles.hook'...
    [ALPM-SCRIPTLET] Assertion 'fd' failed at src/tmpfiles/tmpfiles.c:843, function fd_set_perms(). Aborting.
    [ALPM-SCRIPTLET] /usr/share/libalpm/scripts/systemd-hook: line 28: 1735 Aborted (core dumped) /usr/bin/systemd-tmpfiles --create

Running "systemd-tmpfiles --create" manually afterwards resulted in no
errors, which made the error message even more conspicuous.  Thus,
executing /usr/share/libalpm/hooks/30-systemd-tmpfiles.hook failed, but
only when it was run from within pacman.

I also saw a few people complaining about the same error message on
Manjaro and Arch Linux forums, with one complaint dating back more than
a few years, but nobody came through with a fix or workaround.

After a detailed and rather lengthy investigation, it turned out that
the root cause was twofold, as described below:

1) For its pacman package, Arch Linux ARM applies a patch named
   0003-Revert-alpm_run_chroot-always-connect-parent2child-p.patch that
   reverts rather old pacman commit 1d6583a5, for an unknown reason.
   This patch causes the error like clockwork.

2) The code in pacman's lib/libalpm/util.c that executes the hooks by
   forking a child has some rather subtle bugs that allow the error to
   occur under certain circumstances.

Regarding the first point, the patch from Arch Linux ARM creates a
condition in which file descriptor 0 is closed by calling close(0) and
left closed when the executed hook has no "NeedsTargets" option
specified, which is the case for the hook mentioned above,
/usr/share/libalpm/hooks/30-systemd-tmpfiles.hook.  As a result, the
first call to open() during the execution of the hook returns 0 as the
file descriptor, because 0 is the lowest currently available value.  It
would all go unnoticed, but systemd-tmpfiles performed assert(fd) checks
in its file src/tmpfiles/tmpfiles.c, which failed because fd equaled 0.
These checks seem to have been removed in the meantime, which
effectively made the error go away, but the original issue still
remains.

Regarding the second point, function _alpm_run_chroot() that executes
hooks in a fork()ed child does not execute dup2() properly, but instead
executes close() followed by dup2().  The man page for dup2() clearly
states that such attempts to re-implement the equivalent functionality
must be avoided, as visible in this quotation:

    The dup2() system call performs the same task as dup(), but instead
    of using the lowest-numbered unused file descriptor, it uses the
    file descriptor number specified in newfd.  In other words, the file
    descriptor newfd is adjusted so that it now refers to the same open
    file description as oldfd.

    If the file descriptor newfd was previously open, it is closed
    before being reused; the close is performed silently (i.e., any
    errors during the close are not reported by dup2()).

    The steps of closing and reusing the file descriptor newfd are
    performed atomically.  This is important, because trying to
    implement equivalent functionality using close(2) and dup() would be
    subject to race conditions, whereby newfd might be reused between
    the two steps.  Such reuse could happen because the main program is
    interrupted by a signal handler that allocates a file descriptor, or
    because a parallel thread allocates a file descriptor.

As a result, a condition can occur in which file descriptor 0 is closed
by calling close(0) and then left closed, because the while loop stops
retrying dup2() as soon as it fails with an error other than EINTR, such
as EBUSY, resulting in the original issue.  On top of that, failed
attempts to execute dup2() should be treated as fatal errors instead of
being silently ignored.

Let's improve the code to prevent the issues described in the second
point.  The issues described in the first point are fixed by not
applying the above-mentioned Arch Linux ARM package patch.

While there, perform a minor cleanup as well, to make the formatting of
the code a tiny bit more consistent.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---
 lib/libalpm/util.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c
index dffa3b51..97e87c6c 100644
--- a/lib/libalpm/util.c
+++ b/lib/libalpm/util.c
@@ -639,28 +639,39 @@ int _alpm_run_chroot(alpm_handle_t *handle, const char *cmd, char *const argv[],
 
 	if(pid == 0) {
 		/* this code runs for the child only (the actual chroot/exec) */
-		close(0);
-		close(1);
-		close(2);
-		while(dup2(child2parent_pipefd[HEAD], 1) == -1 && errno == EINTR);
-		while(dup2(child2parent_pipefd[HEAD], 2) == -1 && errno == EINTR);
-		while(dup2(parent2child_pipefd[TAIL], 0) == -1 && errno == EINTR);
-		close(parent2child_pipefd[TAIL]);
 		close(parent2child_pipefd[HEAD]);
 		close(child2parent_pipefd[TAIL]);
+		while(dup2(child2parent_pipefd[HEAD], STDERR_FILENO) == -1) {
+			if(errno != EINTR) {
+				/* at this point, the child cannot talk through the parent */
+				exit(1);
+			}
+		}
+		while(dup2(parent2child_pipefd[TAIL], STDIN_FILENO) == -1) {
+			if(errno != EINTR) {
+				/* use fprintf() instead of _alpm_log() to send output through the parent */
+				fprintf(stderr,
+						_("could not redirect standard input (%s)\n"), strerror(errno));
+				exit(1);
+			}
+		}
+		close(parent2child_pipefd[TAIL]);
+		while(dup2(child2parent_pipefd[HEAD], STDOUT_FILENO) == -1) {
+			if(errno != EINTR) {
+				fprintf(stderr, _("could not redirect standard output (%s)\n"), strerror(errno));
+				exit(1);
+			}
+		}
 		close(child2parent_pipefd[HEAD]);
 		if(cwdfd >= 0) {
 			close(cwdfd);
 		}
 
-		/* use fprintf instead of _alpm_log to send output through the parent */
 		if(chroot(handle->root) != 0) {
 			fprintf(stderr, _("could not change the root directory (%s)\n"), strerror(errno));
 			exit(1);
 		}
 		if(chdir("/") != 0) {
-			fprintf(stderr, _("could not change directory to %s (%s)\n"),
-					"/", strerror(errno));
+			fprintf(stderr, _("could not change directory to %s (%s)\n"), "/", strerror(errno));
 			exit(1);
 		}
 		/* bash assumes it's being run under rsh/ssh if stdin is a socket and
-- 
2.33.1