[pacman-dev] [PATCH 0/2] Add Architecture information to .BUILDINFO
For reproducible builds, Foxboron and I are working on scripts to reproduce a given package file. Generating archive links to download the same exact packages that were present on the build machine is a crucial part of this. The package names have architecture information appended to them, eg. archlinux-keyring-20180302-1-any instead of archlinux-keyring-20180302-1 so to be able to reliably regenerate the package filename, architecture information is required. The proposed patches add - An additional field, 'arch', to the .BUILDINFO file - The '-$arch' suffix used in package filenames to the packages in the 'installed' array of the .BUILDINFO Robin Broda (2): Add 'arch' field to .BUILDINFO Append '-$arch' to 'installed' array in .BUILDINFO doc/BUILDINFO.5.txt | 5 ++++- scripts/makepkg.sh.in | 13 +++++++++++-- 2 files changed, 15 insertions(+), 3 deletions(-) -- 2.16.2
Signed-off-by: Robin Broda <robin@broda.me> --- doc/BUILDINFO.5.txt | 3 +++ scripts/makepkg.sh.in | 1 + 2 files changed, 4 insertions(+) diff --git a/doc/BUILDINFO.5.txt b/doc/BUILDINFO.5.txt index b7a72831..4734301e 100644 --- a/doc/BUILDINFO.5.txt +++ b/doc/BUILDINFO.5.txt @@ -44,6 +44,9 @@ BUILDINFO file format. *packager*:: The details of the packager that built the package. +*arch*:: + The architecture of the package. + *builddate*:: The build date of the package in epoch. diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index 63b6c3e1..ece53dca 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -688,6 +688,7 @@ write_buildinfo() { write_kv_pair "pkgbuild_sha256sum" $sum write_kv_pair "packager" "${PACKAGER}" + write_kv_pair "arch" "$pkgarch" write_kv_pair "builddate" "${SOURCE_DATE_EPOCH}" write_kv_pair "builddir" "${BUILDDIR}" write_kv_pair "buildenv" "${BUILDENV[@]}" -- 2.16.2
This patch incurs a **severe** performance degradation when generating the .BUILDINFO file, likely due to frequent usage of `pacman -Qi` and `grep -E`. I haven't found a faster way to gather this information. Signed-off-by: Robin Broda <robin@broda.me> --- doc/BUILDINFO.5.txt | 2 +- scripts/makepkg.sh.in | 12 ++++++++++-- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/doc/BUILDINFO.5.txt b/doc/BUILDINFO.5.txt index 4734301e..2c74f9ff 100644 --- a/doc/BUILDINFO.5.txt +++ b/doc/BUILDINFO.5.txt @@ -61,7 +61,7 @@ BUILDINFO file format. *installed (array)*:: The installed packages at build time including the version information of - the package. Formatted as "$pkgname-$pkgver-$pkgrel". + the package. Formatted as "$pkgname-$pkgver-$pkgrel-$pkgarch". See Also -------- diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index ece53dca..10303417 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -694,8 +694,16 @@ write_buildinfo() { write_kv_pair "buildenv" "${BUILDENV[@]}" write_kv_pair "options" "${OPTIONS[@]}" - local pkglist=($(run_pacman -Q | sed "s# #-#")) - write_kv_pair "installed" "${pkglist[@]}" + local pkglist=($(run_pacman -Qq)) + local installed=() + for pkg in "${pkglist[@]}" + do + pkginfo="$(pacman -Qi "${pkg}")" + pkgver="$(grep -E '^Version' <<< "${pkginfo}" | tr -d ' ' | cut -d':' -f2-)" + pkgarch="$(grep -E '^Architecture' <<< "${pkginfo}" | tr -d ' ' | cut -d':' -f2-)" + installed+=("${pkg}-${pkgver}-${pkgarch}") + done + write_kv_pair "installed" "${installed[@]}" } # build a sorted NUL-separated list of the full contents of the current -- 2.16.2
On 03/17/2018 05:24 PM, Robin Broda wrote:
This patch incurs a **severe** performance degradation when generating the .BUILDINFO file, likely due to frequent usage of `pacman -Qi` and `grep -E`. I haven't found a faster way to gather this information.
I'm not sure we can, without adding a dependency on e.g. expac. Which would admittedly be a lot cleaner and we could also remove the existing sed command. The downside is depending on a pretty specific tool. The plus side is ensuring that everyone has a pretty cool tool installed by default... pacman itself does not really give output meant for parsing. ... The alternative, reproduction-wise, is of course to brute force by trying to access both "x86_64" and "any" (it would suffice to check the return code of curl -IfLls -o /dev/null $url). I don't think this is a terrible option, unless Allan is willing to accept expac-based patches... -- Eli Schwartz Bug Wrangler and Trusted User
On 03/17/2018 05:24 PM, Robin Broda wrote:
Signed-off-by: Robin Broda <robin@broda.me> --- doc/BUILDINFO.5.txt | 3 +++ scripts/makepkg.sh.in | 1 + 2 files changed, 4 insertions(+)
diff --git a/doc/BUILDINFO.5.txt b/doc/BUILDINFO.5.txt index b7a72831..4734301e 100644 --- a/doc/BUILDINFO.5.txt +++ b/doc/BUILDINFO.5.txt @@ -44,6 +44,9 @@ BUILDINFO file format. *packager*:: The details of the packager that built the package.
+*arch*:: + The architecture of the package. +
I question the utility of duplicating this information in both .BUILDINFO and .PKGINFO -- Eli Schwartz Bug Wrangler and Trusted User
On 18/03/18 11:22, Eli Schwartz wrote:
On 03/17/2018 05:24 PM, Robin Broda wrote:
Signed-off-by: Robin Broda <robin@broda.me> --- doc/BUILDINFO.5.txt | 3 +++ scripts/makepkg.sh.in | 1 + 2 files changed, 4 insertions(+)
diff --git a/doc/BUILDINFO.5.txt b/doc/BUILDINFO.5.txt index b7a72831..4734301e 100644 --- a/doc/BUILDINFO.5.txt +++ b/doc/BUILDINFO.5.txt @@ -44,6 +44,9 @@ BUILDINFO file format. *packager*:: The details of the packager that built the package.
+*arch*:: + The architecture of the package. +
I question the utility of duplicating this information in both .BUILDINFO and .PKGINFO
There is already a bunch duplicated across those files. The goal was to have everything needed to build the package in a reproducible way in the .BUILDINFO file. I do question why it was output at that point, rather than under the version info where it seems much more natural. A
On 18/03/18 07:24, Robin Broda wrote:
This patch incurs a **severe** performance degradation when generating the .BUILDINFO file, likely due to frequent usage of `pacman -Qi` and `grep -E`. I haven't found a faster way to gather this information.
Signed-off-by: Robin Broda <robin@broda.me>
I will comment on the utility of this patch in another email. This is just pointing out that it is broken...
--- doc/BUILDINFO.5.txt | 2 +- scripts/makepkg.sh.in | 12 ++++++++++-- 2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/doc/BUILDINFO.5.txt b/doc/BUILDINFO.5.txt index 4734301e..2c74f9ff 100644 --- a/doc/BUILDINFO.5.txt +++ b/doc/BUILDINFO.5.txt @@ -61,7 +61,7 @@ BUILDINFO file format.
*installed (array)*:: The installed packages at build time including the version information of - the package. Formatted as "$pkgname-$pkgver-$pkgrel". + the package. Formatted as "$pkgname-$pkgver-$pkgrel-$pkgarch".
See Also -------- diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index ece53dca..10303417 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -694,8 +694,16 @@ write_buildinfo() { write_kv_pair "buildenv" "${BUILDENV[@]}" write_kv_pair "options" "${OPTIONS[@]}"
- local pkglist=($(run_pacman -Q | sed "s# #-#")) - write_kv_pair "installed" "${pkglist[@]}" + local pkglist=($(run_pacman -Qq)) + local installed=() + for pkg in "${pkglist[@]}" + do + pkginfo="$(pacman -Qi "${pkg}")"
What makes this call to pacman not need to use run_pacman like the others?
+ pkgver="$(grep -E '^Version' <<< "${pkginfo}" | tr -d ' ' | cut -d':' -f2-)" + pkgarch="$(grep -E '^Architecture' <<< "${pkginfo}" | tr -d ' ' | cut -d':' -f2-)"
Not every system runs in English...
+ installed+=("${pkg}-${pkgver}-${pkgarch}") + done + write_kv_pair "installed" "${installed[@]}" }
# build a sorted NUL-separated list of the full contents of the current
On 18/03/18 07:23, Robin Broda wrote:
For reproducible builds, Foxboron and I are working on scripts to reproduce a given package file. Generating archive links to download the same exact packages that were present on the build machine is a crucial part of this.
The package names have architecture information appended to them, eg. archlinux-keyring-20180302-1-any instead of archlinux-keyring-20180302-1 so to be able to reliably regenerate the package filename, architecture information is required.
Isn't this being handled elsewhere already - e.g. On the Debian provided reproducible service? How is that being managed without the architecture information? A
On Sun, Mar 18, 2018 at 11:52:01AM +1000, Allan McRae wrote:
On 18/03/18 07:23, Robin Broda wrote:
For reproducible builds, Foxboron and I are working on scripts to reproduce a given package file. Generating archive links to download the same exact packages that were present on the build machine is a crucial part of this.
The package names have architecture information appended to them, eg. archlinux-keyring-20180302-1-any instead of archlinux-keyring-20180302-1 so to be able to reliably regenerate the package filename, architecture information is required.
Isn't this being handled elsewhere already - e.g. On the Debian provided reproducible service? How is that being managed without the architecture information?
A
Their service doesn't care as it tracks SVN directly when building packages. However, when we provide repro tools for our users we need to recreate with `.BUILDINFO`. So we have to pull down packages from the archive. Jelle suggested we just bruteforce it for the time being. But that sucks a little bit. Tool in question: https://github.com/Foxboron/devtools-repro -- Morten Linderud PGP: 9C02FF419FECBE16
On 03/18/18 at 02:56am, Morten Linderud wrote:
On Sun, Mar 18, 2018 at 11:52:01AM +1000, Allan McRae wrote:
On 18/03/18 07:23, Robin Broda wrote:
For reproducible builds, Foxboron and I are working on scripts to reproduce a given package file. Generating archive links to download the same exact packages that were present on the build machine is a crucial part of this.
The package names have architecture information appended to them, eg. archlinux-keyring-20180302-1-any instead of archlinux-keyring-20180302-1 so to be able to reliably regenerate the package filename, architecture information is required.
Isn't this being handled elsewhere already - e.g. On the Debian provided reproducible service? How is that being managed without the architecture information?
A
Their service doesn't care as it tracks SVN directly when building packages. However, when we provide repro tools for our users we need to recreate with `.BUILDINFO`. So we have to pull down packages from the archive. Jelle suggested we just bruteforce it for the time being. But that sucks a little bit.
Tool in question: https://github.com/Foxboron/devtools-repro
alpm does not limit the package architecture; it can contain '-'. If the user has such a package installed, this will result in entries that will be parsed incorrectly.
On Sun, Mar 18, 2018 at 11:40:53AM +1000, Allan McRae wrote:
On 18/03/18 07:24, Robin Broda wrote:
This patch incurs a **severe** performance degradation when generating the .BUILDINFO file, likely due to frequent usage of `pacman -Qi` and `grep -E`. I haven't found a faster way to gather this information.
Signed-off-by: Robin Broda <robin@broda.me>
I will comment on the utility of this patch in another email. This is just pointing out that it is broken...
--- doc/BUILDINFO.5.txt | 2 +- scripts/makepkg.sh.in | 12 ++++++++++-- 2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/doc/BUILDINFO.5.txt b/doc/BUILDINFO.5.txt index 4734301e..2c74f9ff 100644 --- a/doc/BUILDINFO.5.txt +++ b/doc/BUILDINFO.5.txt @@ -61,7 +61,7 @@ BUILDINFO file format.
*installed (array)*:: The installed packages at build time including the version information of - the package. Formatted as "$pkgname-$pkgver-$pkgrel". + the package. Formatted as "$pkgname-$pkgver-$pkgrel-$pkgarch".
See Also -------- diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index ece53dca..10303417 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -694,8 +694,16 @@ write_buildinfo() { write_kv_pair "buildenv" "${BUILDENV[@]}" write_kv_pair "options" "${OPTIONS[@]}"
- local pkglist=($(run_pacman -Q | sed "s# #-#")) - write_kv_pair "installed" "${pkglist[@]}" + local pkglist=($(run_pacman -Qq)) + local installed=() + for pkg in "${pkglist[@]}" + do + pkginfo="$(pacman -Qi "${pkg}")"
What makes this call to pacman not need to use run_pacman like the others?
Answer: run_pacman calls sudo, which means that a bare 'makepkg' will require elevated privileges.
+ pkgver="$(grep -E '^Version' <<< "${pkginfo}" | tr -d ' ' | cut -d':' -f2-)" + pkgarch="$(grep -E '^Architecture' <<< "${pkginfo}" | tr -d ' ' | cut -d':' -f2-)"
Not every system runs in English...
+ installed+=("${pkg}-${pkgver}-${pkgarch}") + done + write_kv_pair "installed" "${installed[@]}" }
# build a sorted NUL-separated list of the full contents of the current
On 05/15/2018 04:46 PM, Dave Reisner wrote:
On Sun, Mar 18, 2018 at 11:40:53AM +1000, Allan McRae wrote:
What makes this call to pacman not need to use run_pacman like the others?
Answer: run_pacman calls sudo, which means that a bare 'makepkg' will require elevated privileges.
Looking at 5698d7b66daa2a0bc99cab7a989cef1c806c3bf6 (1), `run_pacman` was already used for a `pacman -Q` prior to this patch, and looking at makepkg.sh.in l221 (2) it appears like run_pacman is currently whitelisting a handful of options only. This match should probably be improved (1) https://git.archlinux.org/pacman.git/commit/?id=5698d7b66daa2a0bc99cab7a989c... (2) https://git.archlinux.org/pacman.git/tree/scripts/makepkg.sh.in?id=HEAD#n221 -- Rob (coderobe) O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
In commit 5698d7b66daa2a0bc99cab7a989cef1c806c3bf6 a new non-root use of pacman was added -- previously we used -T or -Qq, and run_pacman did not know how to special-case -Qi to skip being prepended with sudo. The result is: -> Generating .BUILDINFO file... ERROR: ld.so: object 'libfakeroot.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored. [sudo] password for eschwartz: -> Adding changelog file... Fix this by using a more generic glob since neither -Q nor -T will ever need sudo or PACMAN_OPTS Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- scripts/makepkg.sh.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index 62b912e3..e9080a70 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -220,12 +220,12 @@ missing_source_file() { run_pacman() { local cmd - if [[ $1 != -@(T|Qq) ]]; then + if [[ $1 != -@(T|Q)*([[:alpha:]]) ]]; then cmd=("$PACMAN_PATH" "${PACMAN_OPTS[@]}" "$@") else cmd=("$PACMAN_PATH" "$@") fi - if [[ $1 != -@(T|Qq|Q) ]]; then + if [[ $1 != -@(T|Q)*([[:alpha:]]) ]]; then if type -p sudo >/dev/null; then cmd=(sudo "${cmd[@]}") else -- 2.17.0
On Tue, May 15, 2018 at 11:13:09AM -0400, Eli Schwartz wrote:
In commit 5698d7b66daa2a0bc99cab7a989cef1c806c3bf6 a new non-root use of pacman was added -- previously we used -T or -Qq, and run_pacman did not know how to special-case -Qi to skip being prepended with sudo.
Can we just be explicit about when we do and don't need elevated privileges rather than trying to guess? Surely the caller knows the requirements a priori.
The result is:
-> Generating .BUILDINFO file... ERROR: ld.so: object 'libfakeroot.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored. [sudo] password for eschwartz: -> Adding changelog file...
Fix this by using a more generic glob since neither -Q nor -T will ever need sudo or PACMAN_OPTS
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- scripts/makepkg.sh.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index 62b912e3..e9080a70 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -220,12 +220,12 @@ missing_source_file() {
run_pacman() { local cmd - if [[ $1 != -@(T|Qq) ]]; then + if [[ $1 != -@(T|Q)*([[:alpha:]]) ]]; then cmd=("$PACMAN_PATH" "${PACMAN_OPTS[@]}" "$@") else cmd=("$PACMAN_PATH" "$@") fi - if [[ $1 != -@(T|Qq|Q) ]]; then + if [[ $1 != -@(T|Q)*([[:alpha:]]) ]]; then if type -p sudo >/dev/null; then cmd=(sudo "${cmd[@]}") else -- 2.17.0
On 16/05/18 01:46, Dave Reisner wrote:
On Tue, May 15, 2018 at 11:13:09AM -0400, Eli Schwartz wrote:
In commit 5698d7b66daa2a0bc99cab7a989cef1c806c3bf6 a new non-root use of pacman was added -- previously we used -T or -Qq, and run_pacman did not know how to special-case -Qi to skip being prepended with sudo.
Can we just be explicit about when we do and don't need elevated privileges rather than trying to guess? Surely the caller knows the requirements a priori.
Are you thinking two functions? One for root, one not? Or a helper function as_root() that does the su/sudo part? Or a function parameter for root usage? Either way, I think the current patch is fine for 5.1. Bigger change can happen later. A
On 05/16/2018 12:41 AM, Allan McRae wrote:
On 16/05/18 01:46, Dave Reisner wrote:
On Tue, May 15, 2018 at 11:13:09AM -0400, Eli Schwartz wrote:
In commit 5698d7b66daa2a0bc99cab7a989cef1c806c3bf6 a new non-root use of pacman was added -- previously we used -T or -Qq, and run_pacman did not know how to special-case -Qi to skip being prepended with sudo.
Can we just be explicit about when we do and don't need elevated privileges rather than trying to guess? Surely the caller knows the requirements a priori.
Are you thinking two functions? One for root, one not? Or a helper function as_root() that does the su/sudo part? Or a function parameter for root usage?
Either way, I think the current patch is fine for 5.1. Bigger change can happen later.
My patch keeps to the spirit of how things are currently done, but for the future, we should probably consider why it's designed the way it is now. The function does three things: - resolve "pacman" to $PACMAN - interpolate options like --noprogressbar, --noconfirm, and --color= - heuristically guess whether to try elevating to root. #3 is currently broken. #2 is interesting because we sometimes don't do it. Is there a reason for this? dreisner initially suggested something along the lines of https://paste.xinu.at/ldn/ with two functions, though as I pointed out then, the function should be renamed to not clash with "pacman" itself. An as_root helper should not be necessary, we don't need sudo for anything except pacman syncing deps or installing built artifacts, so we'd still need to have a run_pacman for #'s 1 & 2, which is quite indirect. I don't favor the parameter-based approach... seems awkward to call it. -- Eli Schwartz Bug Wrangler and Trusted User
participants (7)
-
Allan McRae
-
Andrew Gregory
-
Dave Reisner
-
Eli Schwartz
-
Morten Linderud
-
Robin Broda
-
Robin Broda