[PATCH 0/1] Use debugedit instead of AWK to parse source files
From: Morten Linderud <morten@linderud.pw> This utilizes the debugedit project which was split out from RPM fairly recently. It's been used in things like portage and hopefully gives us better support for debug packages in other languages then just C/C++ https://sourceware.org/debugedit/ I have tested this one a Go project and compared with the previously generated pacman debug packages. Everything seems fine and gdb works for C, and delve works for the Go project. Preferaly some more testing would be needed. Morten Linderud (1): strip: Use debugedit instead of AWK to parse source files scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) -- 2.34.1
From: Morten Linderud <morten@linderud.pw> This moves us from the fairly ugly AWK parsing line to debugedit which originally comes out of the rpm project. The original code has issues parsing anything that was not straight C/C++ and languages like Rust or Go would return invalid source code files. debugedit handles all these cases better. Signed-off-by: Morten Linderud <morten@linderud.pw> --- scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { } source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" + LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null + sort -zu "${dbgsrclist}" | tr '\0' '\n' + rm -f "$dbgsrclist" } strip_file() { @@ -58,9 +61,9 @@ strip_file() { # copy source files to debug directory local file dest t while IFS= read -r t; do - file=${t/${dbgsrcdir}/"$srcdir"} - dest="${dbgsrc/"$dbgsrcdir"/}$t" - if ! [[ -f $dest ]]; then + file="${srcdir}/${t}" + dest="${dbgsrc}/${t}" + if [[ -f "$file" ]] && ! [[ -f $dest ]]; then mkdir -p "${dest%/*}" cp -- "$file" "$dest" fi -- 2.34.1
On 2/1/22 10:28, Morten Linderud wrote:
From: Morten Linderud <morten@linderud.pw>
This moves us from the fairly ugly AWK parsing line to debugedit which originally comes out of the rpm project.
The original code has issues parsing anything that was not straight C/C++ and languages like Rust or Go would return invalid source code files. debugedit handles all these cases better.
Signed-off-by: Morten Linderud <morten@linderud.pw> ---
I'm really happy this was split into a separate project! The original plan was to use debugedit, but it could not be build without (from memory) depending on librpm, which did not seem appropriate! Add a check for debugedit in scripts/libmakepkg/executable/strip.sh.in
scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { }
source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")"
I really do not like making temporary files. Particularly not in ${startdir}, which can be readonly provided BUILDDIR, SRCDIR, etc are set. I may accept using $srcdir if writing this to a file is *essential*.
+ LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null + sort -zu "${dbgsrclist}" | tr '\0' '\n' > + rm -f "$dbgsrclist" }
strip_file() { @@ -58,9 +61,9 @@ strip_file() { # copy source files to debug directory local file dest t while IFS= read -r t; do - file=${t/${dbgsrcdir}/"$srcdir"} - dest="${dbgsrc/"$dbgsrcdir"/}$t" - if ! [[ -f $dest ]]; then + file="${srcdir}/${t}" + dest="${dbgsrc}/${t}" + if [[ -f "$file" ]] && ! [[ -f $dest ]]; then mkdir -p "${dest%/*}" cp -- "$file" "$dest" fi
On 02/01/2022 06.08, Allan McRae wrote:
On 2/1/22 10:28, Morten Linderud wrote:
scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { } source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")"
I really do not like making temporary files. Particularly not in ${startdir}, which can be readonly provided BUILDDIR, SRCDIR, etc are set.
I may accept using $srcdir if writing this to a file is *essential*>
+ LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null + sort -zu "${dbgsrclist}" | tr '\0' '\n' + rm -f "$dbgsrclist"
I haven't been able to produce any unrelated output on stdout, so just using /dev/stdout for --list-file would probably work ok - to make extra sure, this also works: debugedit --list-file=/dev/fd/3 "$1" 3>&1 >/dev/null | sort [...] Tangentially related, what's the opinion on using short vs. long options? Personally, I find that long options make scripts much easier to grok because they save a lot of looking at man pages/--help outputs, but I can also see how they might make things too crowded and busy, especially for common commands. Either way, might be something to include in HACKING.
} strip_file() { @@ -58,9 +61,9 @@ strip_file() { # copy source files to debug directory local file dest t while IFS= read -r t; do - file=${t/${dbgsrcdir}/"$srcdir"} - dest="${dbgsrc/"$dbgsrcdir"/}$t" - if ! [[ -f $dest ]]; then + file="${srcdir}/${t}" + dest="${dbgsrc}/${t}" + if [[ -f "$file" ]] && ! [[ -f $dest ]]; then mkdir -p "${dest%/*}" cp -- "$file" "$dest" fi
On 02/01/2022 01.28, Morten Linderud wrote:
From: Morten Linderud <morten@linderud.pw>
This moves us from the fairly ugly AWK parsing line to debugedit which originally comes out of the rpm project.
The original code has issues parsing anything that was not straight C/C++ and languages like Rust or Go would return invalid source code files. debugedit handles all these cases better.
Signed-off-by: Morten Linderud <morten@linderud.pw> --- scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { }
source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" + LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null
-d/--dest-dir actually causes paths embedded in the binary to be *rewritten* from -b/--base-dir. Normally, this shouldn't happen because all paths are already translated by -fdebug-prefix-map before they end up in the binary (making it equivalent to `debugedit --base-dir "${dbgsrcdir}"`) but if there are any $srcdir-based paths left for whatever reason, this modifies the binary. If this is something we actually want to do here, I think this behaviour is at least worth a source comment.
+ sort -zu "${dbgsrclist}" | tr '\0' '\n' + rm -f "$dbgsrclist" }
strip_file() { @@ -58,9 +61,9 @@ strip_file() { # copy source files to debug directory local file dest t while IFS= read -r t; do - file=${t/${dbgsrcdir}/"$srcdir"} - dest="${dbgsrc/"$dbgsrcdir"/}$t" - if ! [[ -f $dest ]]; then + file="${srcdir}/${t}" + dest="${dbgsrc}/${t}" + if [[ -f "$file" ]] && ! [[ -f $dest ]]; then mkdir -p "${dest%/*}" cp -- "$file" "$dest" fi
On 2/1/22 23:34, Xiretza wrote:
On 02/01/2022 01.28, Morten Linderud wrote:
From: Morten Linderud <morten@linderud.pw>
This moves us from the fairly ugly AWK parsing line to debugedit which originally comes out of the rpm project.
The original code has issues parsing anything that was not straight C/C++ and languages like Rust or Go would return invalid source code files. debugedit handles all these cases better.
Signed-off-by: Morten Linderud <morten@linderud.pw> --- scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { } source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" + LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null
-d/--dest-dir actually causes paths embedded in the binary to be *rewritten* from -b/--base-dir. Normally, this shouldn't happen because all paths are already translated by -fdebug-prefix-map before they end up in the binary (making it equivalent to `debugedit --base-dir "${dbgsrcdir}"`) but if there are any $srcdir-based paths left for whatever reason, this modifies the binary.
If this is something we actually want to do here, I think this behaviour is at least worth a source comment.
Does it? During testing the patch I added checksums before and after getting the source file list. This command causes no change to the file. A
On 02/01/2022 14.44, Allan McRae wrote:
On 2/1/22 23:34, Xiretza wrote:
On 02/01/2022 01.28, Morten Linderud wrote:
From: Morten Linderud <morten@linderud.pw>
This moves us from the fairly ugly AWK parsing line to debugedit which originally comes out of the rpm project.
The original code has issues parsing anything that was not straight C/C++ and languages like Rust or Go would return invalid source code files. debugedit handles all these cases better.
Signed-off-by: Morten Linderud <morten@linderud.pw> --- scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { } source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" + LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null
-d/--dest-dir actually causes paths embedded in the binary to be *rewritten* from -b/--base-dir. Normally, this shouldn't happen because all paths are already translated by -fdebug-prefix-map before they end up in the binary (making it equivalent to `debugedit --base-dir "${dbgsrcdir}"`) but if there are any $srcdir-based paths left for whatever reason, this modifies the binary.
If this is something we actually want to do here, I think this behaviour is at least worth a source comment.
Does it? During testing the patch I added checksums before and after getting the source file list. This command causes no change to the file.
A
It does for me: $ pwd /tmp/srcdir $ cat main.c int main() {return 0;} $ gcc -g main.c $ debugedit -l /dev/stdout a.out | tr '\0' '\n' /tmp/srcdir/main.c /tmp/srcdir/main.c $ sha256sum a.out 893824faaf62af0cdd5c23883e45b0bd6f1f526fbe5aa89d357c9acd34bbce3e a.out $ debugedit -l /dev/stdout --base-dir /tmp/srcdir --dest-dir /tmp/dbgsrcdir a.out | tr '\0' '\n' main.c main.c $ sha256sum a.out 6e4eebd6863150cb6568d99335ff073a5ddd9585e22c5de5c4dd159e5b6bc0b2 a.out $ debugedit -l /dev/stdout a.out | tr '\0' '\n' /tmp/dbgsrcdir/main.c /tmp/dbgsrcdir/main.c
On 2/1/22 23:51, Xiretza wrote:
On 02/01/2022 14.44, Allan McRae wrote:
On 2/1/22 23:34, Xiretza wrote:
On 02/01/2022 01.28, Morten Linderud wrote:
From: Morten Linderud <morten@linderud.pw>
This moves us from the fairly ugly AWK parsing line to debugedit which originally comes out of the rpm project.
The original code has issues parsing anything that was not straight C/C++ and languages like Rust or Go would return invalid source code files. debugedit handles all these cases better.
Signed-off-by: Morten Linderud <morten@linderud.pw> --- scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { } source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" + LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null
-d/--dest-dir actually causes paths embedded in the binary to be *rewritten* from -b/--base-dir. Normally, this shouldn't happen because all paths are already translated by -fdebug-prefix-map before they end up in the binary (making it equivalent to `debugedit --base-dir "${dbgsrcdir}"`) but if there are any $srcdir-based paths left for whatever reason, this modifies the binary.
If this is something we actually want to do here, I think this behaviour is at least worth a source comment.
Does it? During testing the patch I added checksums before and after getting the source file list. This command causes no change to the file.
A
It does for me:
$ pwd /tmp/srcdir $ cat main.c int main() {return 0;} $ gcc -g main.c $ debugedit -l /dev/stdout a.out | tr '\0' '\n' /tmp/srcdir/main.c /tmp/srcdir/main.c
$ sha256sum a.out 893824faaf62af0cdd5c23883e45b0bd6f1f526fbe5aa89d357c9acd34bbce3e a.out $ debugedit -l /dev/stdout --base-dir /tmp/srcdir --dest-dir /tmp/dbgsrcdir a.out | tr '\0' '\n' main.c main.c $ sha256sum a.out 6e4eebd6863150cb6568d99335ff073a5ddd9585e22c5de5c4dd159e5b6bc0b2 a.out
$ debugedit -l /dev/stdout a.out | tr '\0' '\n' /tmp/dbgsrcdir/main.c /tmp/dbgsrcdir/main.c .
Add -n. Allan
On 3/1/22 00:20, Allan McRae wrote:
On 2/1/22 23:51, Xiretza wrote:
On 02/01/2022 14.44, Allan McRae wrote:
On 2/1/22 23:34, Xiretza wrote:
On 02/01/2022 01.28, Morten Linderud wrote:
From: Morten Linderud <morten@linderud.pw>
This moves us from the fairly ugly AWK parsing line to debugedit which originally comes out of the rpm project.
The original code has issues parsing anything that was not straight C/C++ and languages like Rust or Go would return invalid source code files. debugedit handles all these cases better.
Signed-off-by: Morten Linderud <morten@linderud.pw> --- scripts/libmakepkg/tidy/strip.sh.in | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/scripts/libmakepkg/tidy/strip.sh.in b/scripts/libmakepkg/tidy/strip.sh.in index 92a6fb15..c1d8ee3c 100644 --- a/scripts/libmakepkg/tidy/strip.sh.in +++ b/scripts/libmakepkg/tidy/strip.sh.in @@ -36,8 +36,11 @@ build_id() { } source_files() { - LANG=C readelf "$1" --debug-dump 2>/dev/null | \ - awk '/DW_AT_name +:/{name=$NF}/DW_AT_comp_dir +:/{{if (name == "<artificial>") next}{if (name !~ /^[<\/]/) {printf "%s/", $NF}}{print name}}' + dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" + local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" + LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null
-d/--dest-dir actually causes paths embedded in the binary to be *rewritten* from -b/--base-dir. Normally, this shouldn't happen because all paths are already translated by -fdebug-prefix-map before they end up in the binary (making it equivalent to `debugedit --base-dir "${dbgsrcdir}"`) but if there are any $srcdir-based paths left for whatever reason, this modifies the binary.
If this is something we actually want to do here, I think this behaviour is at least worth a source comment.
Does it? During testing the patch I added checksums before and after getting the source file list. This command causes no change to the file.
A
It does for me:
$ pwd /tmp/srcdir $ cat main.c int main() {return 0;} $ gcc -g main.c $ debugedit -l /dev/stdout a.out | tr '\0' '\n' /tmp/srcdir/main.c /tmp/srcdir/main.c
$ sha256sum a.out 893824faaf62af0cdd5c23883e45b0bd6f1f526fbe5aa89d357c9acd34bbce3e a.out $ debugedit -l /dev/stdout --base-dir /tmp/srcdir --dest-dir /tmp/dbgsrcdir a.out | tr '\0' '\n' main.c main.c $ sha256sum a.out 6e4eebd6863150cb6568d99335ff073a5ddd9585e22c5de5c4dd159e5b6bc0b2 a.out
$ debugedit -l /dev/stdout a.out | tr '\0' '\n' /tmp/dbgsrcdir/main.c /tmp/dbgsrcdir/main.c .
Add -n.
For a better example, here is my testing code: source_files() { dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" echo $1 >> $startdir/dbginfo echo sha256sum-orig: $(sha256sum $1) >> $startdir/dbginfo LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null sort -zu "${dbgsrclist}" | tr '\0' '\n' sort -zu "${dbgsrclist}" | tr '\0' '\n' >> $startdir/dbginfo echo sha256sum-after: $(sha256sum $1) >> $startdir/dbginfo rm -f "$dbgsrclist" } and a snippet of the output: ./usr/bin/vercmp sha256sum-orig: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp pacman/builddir/<artificial> pacman/builddir/<built-in> pacman/lib/libalpm/version.c pacman/src/util/vercmp.c sha256sum-after: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp
On 02/01/2022 15.24, Allan McRae wrote:
On 3/1/22 00:20, Allan McRae wrote:
Add -n.
No change.
For a better example, here is my testing code:
source_files() { dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" echo $1 >> $startdir/dbginfo echo sha256sum-orig: $(sha256sum $1) >> $startdir/dbginfo LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null sort -zu "${dbgsrclist}" | tr '\0' '\n' sort -zu "${dbgsrclist}" | tr '\0' '\n' >> $startdir/dbginfo echo sha256sum-after: $(sha256sum $1) >> $startdir/dbginfo rm -f "$dbgsrclist" }
and a snippet of the output:
./usr/bin/vercmp sha256sum-orig: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp pacman/builddir/<artificial> pacman/builddir/<built-in> pacman/lib/libalpm/version.c pacman/src/util/vercmp.c sha256sum-after: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp
Is it possible that there are simply no source file entries referencing $srcdir (because -fdebug-prefix-map is working as expected)? As I said, if that's the case, the binary is not modified because there are no occurrences of $srcdir to be rewritten to $dbgsrcdir.
On 3/1/22 00:29, Xiretza wrote:
On 02/01/2022 15.24, Allan McRae wrote:
On 3/1/22 00:20, Allan McRae wrote:
Add -n.
No change.
For a better example, here is my testing code:
source_files() { dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" echo $1 >> $startdir/dbginfo echo sha256sum-orig: $(sha256sum $1) >> $startdir/dbginfo LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null sort -zu "${dbgsrclist}" | tr '\0' '\n' sort -zu "${dbgsrclist}" | tr '\0' '\n' >> $startdir/dbginfo echo sha256sum-after: $(sha256sum $1) >> $startdir/dbginfo rm -f "$dbgsrclist" }
and a snippet of the output:
./usr/bin/vercmp sha256sum-orig: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp pacman/builddir/<artificial> pacman/builddir/<built-in> pacman/lib/libalpm/version.c pacman/src/util/vercmp.c sha256sum-after: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp
Is it possible that there are simply no source file entries referencing $srcdir (because -fdebug-prefix-map is working as expected)? As I said, if that's the case, the binary is not modified because there are no occurrences of $srcdir to be rewritten to $dbgsrcdir.
That is possible - I mostly tested packages I know obey CFLAGS. Saying that, any change would be a good thing, or else those source files placed in ${dbgsrcdir} are kindof useless! So I'm happy to have this rewrite paths - we are not regenerating the build-id, so I'm assuming this does not cause package reproducibility issues... Also, I get a very different list of files with and without -b/-d. Without using them I get a lot of system files. With, I just get the package source files. A
On Mon, Jan 03, 2022 at 12:48:05AM +1000, Allan McRae wrote:
On 3/1/22 00:29, Xiretza wrote:
On 02/01/2022 15.24, Allan McRae wrote:
On 3/1/22 00:20, Allan McRae wrote:
Add -n.
No change.
For a better example, here is my testing code:
source_files() { dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" echo $1 >> $startdir/dbginfo echo sha256sum-orig: $(sha256sum $1) >> $startdir/dbginfo LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null sort -zu "${dbgsrclist}" | tr '\0' '\n' sort -zu "${dbgsrclist}" | tr '\0' '\n' >> $startdir/dbginfo echo sha256sum-after: $(sha256sum $1) >> $startdir/dbginfo rm -f "$dbgsrclist" }
and a snippet of the output:
./usr/bin/vercmp sha256sum-orig: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp pacman/builddir/<artificial> pacman/builddir/<built-in> pacman/lib/libalpm/version.c pacman/src/util/vercmp.c sha256sum-after: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp
Is it possible that there are simply no source file entries referencing $srcdir (because -fdebug-prefix-map is working as expected)? As I said, if that's the case, the binary is not modified because there are no occurrences of $srcdir to be rewritten to $dbgsrcdir.
That is possible - I mostly tested packages I know obey CFLAGS. Saying that, any change would be a good thing, or else those source files placed in ${dbgsrcdir} are kindof useless! So I'm happy to have this rewrite paths - we are not regenerating the build-id, so I'm assuming this does not cause package reproducibility issues...
Also, I get a very different list of files with and without -b/-d. Without using them I get a lot of system files. With, I just get the package source files.
A
Xiretza is correct. It will rewrite the Go binaries and add correct paths since we do not have a prefix-map option in the Go compiler. However it won't change anything that utilizies prefix-map correctly. There might be a catch here, but I haven't found it. -- Morten Linderud PGP: 9C02FF419FECBE16
On 02/01/2022 15.48, Allan McRae wrote:
On 3/1/22 00:29, Xiretza wrote:
On 02/01/2022 15.24, Allan McRae wrote:
On 3/1/22 00:20, Allan McRae wrote:
Add -n.
No change.
For a better example, here is my testing code:
source_files() { dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" echo $1 >> $startdir/dbginfo echo sha256sum-orig: $(sha256sum $1) >> $startdir/dbginfo LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null sort -zu "${dbgsrclist}" | tr '\0' '\n' sort -zu "${dbgsrclist}" | tr '\0' '\n' >> $startdir/dbginfo echo sha256sum-after: $(sha256sum $1) >> $startdir/dbginfo rm -f "$dbgsrclist" }
and a snippet of the output:
./usr/bin/vercmp sha256sum-orig: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp pacman/builddir/<artificial> pacman/builddir/<built-in> pacman/lib/libalpm/version.c pacman/src/util/vercmp.c sha256sum-after: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp
Is it possible that there are simply no source file entries referencing $srcdir (because -fdebug-prefix-map is working as expected)? As I said, if that's the case, the binary is not modified because there are no occurrences of $srcdir to be rewritten to $dbgsrcdir.
That is possible - I mostly tested packages I know obey CFLAGS. Saying that, any change would be a good thing, or else those source files placed in ${dbgsrcdir} are kindof useless! So I'm happy to have this rewrite paths - we are not regenerating the build-id, so I'm assuming this does not cause package reproducibility issues...
Makes sense, I can't think of any issues this would cause either - the source file paths have to be deterministic anyway, applying a deterministic transform on top won't change that. Still, having a function called "source_files" actually modify the passed binary deserves a comment, I think.
Also, I get a very different list of files with and without -b/-d. Without using them I get a lot of system files. With, I just get the package source files.
Right, without -b/-d, it just lists all source files. With just -b, it prints all source files rooted in the specified directory, but strips the prefix from the output. With both -b and -d, it replaces any base-dir prefixes with dest-dir (modifying the binary), then prints all paths rooted in the dest-dir (again, stripping the dest-dir prefix in the output), regardless if they've always been rooted in dest-dir (due to -fdebug-prefix-map) or if they were just transformed from base-dir. I hope that made sense.
A
On 3/1/22 01:00, Xiretza wrote:
On 02/01/2022 15.48, Allan McRae wrote:
On 3/1/22 00:29, Xiretza wrote:
On 02/01/2022 15.24, Allan McRae wrote:
On 3/1/22 00:20, Allan McRae wrote:
Add -n.
No change.
For a better example, here is my testing code:
source_files() { dbgsrcdir="${DBGSRCDIR:-/usr/src/debug}" local dbgsrclist="$(mktemp "${startdir}/dbgsrclist.${binary##*/}.XXXXXXXXX")" echo $1 >> $startdir/dbginfo echo sha256sum-orig: $(sha256sum $1) >> $startdir/dbginfo LANG=C debugedit -n -b "${srcdir}" -d "${dbgsrcdir}" -l "${dbgsrclist}" "$1" > /dev/null sort -zu "${dbgsrclist}" | tr '\0' '\n' sort -zu "${dbgsrclist}" | tr '\0' '\n' >> $startdir/dbginfo echo sha256sum-after: $(sha256sum $1) >> $startdir/dbginfo rm -f "$dbgsrclist" }
and a snippet of the output:
./usr/bin/vercmp sha256sum-orig: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp pacman/builddir/<artificial> pacman/builddir/<built-in> pacman/lib/libalpm/version.c pacman/src/util/vercmp.c sha256sum-after: 844e2a18277df5d46544fc977a028b02b58d642bc9754d7d9868197d23f42407 ./usr/bin/vercmp
Is it possible that there are simply no source file entries referencing $srcdir (because -fdebug-prefix-map is working as expected)? As I said, if that's the case, the binary is not modified because there are no occurrences of $srcdir to be rewritten to $dbgsrcdir.
That is possible - I mostly tested packages I know obey CFLAGS. Saying that, any change would be a good thing, or else those source files placed in ${dbgsrcdir} are kindof useless! So I'm happy to have this rewrite paths - we are not regenerating the build-id, so I'm assuming this does not cause package reproducibility issues...
Makes sense, I can't think of any issues this would cause either - the source file paths have to be deterministic anyway, applying a deterministic transform on top won't change that.
Still, having a function called "source_files" actually modify the passed binary deserves a comment, I think.
Alternatively, happy for the function to be renamed something better. A
On Sun, Jan 02, 2022 at 04:00:56PM +0100, Xiretza wrote:
On 02/01/2022 15.48, Allan McRae wrote:
Also, I get a very different list of files with and without -b/-d. Without using them I get a lot of system files. With, I just get the package source files.
Right, without -b/-d, it just lists all source files. With just -b, it prints all source files rooted in the specified directory, but strips the prefix from the output. With both -b and -d, it replaces any base-dir prefixes with dest-dir (modifying the binary), then prints all paths rooted in the dest-dir (again, stripping the dest-dir prefix in the output), regardless if they've always been rooted in dest-dir (due to -fdebug-prefix-map) or if they were just transformed from base-dir.
I hope that made sense.
We can have one function `rewrite_source_list` and one `source_list`? `rewrite_source_list` would do the explicit rewrite, if needed. `source_list` just uses --base-dir and returns the files. -- Morten Linderud PGP: 9C02FF419FECBE16
On 02/01/2022 16.18, Morten Linderud wrote:
On Sun, Jan 02, 2022 at 04:00:56PM +0100, Xiretza wrote:
On 02/01/2022 15.48, Allan McRae wrote:
Also, I get a very different list of files with and without -b/-d. Without using them I get a lot of system files. With, I just get the package source files.
Right, without -b/-d, it just lists all source files. With just -b, it prints all source files rooted in the specified directory, but strips the prefix from the output. With both -b and -d, it replaces any base-dir prefixes with dest-dir (modifying the binary), then prints all paths rooted in the dest-dir (again, stripping the dest-dir prefix in the output), regardless if they've always been rooted in dest-dir (due to -fdebug-prefix-map) or if they were just transformed from base-dir.
I hope that made sense.
We can have one function `rewrite_source_list` and one `source_list`?
`rewrite_source_list` would do the explicit rewrite, if needed. `source_list` just uses --base-dir and returns the files.
How would it know which one to run? I don't think it's worth the hassle of figuring that out, it's probably fine as-is.
participants (3)
-
Allan McRae
-
Morten Linderud
-
Xiretza