[pacman-dev] [PATCH] makepkg: Include more source files in debug packages.

Austin Lund austin.lund at gmail.com
Fri Feb 21 01:47:04 UTC 2020

On Thu, 20 Feb 2020 at 13:41, Eli Schwartz <eschwartz at archlinux.org> wrote:
> On 2/19/20 8:51 PM, Austin Lund wrote:
> > Currently only the file pointed to by the DW_AT_name is included as a source
> > file in debug packages.  This means many files that are useful for debugging are
> > not included.  For example, no header files are included, yet these may be
> > referenced in the .debug_line section.
> Why do we need headers? Headers by definition are supposed to declare
> things that are then defined by a source file which is in DW_AT_name.

Some function definitions live in header files.  For example, people
often define short inline functions in headers.  The line debugging
information then points into those header files.

Just a random example:

gdb ls
b initialize_exit_failure
r does_not_exist

A code listing in GDB should point to the inline function definition
in system.h in coreutils.  But currently that file is missing from a
built coreutils-debug package.

> > This sed script converts into shell variables the debug dump information from
> > readelf about compilation units, directory tables and file tables.  This can be
> > used to get the full path of all the source files from within the package being
> > compiled that are referenced in the debugging information.  Also, placeholder
> > symbols (e.g. <builtin>) and paths outside the current source (e.g. linked
> > libraries) will be more consistently excluded from the debug
> > packages.
> I... don't really follow what this sed thing is doing. Is it printing
> out a bash script? Why?

A fairly complex state needs to be saved to give an output useful for
packaging.  The things that are needed are the compilation directory
for each compilation unit and the file and directory tables for the
corresponding compilation units.  The way this is structured in the
file and the way readelf presents this information isn't totally
linear.  DW_AT_comp_dir is in the .debug_info section, whilst the file
and directory tables are in the .debug_line section, where the file
table refers to the directory table.  readelf tries to resolve things
with the --debug-dump=decodedline option, but the compilation
directories are lost and the directory tables aren't used consistently
in the output (at least in my version of readelf, i.e. 2.34).

Keeping track of this state information is completely impractical in
sed, and confusing in awk.  Using bash associative arrays was an easy
solution that came to mind, so I just used that and parsed the output
in a subshell.  Perhaps there is a better way.
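
To illustrate the kind of bookkeeping involved, here is a minimal
sketch, not the code from the patch.  It assumes a made-up, simplified
record format on stdin ("comp_dir", "dir" and "file" lines) instead of
real readelf output, and the function name resolve_sources is purely
hypothetical.  It keeps one directory table per compilation unit in a
bash associative array, resolves each file entry to a full path, and
drops placeholders and anything outside the given source directory:

```bash
#!/usr/bin/env bash
# Sketch only: the input records are a hypothetical simplification,
# NOT the real readelf output format.
#   comp_dir <path>          start of a new compilation unit
#   dir <index> <path>       directory table entry for the current unit
#   file <dir-index> <name>  file table entry for the current unit
# Paths are assumed to contain no whitespace.

# Usage: resolve_sources <srcdir> < records
resolve_sources() {
    local srcdir=$1
    local -A dirs=()          # key "<unit>:<dir-index>" -> absolute directory
    local unit=0 comp_dir="" kind a b path

    while read -r kind a b; do
        case $kind in
            comp_dir)
                # New compilation unit; directory index 0 is its comp dir.
                unit=$((unit + 1))
                comp_dir=$a
                dirs["$unit:0"]=$comp_dir
                ;;
            dir)
                # Relative directory entries are relative to the comp dir.
                if [[ $b == /* ]]; then
                    dirs["$unit:$a"]=$b
                else
                    dirs["$unit:$a"]=$comp_dir/$b
                fi
                ;;
            file)
                # Skip placeholder names such as <built-in>.
                if [[ $b == "<"* ]]; then
                    continue
                fi
                if [[ $b == /* ]]; then
                    path=$b
                else
                    path=${dirs["$unit:$a"]}/$b
                fi
                # Only keep files that live inside the source directory.
                if [[ $path == "$srcdir"/* ]]; then
                    printf '%s\n' "$path"
                fi
                ;;
        esac
    done
    return 0
}
```

Fed the records "comp_dir /build/src", "dir 1 lib", "dir 2 /usr/include",
"file 0 ls.c", "file 1 system.h" and "file 2 stdio.h" with /build/src as
the source directory, this prints /build/src/ls.c and
/build/src/lib/system.h and leaves out stdio.h.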
