On Thu, 20 Feb 2020 at 13:41, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 2/19/20 8:51 PM, Austin Lund wrote:
Currently only the file pointed to by the DW_AT_name is included as a source file in debug packages. This means many files that are useful for debugging are not included. For example, no header files are included, yet these may be referenced in the .debug_line section.
Why do we need headers? Headers, by definition, are supposed to declare things that are then defined by a source file, which is the one in DW_AT_name.
Some function definitions live in header files. For example, people will define short inline functions in header files, and the line debugging information then points into those headers. Just a random example:

    gdb ls
    b initialize_exit_failure
    r does_not_exist

A code listing in GDB should point to the inline function definition in system.h in coreutils, but currently that file is missing from a built coreutils-debug package.
This sed script converts the debug dump information from readelf about compilation units, directory tables and file tables into shell variables. These can be used to get the full path of every source file within the package being compiled that is referenced in the debugging information. Also, placeholder symbols (e.g. <builtin>) and paths outside the current source (e.g. linked libraries) will be more consistently excluded from the debug packages.
I... don't really follow what this sed thing is doing. Is it printing out a bash script? Why?
A fairly complex state needs to be saved to give an output useful for packaging. What is needed is the compilation directory for each compilation unit, plus the file and directory tables for that compilation unit. The way this is laid out in the file, and the way readelf presents it, isn't totally linear: DW_AT_comp_dir is in the .debug_info section, whilst the file and directory tables are in the .debug_line section, where the file table refers to the directory table. readelf tries to resolve things with the --debug-dump=decodedline option, but the compilation directories are lost and the directory tables aren't used consistently in the output (at least in my version of readelf, i.e. 2.34). Keeping track of this state information is completely impractical in sed, and confusing in awk. Using bash associative arrays was an easy solution that came to mind, so I just used that and parsed the output in a subshell. Perhaps there is a better way.
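
As a rough illustration of the associative-array idea, here is a minimal sketch. It is not the actual patch: the table contents are made-up sample data rather than real readelf output, and resolve_sources, dir_table and file_table are hypothetical names. It assumes the pre-DWARF-5 convention that directory index 0 means the compilation directory, and that relative directory entries are relative to it.

```bash
#!/usr/bin/env bash
# Sketch only: sample data and helper names are hypothetical.

# Directory table for one compilation unit: index -> directory.
declare -A dir_table=(
    [1]="src"
    [2]="/usr/include"
)

# File table for the same unit: index -> "<directory index> <file name>".
declare -A file_table=(
    [1]="1 ls.c"
    [2]="1 system.h"
    [3]="2 stdio.h"
    [4]="0 <builtin>"
)

# Print the full path of every file-table entry that lives under $2,
# resolving relative directories against the compilation directory $1.
resolve_sources() {
    local comp_dir=$1 srcdir=$2
    local idx dir_idx name dir path
    for idx in "${!file_table[@]}"; do
        read -r dir_idx name <<< "${file_table[$idx]}"
        # Skip placeholder entries such as <builtin>.
        if [[ $name == "<"*">" ]]; then
            continue
        fi
        # Directory index 0 refers to the compilation directory (assumption:
        # DWARF versions before 5).
        if (( dir_idx == 0 )); then
            dir=$comp_dir
        else
            dir=${dir_table[$dir_idx]}
        fi
        # Relative directories are taken relative to the compilation directory.
        if [[ $dir != /* ]]; then
            dir=$comp_dir/$dir
        fi
        path=$dir/$name
        # Keep only files inside the package's own source tree.
        if [[ $path == "$srcdir"/* ]]; then
            printf '%s\n' "$path"
        fi
    done | sort
}
```

Calling it as `resolve_sources /build/coreutils/src/coreutils-8.31 /build/coreutils/src` (both paths are invented for the example) would list ls.c and system.h under the source tree, while dropping /usr/include/stdio.h and the <builtin> placeholder.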