[pacman-dev] RFC: Really running package_* functions in rbash during printsrcinfo
Howdy --

I recently had need to dig into the implementation of makepkg --printsrcinfo, and ran into the "running regular expressions against source code" operations in the backend.

Obviously, this is not ideal. Indeed, I've previously written packages (doing unusual and typically-undesirable things, granted) with conditional logic *assuming* that actual execution would be taking place.

I fully appreciate the decision not to try to go with more expansive attempts at emulating bash parsing/execution in the future, but do folks have any thoughts on **really** executing PKGBUILDs in a restricted environment, including execution of the package_* functions?

See a simple sandboxed parser for config files implemented as bash code in code I've written for NixOS at https://github.com/charles-dyfis-net/nixpkgs/blob/f50bfe267a312515d88e86c12a....

We might need a little more complexity here -- using DEBUG traps to avoid "|| exit" logic from aborting, f/e -- but my initial impression is that "more accurate than the current implementation" (and maybe a fair bit faster, if we extract all variables in one subshell per function) is not a hard goal to achieve.

Thoughts?
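A minimal sketch of the DEBUG-trap idea mentioned above, not taken from the linked NixOS code. It relies on documented bash behavior: with `shopt -s extdebug`, a non-zero status from the DEBUG trap causes the next command to be skipped, so a trap that rejects anything starting with `exit` keeps "|| exit" from ending the shell. The function name `guarded_eval` and the sample snippet are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: skip `exit` via a DEBUG trap so "cmd || exit 1" cannot abort
# variable extraction. guarded_eval and the snippet are hypothetical.

guarded_eval() {
  # $1: shell code to evaluate in a child bash; prints pkgver afterwards.
  "$(type -P bash)" -c '
    # With extdebug on, a non-zero DEBUG trap status skips the next command.
    shopt -s extdebug
    # The trap returns 1 (skip) only when the pending command starts with exit.
    trap "[[ \$BASH_COMMAND != exit* ]]" DEBUG
    eval "$1"
    declare -p pkgver
  ' _ "$1" 2>/dev/null
}

# Without the trap, the failing command would end the shell before
# pkgver is ever assigned.
snippet='false || exit 1
pkgver=1.2.3'

out=$(guarded_eval "$snippet")
printf '%s\n' "$out"
```

The pattern match is deliberately crude (it would also skip a command that merely begins with the letters "exit"); a real implementation would need to decide how to treat `return`, `set -e`, and similar constructs.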
On 1/6/19 1:58 PM, Charles Duffy wrote:
> Howdy --
>
> I recently had need to dig into the implementation of makepkg --printsrcinfo, and ran into the "running regular expressions against source code" operations in the backend.
>
> Obviously, this is not ideal. Indeed, I've previously written packages (doing unusual and typically-undesirable things, granted) with conditional logic *assuming* that actual execution would be taking place.
>
> I fully appreciate the decision not to try to go with more expansive attempts at emulating bash parsing/execution in the future, but do folks have any thoughts on **really** executing PKGBUILDs in a restricted environment, including execution of the package_* functions?
>
> See a simple sandboxed parser for config files implemented as bash code in code I've written for NixOS at https://github.com/charles-dyfis-net/nixpkgs/blob/f50bfe267a312515d88e86c12a.... We might need a little more complexity here -- using DEBUG traps to avoid "|| exit" logic from aborting, f/e -- but my initial impression is that "more accurate than the current implementation" (and maybe a fair bit faster, if we extract all variables in one subshell per function) is not a hard goal to achieve.
>
> Thoughts?

How would this work considering that it would have to actually do things like cd into $pkgdir, attempt to run /usr/bin/make, and so on?

Setting the PATH to something empty won't help with what I'd guess is the primary use of complex functions in the wild, as discussed here: https://bugs.archlinux.org/task/58776

Namely, executing /usr/bin/perl in order to discover its version and implement dependency ranges.

-- 
Eli Schwartz
Bug Wrangler and Trusted User

How it would work is that those operations fail, and we'd let them fail --
we don't need them to succeed for the (global) variables we're there for to
be set.
As a proof-of-concept-y example (obviously, we'd want to suppress all the
"command not found"s, "cd"s, etc unless the user has turned up the
verbosity level a bit), see the data correctly extracted at the end of the
below:
$ env -i PATH=/var/empty ENV='' "$(type -P bash)" -r -c 'eval "$(<PKGBUILD)" >&2; package_postgresql >&2; declare -p pkgdesc backup depends optdepends options install'
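
The same approach can be tried end to end without a real package. The following is a minimal sketch under stated assumptions: the PKGBUILD content, the `package_demo` function, and the `extract_vars` helper are all hypothetical, and `/var/empty` is used only as a PATH entry that contains no commands. The `cd` fails because restricted bash disables it, `make` fails because nothing is on PATH, and the variable assignments still take effect.

```shell
#!/usr/bin/env bash
# Sketch: evaluate a PKGBUILD-like file in restricted bash with an unusable
# PATH, let external commands fail, then dump the variables of interest.

# Hypothetical PKGBUILD, written to a temp file for this demonstration only.
demo_pkgbuild=$(mktemp)
cat >"$demo_pkgbuild" <<'EOF'
pkgname=demo
pkgdesc="top-level description"
depends=('glibc')
package_demo() {
  pkgdesc="description set inside package_demo"
  depends+=('openssl')
  cd "$pkgdir"                     # fails: cd is disabled in restricted mode
  make DESTDIR="$pkgdir" install   # fails: make is not found on PATH
}
EOF

extract_vars() {
  # $1: path to the PKGBUILD, $2: name of the package function to run.
  # Anything the PKGBUILD prints goes to stderr, which is discarded here;
  # only the declare -p output reaches stdout.
  env -i PATH=/var/empty ENV='' "$(type -P bash)" -r -c '
    eval "$(<"$1")" >&2
    "$2" >&2
    declare -p pkgdesc depends
  ' _ "$1" "$2" 2>/dev/null
}

result=$(extract_vars "$demo_pkgbuild" package_demo)
printf '%s\n' "$result"
rm -f "$demo_pkgbuild"
```

Both variables come back from a single child process, which is the "one subshell per function" cost model discussed in the thread. Dropping the `2>/dev/null` shows the "command not found" and "restricted" messages that a verbose mode could surface.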

Apologies for not having fully internalized your post before responding -- you make a good point that there's functionality for which we need real, unrestricted evaluation.

Whether that functionality is worthwhile is a different matter -- my immediate use case is one where I care about extracting as-accurate-as-possible data for a large number of packages *quickly*, and I'm actually somewhat unhappy with how expensive the current approach taken by makepkg is (it spawns considerably more subprocesses than would be strictly needed if we streamlined it).