Chris “Kwpolska” Warrick wrote:
Why not adapt the actual Bash parser (in C) to only read and do stuff safely? In most cases, this would be enough. In the others, we already have mess in those fields in the AUR. (my C skills are not appropriate for this)
That is basically what needs to be done, but it is a difficult task. Even if you can adapt the Bash source code to return the AST, you would still need to create an extensive whitelist of executables (both internal and external) that may be run in order to interpolate all of the variables. The code must be able to detect variable assignments nested in the package functions, skip commands that do not affect variables (which may require it to work backwards), count loop cycles to prevent infinite loops, track time to prevent timeouts, etc.

I thought about this when I wrote the Bauerbill PKGBUILD parser, but I gave up trying to find a way to extract the AST using the Bash code. In the end my code would simply wrap the PKGBUILD in a function, source the file, spit it out with "set" to homogenize the syntax, and then parse it with regexes.

I started writing a Bash parser in Haskell with Parsec, but my free time ran out and I had to move on to other things. I think that approach would work quite well if the Bash sources are too tangled to extract the parser, but it is a huge task for one person (word expansion, string manipulation, all of the built-ins, etc.). I would be willing to collaborate on that as well, if there is any interest.

Regards,
Xyne
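
For anyone wondering what the "dump with set, then parse with regexes" step can look like, here is a rough sketch in Python. It is not the Bauerbill code. The sample text, the function name parse_set_output and the regexes are all made up for illustration, and it assumes the usual Bash "set" layout for scalars (name=value or name='value') and indexed arrays (name=([0]="a" [1]="b")). It deliberately ignores function bodies, $'...' quoting and multi-line values.

```python
import re

# Hypothetical sample of what "set" might print after sourcing a PKGBUILD.
SAMPLE = """\
depends=([0]="glibc" [1]="bash")
pkgdesc='An example package'
pkgname=example
pkgver=1.0
"""

# name=(...)  : an indexed array on a single line
ARRAY = re.compile(r"^(\w+)=\((.*)\)$")
# name=value  : a scalar (anything that does not start with an opening parenthesis)
SCALAR = re.compile(r"^(\w+)=(?!\()(.*)$")
# [N]="value" : one array element, allowing backslash escapes inside the quotes
ELEMENT = re.compile(r'\[\d+\]="((?:[^"\\]|\\.)*)"')


def parse_set_output(text):
    """Map variable names to a string (scalar) or a list of strings (array)."""
    result = {}
    for line in text.splitlines():
        match = ARRAY.match(line)
        if match:
            result[match.group(1)] = ELEMENT.findall(match.group(2))
            continue
        match = SCALAR.match(line)
        if match:
            value = match.group(2)
            # Strip one layer of single quotes if the whole value is quoted.
            if len(value) >= 2 and value[0] == value[-1] == "'":
                value = value[1:-1]
            result[match.group(1)] = value
        # Anything else (function definitions, indented lines) is skipped.
    return result


if __name__ == "__main__":
    for name, value in parse_set_output(SAMPLE).items():
        print(name, "=", value)
```

Note that this only covers the parsing half. Sourcing the file to produce that output still executes arbitrary code, which is exactly the problem a real parser would avoid.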