Yes, there's not necessarily a problem here. The wrapper function you construct would be syntactically valid and would execute fine. (It would exit whatever shell it's executed in.)
I mention it to point out that you have to be careful regexping through the source (even the Bash-canonically-formatted source) of a function that Bash says is syntactically valid. There may be parts of that source that you assume to be valid Bash but which aren't, because they come after an exit. There may be other ways to do this too, e.g., with an exec.
The wrapper function is never executed so I don't see how this is an issue. The only possible issue would be that the parser would miss the exit and set variables that are set after it, but that's not really an issue either. If someone creates such a PKGBUILD, the PKGBUILD itself is invalid so it doesn't make any different if the parser assigns variables before or after the exit. This is basically the same as pointing out that if the PKGBUILD specifies an incorrect depends array then the parser will also specify an incorrect depends array. The point of the parser is to grab metadata from a PKGBUILD, not magically determine if the PKGBUILD itself is correct in every possible sense of the word. If the packager provides incorrect or nonsensical data then there is little than can be done. It's not an exploit though, because no malicious code is executed.
E.g.:
TESTFILE=$(cat <<"EOF" echo normal stuff exit 0 any funky stuff I want pkgver=#$#%$%%^&^$@#$$@^ } more funky stuff { EOF ) WRAPPED=$'wrapper() {\n'"$TESTFILE"$'\n}' bash -n <<< "$WRAPPED" echo $? # prints 0 bash <<< "$WRAPPED"$'\ntype wrapper'
prints: ======= wrapper is a function wrapper () { echo normal stuff; exit 0; any funky stuff I want; pkgver=#$#%$%%^ & ^$@#$$@^ } more funky stuff { }
The garbage comes out in the output of `set` too, as well as `type wrapper`.
Now if you try regexping that, without truly parsing it, you may if you're not careful find yourself processing the garbage at the end. That may make exploits possible.
I like the approach you're using, but this loophole occurred to me and I thought you should know about it. You'll have to decide whether it's worth protecting against.
Can you an example of an exploit using this method? I really can't think of any. If you are suggesting that the trailing "{" would throw off the parser then you are wrong. The output of the set, as I've stressed previously, is canonical. This permits the unequivocal parsing of function blocks because they are opened and closed on their own lines and thus code inside the block cannot both pass the validity check and appear on its own line.
But that makes salient the question of what advantage you're aiming to get by what you're doing rather than just checking the PKGBUILD with bash -n and if that works, sourcing it with some appended echos to print out the variables we're interested in. How can you source the packaging functions to get the variables nested in the packaging functions without executing the rest of the code inside those functions? How can you source potentially malicious PKGBUILDs safely?
One possible advantage would be to protect against malicious stuff. If that's your aim, then you should be protecting against exploits like this may make possible.
The only possible vulnerabilty of this method will be in the implementation of a function whilelist (e.g. uname -r). So far I see no other possible exploits which would allow someone to get the parser to run malicious code. Regards, Xyne