It is quite a clever idea. I haven't seen this approach before. I haven't looked at it thoroughly, but it looks like you're simply sourcing the PKGBUILD with some trickery to avoid executing the code. Why then the need for further parsing? Does `set` produce "raw" bash, e.g. 'source=("https://localhost/$pkgname.tgz")'? It seems like bash should be able to do the expansion itself. If that were the case, the parser would be extremely reliable (definitely more so than mine). There are still some "safety" issues involved, although maybe not for your purposes. One major thing is infinite loops - there's no way to break out of them. I'm sure this parser will be very useful when such things aren't an issue.
You haven't fully understood how it works, so I hope you don't mind if I try to explain it again.

I first check the PKGBUILD with "/bin/bash -n PKGBUILD". If this command exits without error then the PKGBUILD contains valid syntax. Most importantly, it does not contain extra closing braces ("}"). This lets me wrap the entire PKGBUILD in a function, e.g.

    pkgbuild () { <PKGBUILD> }

I can then source the file with Bash without executing any code. The previous check with "bash -n" guarantees that the PKGBUILD cannot escape the wrapping function. Because all code is inside a function, sourcing the file does not execute any of it. Bash simply parses the file and stores the code in the "pkgbuild" function, which itself contains the PKGBUILD's variables and functions (e.g. package_foo, build).

Because the code has not been executed, the variables have not been expanded/interpolated and thus still contain things such as "http://example.com/$pkgname-$pkgver.tar", which is why they must still be interpolated by the parser.

The advantage of this method is that "set" will print out the "pkgbuild" function and its contents in a canonical form, e.g. each variable assignment is on a single line, if/then/else statements follow a single format, etc. This makes it possible to easily parse the assignments themselves, in the order that they occur, without having to consider all variations of valid whitespace in statements. The parser simply needs to recognize Bash syntax for things such as string substitutions, but this is a relatively limited set, so it is not difficult to handle all such cases.

The output of "set" also guarantees that you have a representation of all variable assignments (in sequential order, and within their local environment), so you have all the information that you need to interpolate them. You could even handle command output if you wish, using a command whitelist to make sure that no trickery is used to run malicious code.
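To make the steps above concrete, here is a minimal sketch of the check, wrap and print sequence. It is not the actual parser. The helper name canonical_pkgbuild is hypothetical, the wrapped text is piped to a separate bash process instead of being sourced into the current shell, and the function is printed with "declare -f pkgbuild" rather than picked out of the full "set" output (both print function bodies in the same normalized form). Like the description above, it relies entirely on the "bash -n" check to keep the PKGBUILD inside the wrapper.

```shell
# Hypothetical helper: print a PKGBUILD in bash's canonical form
# without executing any of its code.
canonical_pkgbuild() {
    local file=$1

    # Step 1: syntax check only. With -n, bash reads the commands
    # but does not execute them.
    /bin/bash -n "$file" || return 1

    # Step 2: wrap the whole file in a function definition and let a
    # child bash parse it. Defining a function does not run its body.
    # Step 3: print the stored definition, which bash normalizes.
    {
        printf 'pkgbuild () {\n'
        cat "$file"
        # The newline guards against files that lack a trailing
        # newline or end in a comment.
        printf '\n}\n'
        printf 'declare -f pkgbuild\n'
    } | /bin/bash
}

# Usage (path is illustrative):
#   canonical_pkgbuild ./PKGBUILD
```

The printed body still contains unexpanded text such as "$pkgname", so the interpolation step described above is still required afterwards.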
Let me repeat that my method does not run any code in the PKGBUILD. I've tested this by including an infinite loop at the top of the file and it was not executed. I actually believe that this method provides a perfectly safe and potentially very reliable way of retrieving all metadata in the PKGBUILD with very few dependencies and considerable portability. Regards, Xyne