[aur-dev] [PATCH] Use bash script to parse pkgbuilds

Sebastian Nowicki sebnow at gmail.com
Fri Jun 20 13:04:32 EDT 2008


On 20/06/2008, at 4:22 PM, Simo Leone wrote:

> On Fri, Jun 20, 2008 at 12:54:29AM +0800, Callan Barrett wrote:
>> Here's another iteration of this patch, I'm still looking for as much
>> input as possible but this is basically what I would push to testing
>> at this point. The script now outputs in a different format to be
>> parsed and there is some cleanup done in pkgsubmit.php to get it
>> working more cleanly with the script.
>>
> Unfortunately Callan and I found a way to easily defeat this tonight,
> the proof-of-concept is attached, the attack is based on this little bit
> about restricted shells (from the manpage):
> ---
> When a command that is found to be a shell script is executed (see
> COMMAND EXECUTION above), rbash turns off any restrictions in the shell
> spawned to execute the script.
> ---
>
> Too bad too, real bash parsing would have been nice :/
>
> -S
> <script.txt><PKGBUILD.txt><fucked.sh>

Perhaps we should write our own tiny bash parser. I've never really
done anything like that before, but after a little googling I found
two tools that could simplify the whole process a lot: Lex and YACC.
Lex tokenizes the source code and YACC recognizes higher-level
patterns (expressions, assignments, etc.). I believe YACC generates C
source for the parser, so we could compile and run that and use its
output somehow. We mostly only deal with assignments, which should be
fairly straightforward. I have seen a few (official) PKGBUILDs that
use loops to generate the source and md5sums arrays, so we might want
to support that as well. Of course, doing so would introduce the
problem of infinite loops. "If" statements (both the traditional
syntax and the shorthand) would probably be necessary, since checksums
may differ between architectures (a different blob for each
architecture). I don't see any harm in supporting those.
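
To give a rough idea of the assignment-only case, here is a minimal
sketch in Python (not Lex/YACC, just to illustrate the idea). The name
parse_assignments is made up for this example. It assumes we only care
about single-line `name=value` and `name=(a b c)` assignments, ignores
every other line, and rejects command substitution rather than running
anything. Loops, "if" statements and multi-line arrays are not handled.

```python
import re
import shlex

# Hypothetical helper for illustration; not part of the AUR code base.
ASSIGN_RE = re.compile(r'^([A-Za-z_][A-Za-z0-9_]*)=(.*)$')


def parse_assignments(text):
    """Collect simple variable assignments without executing any shell code."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        match = ASSIGN_RE.match(line)
        if match is None:
            continue  # functions, loops, etc. are ignored in this sketch
        name, value = match.group(1), match.group(2)
        # Refuse anything that would require running a command.
        if '$(' in value or '`' in value:
            raise ValueError(f"command substitution not allowed in {name}")
        if value.startswith('(') and value.endswith(')'):
            # Array assignment: split the body using shell-like quoting rules.
            result[name] = shlex.split(value[1:-1])
        else:
            # Scalar assignment: keep the first shell word only.
            words = shlex.split(value)
            result[name] = words[0] if words else ''
    return result
```

Something like parse_assignments(open('PKGBUILD').read()) would then
give a dictionary of the variables. Variable expansion ($pkgname and
friends) is left untouched here, which a real parser would have to
deal with.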

Of course, going down this path means a LOT more work and most likely
a lot more bugs and hair pulling. On the other hand, we would have
very tight control over the parsing and could modify it at will.



