[pacman-dev] [RFC] Package parser in python

Xavier shiningxc at gmail.com
Sat Dec 12 10:44:00 EST 2009

On Sat, Dec 12, 2009 at 3:36 PM, Laszlo Papp <djszapi at archlinux.us> wrote:
> On Sat, Dec 12, 2009 at 3:11 PM, Allan McRae <allan at archlinux.org> wrote:
>> Sebastian Nowicki wrote:
>>> As you may have heard, I started a proper PKGBUILD parser[1], which parses
>>> according to shell semantics and does a little interpreting. I just released
>>> the first version, which doesn't handle errors, or multi-line values (like
>>> arrays or escaped newlines) very well. It does however support split
>>> packages. I'm in the process of modifying parched to essentially turn it
>>> into python bindings[2] for pkgparse.
>>> You probably already have a parser at this point, so I'm not sure how
>>> useful this would be to you (it might be overkill anyway), I just thought I'd
>>> let you know.
>>> [1]: http://github.com/sebnow/pkgparse
>>> [2]: http://github.com/sebnow/parched/tree/pkgparse_pyrex
>> Looks interesting.  I will take it for a spin later. I assume this is going
>> towards AUR2?
> Yes.
>> I had not done any further work on my parser as I was uncertain what was the
>> best way to go in developing a makepkg test suite.  Given the makepkg test
>> suite will use a safe set of PKGBUILDs, I was thinking of just using bash to
>> parse them.
> http://wiki.archlinux.org/index.php/AUR_2#High_priority
> "Parsing of pkgbuilds, we can no longer use bash to do it because bash
> sucks and is riddled with security flaws. This is really important."
> It was discussed with Louipc too on #archlinux-aur earlier, and on the
> forum too; I can't find the log at the moment :( Doing it in bash is
> not the best solution; lex/yacc seems a better fit in this case.
> Some documentation from Sebastian that I'm working with at the moment:
> http://github.com/sebnow/pkgparse/tree/gh-pages

I can't help but think this whole situation is stupid.

I suppose PKGBUILDs were written in bash for simplicity's sake: makepkg
just needs to source them, and that's it. All the parsing comes for free.
But now we realize that with untrusted sources we cannot do that
anymore, so we basically have to rewrite a bash parser from scratch.
It's hard to imagine a more flawed design, or a more complex solution
to a simple problem.

Somehow we managed to go from a very KISS solution to a completely anti-KISS one.

I only see two solutions:
- we keep using bash, but try to do that in the most restricted
environment possible (e.g. the way namcap does it, or maybe there is
something even more restrictive and secure?)
- we decide that the PKGBUILD format is a flawed design that is too
limited for our needs, and switch to a new one (in which case Xyne's
brainstorming could help: http://xyne.archlinux.ca/ideas/pkgmeta )
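
To make the second option a bit more concrete, here is a minimal sketch
in Python of how a purely declarative format could be read without
executing anything. The "key = value" layout, the SAMPLE data and the
parse_metadata name are all made up for illustration; this is not
Xyne's pkgmeta proposal, only an assumption about what such a format
might look like.

```python
# Hypothetical declarative metadata format (an assumption, not pkgmeta):
# one "key = value" per line; repeated keys accumulate into a list.
# Values are treated as plain strings and nothing is ever executed.
SAMPLE = """\
# comment lines and blank lines are ignored
pkgname = foo
pkgver = 1.0
depends = glibc
depends = zlib
"""


def parse_metadata(text):
    """Return a dict mapping each key to the list of its values."""
    fields = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=".
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected 'key = value'")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


if __name__ == "__main__":
    print(parse_metadata(SAMPLE))
```

The point of the sketch is that a value such as "$(rm -rf ~)" would
simply come back as a string, since the reader never hands anything to
a shell. The obvious cost is that build instructions would have to live
somewhere else.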
