[arch-projects] New PKGBUILD parsing php script

tardo tardo at nagi-fanboi.net
Sun Oct 7 01:44:09 EDT 2007


Greetings, I'm really tired so I'll try to keep this as short as possible.
Instead of working on things I should be working on, like my AUR 1.5 
goals, or my homework or my research project, I decided to write a new 
parsing algorithm that hopefully gets adopted into AUR code.

Using PHP_LexerGenerator from the PEAR library, I was able to 
automagically create a PHP class that parses the entire PKGBUILD and 
checks it for sanity against the current Arch PKGBUILD standards. I wrote 
a quick script to test the algorithm and have been finding bugs here 
and there, but I need a broader audience and many more test cases. So 
without further ado, the test site:

http://nagi-fanboi.net/arch/pkgbuild/parser.php

Usage is simple: click browse, locate a PKGBUILD, click submit. If 
parsing is successful, you'll see the variables relevant to AUR 
displayed; otherwise you'll get a meaningless error. There are currently 
two ways to get an error. One is to ignore the current Arch PKGBUILD 
standards (I highlight some common mistakes below). The other is an 
actual bug in my code, such as a regex that overlooks a valid case. If 
yours falls into the latter category, please email me your PKGBUILD so I 
can patch it up (or you can post it on this ML by replying to this 
email).

I should mention that the parsing algorithm is a lexical scanner, and if 
you don't know what that is, just remember that it's very strict. The 
smallest mistake will lead to an error. Here are some common errors I 
found in random PKGBUILDs:

- license must always be an array
- pkgname must be all lowercase letters
- pkgver cannot contain '-'
- the arch=() array is missing
- the license=() array is missing
- custom variables must always start with a _
- (removing empty variables, while not a standard, helps reduce clutter)

... and so on. I've tested many PKGBUILDs locally, but I don't think 
I've covered every case, which is why I'm now posting on the ML for testing.
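
For anyone who wants to pre-check a PKGBUILD before uploading it, the rules 
listed above can be roughly approximated with a few regex checks. This is only 
a Python sketch of those rules, not the generated PHP scanner: the function 
name check_pkgbuild, the KNOWN_VARS list, and the choice to stop at the 
build() line are all assumptions made for illustration, and a real lexical 
scanner is much stricter than this.

```python
import re

# Standard variable names assumed for this sketch; the real scanner's
# list may differ.
KNOWN_VARS = {
    "pkgname", "pkgver", "pkgrel", "pkgdesc", "arch", "url", "license",
    "groups", "depends", "makedepends", "provides", "conflicts",
    "replaces", "backup", "options", "install", "source", "md5sums",
    "noextract",
}
# Variables that the list above says must be present as arrays.
ARRAY_VARS = {"arch", "license"}

# A top-level assignment: a name at column 0 followed by '='.
ASSIGN_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def check_pkgbuild(text):
    """Return a list of problems found in a PKGBUILD given as a string."""
    problems = []
    seen = {}
    for line in text.splitlines():
        if line.startswith("build()"):
            break  # only inspect assignments that appear before build()
        match = ASSIGN_RE.match(line)
        if not match:
            continue
        name, value = match.group(1), match.group(2).strip()
        seen[name] = value
        if name not in KNOWN_VARS and not name.startswith("_"):
            problems.append(f"custom variable '{name}' must start with _")

    for name in sorted(ARRAY_VARS):
        if name not in seen:
            problems.append(f"no {name}=() array")
        elif not seen[name].startswith("("):
            problems.append(f"{name} must be an array")

    pkgname = seen.get("pkgname", "").strip("'\"")
    if pkgname != pkgname.lower():
        problems.append("pkgname must be lowercase")
    if "-" in seen.get("pkgver", ""):
        problems.append("pkgver cannot contain '-'")
    return problems
```

Calling check_pkgbuild(open("PKGBUILD").read()) returns an empty list when 
none of these particular rules are violated. An empty list does not mean the 
test site will accept the file.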

Here are some of the benefits to using this script:
- every variable in the PKGBUILD is properly parsed, and checked for sanity
- url is displayed properly even if custom variables are used 
(which is a problem with the current script)
- the script uses a decent OOP structure, hence every variable is 
readily available once parsed (including arch, groups, even the entire 
build function)
- this script will definitely help enforce pkgbuild standards
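
To illustrate the url point above: once every assignment has been parsed, 
references such as $_site or ${pkgname} can be substituted before the url is 
displayed. The Python sketch below shows one way that could work. It is not 
how the PHP class does it; parse_scalars and expand are hypothetical names, 
arrays are skipped, and unlike bash it also expands references inside single 
quotes.

```python
import re

# A top-level assignment: a name at column 0 followed by '='.
ASSIGN_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$")
# Matches ${name} (group 1) or $name (group 2).
REF_RE = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def expand(value, values):
    """Replace $name and ${name} with already-parsed values."""
    def lookup(match):
        name = match.group(1) or match.group(2)
        # Leave unknown references untouched instead of guessing.
        return values.get(name, match.group(0))
    return REF_RE.sub(lookup, value)


def parse_scalars(text):
    """Collect scalar (non-array) assignments, expanding references in order."""
    values = {}
    for line in text.splitlines():
        match = ASSIGN_RE.match(line)
        if not match or match.group(2).startswith("("):
            continue  # not an assignment, or an array we do not handle here
        raw = match.group(2).strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "'\"":
            raw = raw[1:-1]
        values[match.group(1)] = expand(raw, values)
    return values
```

With a PKGBUILD that sets _site=example.org and pkgname=foo, a line such as 
url="http://$_site/${pkgname}" ends up stored as http://example.org/foo.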

Of course, I have bigger goals for this. I plan to completely rewrite 
the pkgsubmit.php script that is currently in use. Using Archive::Tar 
and this lexical scanner should make the pkg submission process less of 
a pain than the current hackish script (no offence).

Long story short, I need testers. If you have PKGBUILDs lying around 
and free time, please help me test it.
Thank you.

- tardo



