On Sun, Oct 25, 2009 at 7:52 PM, Sebastian Nowicki <sebnow@gmail.com> wrote:
On Sun, Oct 25, 2009 at 4:25 AM, Ryan Coyner <rcoyner@gmail.com> wrote:
Boys,
I've been chatting with Laszlo for a bit. It's very refreshing to see some enthusiasm on the subject of AUR2 again. I've been working on the Lex/Yacc parser for the PKGBUILD; it hasn't been touched since May for a number of reasons, but this thread has helped spark my interest again.
I had a look at your parser, and it might get the job done now, but I think it's too specific and rigid to allow for extensions to the PKGBUILD format. For instance the recent "split package" support would be quite hard to get working with your design. I have started prototyping another parser, which is a more general bash parser (but specific to PKGBUILDs). It will actually have a symbol lookup table and handle functions and conditionals properly (at least seemingly so). This should reduce code duplication (as was the case with many variables in harc), make it flexible and easier to maintain. I have published the code on github[1], but I advise against cloning it, as I will be altering history.
I think this is a better direction to take with the parser, rather than hardcoding the variable formats (url, email, number, etc - they're all strings in bash) and such.
Yeah, I agree with Sebastian here. We spoke about it, and there is no need for hardcoding or for code duplication.
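To make the symbol-table idea above concrete, here is a minimal sketch in Python. It is not the pkgparse or harc code: the function names (`parse_assignments`, `evaluate`, `expand`) are hypothetical, and it only handles plain `name=value` and `name=(...)` assignments. Functions, conditionals and array items containing spaces are deliberately left out. The point is only that every value is treated as a string (or list of strings) and resolved through one lookup table, rather than hardcoding a format per variable.

```python
import re

# Hypothetical sketch; names and scope are assumptions, not the real parser.
ASSIGN = re.compile(r'^([A-Za-z_][A-Za-z0-9_]*)=(.*)$')
VAR = re.compile(r'\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?')


def expand(text, symbols):
    """Replace $name / ${name} with the value from the symbol table."""
    def lookup(match):
        value = symbols.get(match.group(1), "")
        # An array referenced as a scalar yields its first element in bash.
        if isinstance(value, list):
            return value[0] if value else ""
        return value
    return VAR.sub(lookup, text)


def evaluate(token, symbols):
    """Strip one level of quoting; expand unless single-quoted."""
    if len(token) >= 2 and token[0] == token[-1] == "'":
        return token[1:-1]  # single quotes keep the text literal
    if len(token) >= 2 and token[0] == token[-1] == '"':
        token = token[1:-1]
    return expand(token, symbols)


def parse_assignments(source):
    """Build a symbol table from top-level variable assignments."""
    symbols = {}
    for line in source.splitlines():
        match = ASSIGN.match(line.strip())
        if not match:
            continue  # anything that is not an assignment is skipped here
        name, raw = match.groups()
        if raw.startswith("(") and raw.endswith(")"):
            # Naive whitespace split: items with spaces are not supported.
            items = raw[1:-1].split()
            symbols[name] = [evaluate(item, symbols) for item in items]
        else:
            symbols[name] = evaluate(raw, symbols)
    return symbols
```

Calling `parse_assignments` on the text of a PKGBUILD returns a dictionary keyed by variable name, so a new variable needs no new parsing code.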
Before we start doing some serious coding though, we really need to take a step back and go back to the drawing board.
From reading this thread, we can break down what we want to accomplish into 3 separate components:
Client - (Web/CLI Frontend)
Data - (PKGBUILD)
Server - (Repository)
Obviously each component will have its own sub-modules. Let's not get these core components mixed up though. We first need to completely nail the specifications of the data and the structure of a source-package (non-binary).
I think for the time being, simply porting the current AUR (with some modifications) to a new, more flexible and maintainable system is critical. It should be much easier to change implementation details later.
Yeah, but the principle is to be as compatible with the 'old' AUR as we can, so most of the information in the existing database is worth keeping.
That means comments on Loui's RFC and overall brainstorming of what a source package consists of. My comments on the RFC and metadata structuring:
1) I like the direction of the proposed PKGINFO. I don't know if the pkgbase/pkgname mechanism is the right answer, but making dependencies an attribute of the architecture makes a lot of sense.
2) Why aren't we using a markup language like XML or JSON? It would make parsing really easy, and making clients would be a lot easier. As for the build() function... can't we just use Makefiles? That's essentially what it is. There was actually a thread on the forums about this.
I don't really see a reason to get rid of the PKGBUILD format. Sure it's a little hard to parse, but once a parser has been developed, there's no problem. It has all the metadata of the package, and we need to store the PKGBUILD (and other source files) on the server anyway. There's no point in duplicating that data.
Are you referring to makepkg in general, or just AUR here? If just AUR I think having a separate Makefile for the build() function is a bit overkill - Makefiles are basically bash scripts wrapped in rules. Why take a bash function out of a bash script to put it in a Makefile which contains bash scripts? If for makepkg, well maybe there is a better format than PKGBUILDs, but if we were to change the format, I think something more controllable would be better (abstract functions that get parsed and executed safely - not bash scripts). However, that's a bit off-topic.
I don't think a Makefile would be right for handling bash scripts either. In fact, I've never seen Makefiles or autobuild systems used for bash scripts :) Their purpose is not relevant here, and it would change things unnecessarily.
XML, JSON, YAML and other formats are good for the API. The current API uses JSON, and it seems to be the best format. It's easy to parse and concise. I have started designing a comprehensive API on the wiki[2].
Yeah, as I have said several times, hehe :) I absolutely support these interfaces, mainly JSON. JSON is easy to handle and very portable among languages, so it's cool, and at least we would stay compatible with the 'old' AUR, so maybe the existing frontends could keep working with only small modifications.
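As an illustration of why JSON keeps clients simple, here is a small Python sketch of encoding and decoding one API reply. The envelope keys (`type`, `results`) and the package field names are assumptions made for the example, not the API designed on the wiki, and the helper names `encode_reply` and `decode_reply` are hypothetical.

```python
import json


def encode_reply(kind, results):
    """Server side: wrap a result in a small envelope and serialise it."""
    # "type" and "results" are assumed key names for this sketch.
    return json.dumps({"type": kind, "results": results}, sort_keys=True)


def decode_reply(payload):
    """Client side: parse the reply and surface errors as exceptions."""
    message = json.loads(payload)
    if message.get("type") == "error":
        raise ValueError(message.get("results"))
    return message["results"]
```

A client in any language with a JSON library needs only the equivalent of `decode_reply`, which is the portability argument made above.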
3) Right now there's a redundancy in how package metadata is treated. If you look at /var/lib/pacman (which deals with binary packages), it splits up the metadata in depends/desc/files/install, etc. If you look at the ABS (deals with source packages), it splits up the metadata in PKGBUILDs and PKGINFOs. Why not just use one or the other?
I agree that some consistency would be nice, but I think PKGBUILDs are the best choice for AUR, as stated above. As far as I'm concerned, formats that pacman deals with are irrelevant. For AUR, the only relevant application is makepkg.
P.S. It seems my previous mail didn't go through.
[1] http://github.com/sebnow/pkgparse/tree/experimental
[2] http://wiki.archlinux.org/index.php/AUR_2#API
It's true that the only relevant application is makepkg, but makepkg is part of the pacman package, so the whole package has to be installed anyway. Since it is installed, we could also use some of the public functions from libalpm.
I spoke with Sebastian about the new database schema for AUR, and he gave me a starting point for it. The schema is very important for the new generation of AUR, because I need it before I can start work on the server-side code, mainly the MySQL queries and inserts, so please share your opinions and suggestions on it:
http://djszapi.homelinux.net/new_database.schema
At first glance, my only suggestion is to keep the TU_Votes and TU_VoteInfo tables in the database.
Best Regards,
Laszlo Papp