[aur-dev] AUR2

Sun Oct 25 18:58:54 EDT 2009

On Sun, Oct 25, 2009 at 3:13 PM, Laszlo Papp <djszapi at archlinux.us> wrote:

> On Sun, Oct 25, 2009 at 7:52 PM, Sebastian Nowicki <sebnow at gmail.com>
> wrote:
>
> > On Sun, Oct 25, 2009 at 4:25 AM, Ryan Coyner <rcoyner at gmail.com> wrote:
> >
> > > Boys,
> > >
> > > I've been chatting with Laszlo for a bit. It's very refreshing to see
> > > some enthusiasm on the subject of AUR2 again. I've been working on the
> > > Lex/Yacc parser for the PKGBUILD, but it hasn't been touched since May
> > > for a number of reasons; this thread has helped spark my interest again.
> >
> > I had a look at your parser, and it might get the job done now, but I
> > think it's too specific and rigid to allow for extensions to the
> > PKGBUILD format. For instance the recent "split package" support would
> > be quite hard to get working with your design. I have started
> > prototyping another parser, which is a more general bash parser (but
> > specific to PKGBUILDs). It will actually have a symbol lookup table
> > and handle functions and conditionals properly (at least seemingly
> > so). This should reduce code duplication (as was the case with many
> > variables in harc), make it flexible and easier to maintain. I have
> > published the code on github[1], but I advise against cloning it, as I
> > will be altering history.
> >
> > I think this is a better direction to take with the parser, rather
> > than hardcoding the variable formats (url, email, number, etc -
> > they're all strings in bash) and such.
> >
> >
> Yeah, I agree with Sebastian here; we spoke about it, and there is no need
> for hardcoding or for code duplication.
>
>
Sounds good to me.

>
>
> > > Before we start doing some serious coding though, we really need to
> > > take a step back and go back to the drawing board.
> > >
> > > From reading this thread, we can break down what we want to accomplish
> > > into 3 separate components:
> > >
> > > Client - (Web/CLI Frontend)
> > > Data - (PKGBUILD)
> > > Server - (Repository)
> > >
> > > Obviously each component will have its own sub-modules. Let's not get
> > > these core components mixed up though. We first need to completely nail
> > > the specifications of the data and the structure of a source-package
> > > (non-binary).
> >
> > I think for the time being, simply porting the current AUR (with some
> > modifications) to a new, more flexible and maintainable system is
> > critical. It should be much easier to change implementation details
> > later.
> >
> >
> Yeah, but the principle is to be as compatible with the 'old' AUR as we
> can, so most of the information in the database is worth following.
>

By more flexible and maintainable, I'm assuming you mean implementing an API
(to allow clean command-line and web access) and porting the web interface
to Django, which is definitely a much more maintainable framework.

+1 for this notion.

>
>
>
> > > That means comments on Loui's RFC and overall brainstorming of
> > > what consists of a source-package. My comments on the RFC and metadata
> > > structuring:
> > >
> > > 1) I like the direction of the proposed PKGINFO. I don't know if the
> > > pkgbase/pkgname mechanism is the right answer, but making dependencies
> > > an attribute of the architecture makes a lot of sense.
> > >
> > > 2) Why aren't we using a markup language like XML or JSON?  It would
> > > make parsing really easy, and making clients would be a lot easier. As
> > > for the build() function... can't we just use Makefiles? That's
> > > essentially what it is. There was actually a thread on the forums about
> > > this.
> >
> > I don't really see a reason to get rid of the PKGBUILD format. Sure
> > it's a little hard to parse, but once a parser has been developed,
> > there's no problem. It has all the metadata of the package, and we
> > need to store the PKGBUILD (and other source files) on the server
> > anyway. There's no point in duplicating that data.
> >
> > Are you referring to makepkg in general, or just AUR here? If just AUR
> > I think having a separate Makefile for the build() function is a bit
> > overkill - Makefiles are basically bash scripts wrapped in rules. Why
> > take a bash function out of a bash script to put it in a Makefile
> > which contains bash scripts?  If for makepkg, well maybe there is a
> > better format than PKGBUILDs, but if we were to change the format, I
> > think something more controllable would be better (abstract functions
> > that get parsed and executed safely - not bash scripts). However,
> > that's a bit off-topic.
> >
> >
> I don't think a Makefile would be a good fit for bash script handling
> either; in fact I've never seen Makefiles or autobuild systems used for
> bash scripts :) Their purpose is not relevant here, and it would change
> things unnecessarily.
>

Regarding JSON/Makefile, I was suggesting that we dump the PKGBUILD format
altogether and use JSON to carry the metadata and a Makefile to substitute
for the build function. Now, to draw some comparisons:

Advantages of the PKGBUILD format:

1) It's a simple format that allows new users to easily contribute to the AUR.
2) It encapsulates both metadata and build instructions in a single file.
Again, this promotes simplicity for the user.
3) No need to hack up makepkg.

Advantages of JSON/Makefile:

1) JSON is a standardized format. Very 3rd party friendly.
2) It removes a layer of abstraction that the PKGBUILD format introduces.
Using JSON, you can serialize the metadata and serve that directly to the
web interface. This is very easy to do using Django and simplejson. With the
PKGBUILD format, you have to parse it first to serve it as JSON... it
violates the KISS principle from an engineering perspective.
3) No need to write our custom parser. This is key, in my opinion. Instead,
we'd have to hack up makepkg to accept this new format. Doing that is a lot
easier.
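
To make point 2 concrete, here is a minimal sketch of what serving the
metadata directly could look like. The field names mirror common PKGBUILD
variables, with dependencies keyed by architecture, but they are my own
assumption and not an agreed format. The standard-library json module stands
in for simplejson here.

```python
import json

# Hypothetical metadata for one source package. The keys are assumptions
# chosen to resemble PKGBUILD variables; nothing here is an agreed format.
EXAMPLE_PACKAGE = {
    "pkgname": "example",
    "pkgver": "1.0",
    "pkgrel": 1,
    "license": ["GPL"],
    # Dependencies stored as an attribute of the architecture.
    "depends": {"i686": ["glibc"], "x86_64": ["glibc"]},
}


def serialize_package(meta):
    """Return the metadata as a JSON string with a stable key order."""
    return json.dumps(meta, sort_keys=True)


def load_package(text):
    """Parse a JSON string back into a metadata dict."""
    return json.loads(text)


if __name__ == "__main__":
    print(serialize_package(EXAMPLE_PACKAGE))
```

The same string could be handed to the web interface or a CLI client without
an intermediate parsing step, which is the whole argument.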

>
>
> > XML, JSON, YAML and other formats are good for the API. The current
> > API uses JSON, and it seems to be the best format. It's easy to parse
> > and concise. I have started designing a comprehensive API on the
> > wiki[2].
> >
> >
> Yeah, as I've said several times, hehe :) I absolutely support these
> interfaces, mainly JSON. JSON is easy to handle and very portable among
> languages, so it's cool, and at least we would stay compatible with the
> 'old' AUR, so maybe the existing frontends could keep working with minor
> modifications.
>
>
> > > 3) Right now there's a redundancy in how package metadata is treated.
> > > If you look at /var/lib/pacman (which deals with binary packages), it
> > > splits up the metadata in depends/desc/files/install, etc. If you look
> > > at the ABS (deals with source packages), it splits up the metadata in
> > > PKGBUILDs and PKGINFOs. Why not just use one or the other?
> >
> > I agree that some consistency would be nice, but I think PKGBUILDs are
> > the best choice for AUR, as stated above. As far as I'm concerned,
> > formats that pacman deals with are irrelevant. For AUR, the only
> > relevant application is makepkg.
> >
> > P.S. It seems my previous mail didn't go through.
> >
> > [1] http://github.com/sebnow/pkgparse/tree/experimental
> > [2] http://wiki.archlinux.org/index.php/AUR_2#API
> >
>
> It's true that the only relevant application is makepkg, but you need to
> install the whole pacman package to get it, so if we have to install it
> anyway, we can also use some of the public functions from libalpm, such
> as its common functions.
>
> I spoke with Sebastian about the new database schema for AUR, and he gave
> me a starting point for it. It's a very important part of the new
> generation AUR for me, so that I can start working on the server side
> code, mainly the MySQL queries and inserts, so please share your
> opinions/suggestions on it.
>
> http://djszapi.homelinux.net/new_database.schema
>
> My only suggestion, at first glance, is to keep the TU_Votes and
> TU_VoteInfo tables in the database.
>
> Best Regards,
> Laszlo Papp
>

One thing to note is licenses. Some software has custom licenses - how is
that going to be represented? Another thing to keep in mind is possible
integration of the user accounts with other services in the future (bugs,
forums, etc). It would be ideal if the schema were flexible enough to take
that into account.
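
For the license question, one possible shape (purely a sketch, not taken
from the proposed schema) is a separate licenses table with a flag for
custom entries, joined to packages through a link table. All table and
column names below are hypothetical, and an in-memory SQLite database
stands in for MySQL just to keep the example self-contained.

```python
import sqlite3

# Hypothetical tables: none of these names come from the proposed schema.
SCHEMA = """
CREATE TABLE packages (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE licenses (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    is_custom INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE package_licenses (
    package_id INTEGER NOT NULL REFERENCES packages(id),
    license_id INTEGER NOT NULL REFERENCES licenses(id),
    PRIMARY KEY (package_id, license_id)
);
"""


def build_demo():
    """Create an in-memory database with one package and two licenses."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO packages (id, name) VALUES (1, 'example')")
    conn.execute(
        "INSERT INTO licenses (id, name, is_custom) VALUES (1, 'GPL', 0)"
    )
    conn.execute(
        "INSERT INTO licenses (id, name, is_custom) "
        "VALUES (2, 'custom:example', 1)"
    )
    conn.executemany(
        "INSERT INTO package_licenses VALUES (?, ?)", [(1, 1), (1, 2)]
    )
    return conn


def licenses_for(conn, package_name):
    """Return (license name, is_custom) pairs for a package, sorted by name."""
    rows = conn.execute(
        "SELECT l.name, l.is_custom FROM licenses l "
        "JOIN package_licenses pl ON pl.license_id = l.id "
        "JOIN packages p ON p.id = pl.package_id "
        "WHERE p.name = ? ORDER BY l.name",
        (package_name,),
    ).fetchall()
    return [(name, bool(custom)) for name, custom in rows]
```

A link table also covers packages released under more than one license,
which a single license column on the package row would not.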

-- 
Ryan Coyner [http://ryancoyner.com]