[aur-dev] Git repositories for AUR packages
Hi list,
The next major AUR release will come with Git integration. In this email, I am going to summarize previous ideas and make things a bit more concrete (sketch implementation details etc.)
If you are new to this topic, please read [1] and [2] first.
Our plan is to create one Git repository per package base. Each of the Git repositories contains a PKGBUILD and zero or more additionally required files, so you will no longer need to build a source tarball. There will be a pkgbuild-introspection utility to generate and write .AURINFO metadata without automatically putting that file inside the tarball. Florian had the idea to use a shared Git object database for space efficiency.
When creating a new account, users will be able to upload an SSH public key. This key will be stored in the AUR database and can also be changed using the profile edit page. Using this public key, users will be able to push to repositories they maintain using Git over SSH. For authentication, a custom script doing a lookup in the AUR user database will be used as AuthorizedKeysCommand. If a matching key is found, that script prints the corresponding authorized_keys line with a forced command that invokes a wrapper around the Git shell. That wrapper in turn does authorization by checking maintainership and then calls git-shell(1).
Using Git hooks, the .AURINFO metadata of a package is parsed when pushing and the AUR package database is updated accordingly. Hooks will also take care of checking the tree objects for huge files etc. The receive.denyNonFastForwards configuration option will be enabled to prevent users from rewriting the history.
In order to submit new packages, you will be able to generate empty Git repositories via the AUR web interface. During the transition period, all existing source tarballs will be converted to bare repositories with one initial commit whose tree equals the tarball contents.
Instead of the source tarball download link and the PKGBUILD preview, there will be a public clone URL on the package details page. In a second step, cgit will be configured to provide a web-based interface to all the repositories. Then, links to tarball snapshots and to the PKGBUILD preview will be added to the details page as a replacement for the "Download tarball" and "View PKGBUILD" links (all this is supposed to happen before the first release).
As soon as all this is set up, we can add support for multiple maintainers and related features.
If there are any questions or suggestions regarding this setup, please feel free to ask/reply.
Regards, Lukas
[1] https://mailman.archlinux.org/pipermail/aur-dev/2013-March/002411.html
[2] https://mailman.archlinux.org/pipermail/aur-dev/2014-January/002592.html
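To make the AuthorizedKeysCommand step described above more concrete, here is a minimal sketch of the key lookup. It is not the actual AUR implementation. The table layout (Users, Username, SSHPubKey), the wrapper path and the use of SQLite are assumptions made for illustration, and it assumes sshd hands the key type and key to the script (recent OpenSSH versions offer the %t and %k tokens for this).

```python
import re
import sqlite3

# Hypothetical path of the forced-command wrapper; not taken from the AUR code.
WRAPPER = "/usr/local/bin/aur-git-wrapper"
# Standard authorized_keys options that disable everything except the command.
SSH_OPTS = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"


def authorized_keys_line(db, keytype, keytext):
    """Return an authorized_keys line for a known key, or None otherwise."""
    row = db.execute(
        "SELECT Username FROM Users WHERE SSHPubKey = ?",
        (keytype + " " + keytext,),
    ).fetchone()
    if row is None:
        return None
    user = row[0]
    # The user name ends up inside a quoted command string, so be strict.
    if not re.fullmatch(r"[A-Za-z0-9._@-]+", user):
        return None
    return 'command="%s %s",%s %s %s' % (WRAPPER, user, SSH_OPTS, keytype, keytext)


def demo_db():
    """Build a throwaway in-memory database with one made-up user."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Users (Username TEXT, SSHPubKey TEXT)")
    db.execute("INSERT INTO Users VALUES (?, ?)", ("alice", "ssh-rsa AAAAB3Nza"))
    return db
```

A real script would print the returned line to stdout for sshd and print nothing when the key is unknown, so that authentication fails.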
Thanks Lukas for the follow up, and also for the work achieved on AUR 3.0.0!
In order to submit new packages, you will be able to generate empty Git repositories via the AUR web interface. During the transition period, all existing source tarballs will be converted to bare repositories with one initial commit whose tree equals the tarball contents.
Does this mean that all existing source tarballs will obtain full .AURINFO metadata? This is especially important for external tools like helpers. As of AUR 3.0.0, it is difficult to use the dependencies info from the RPC interface to solve the dependency tree, since only the newly uploaded packages have the metadata generated. Regards, R.
Hi Rémy, On Tue, 03 Jun 2014 at 22:31:59, Rémy Marquis wrote:
Thanks Lukas for the follow up, and also for the work achieved on AUR 3.0.0!
Thanks!
In order to submit new packages, you will be able to generate empty Git repositories via the AUR web interface. During the transition period, all existing source tarballs will be converted to bare repositories with one initial commit whose tree equals the tarball contents.
Does this mean that all existing source tarballs will obtain full .AURINFO metadata? This is especially important for external tools like helpers. As of AUR 3.0.0, it is difficult to use the dependencies info from the RPC interface to solve the dependency tree, since only the newly uploaded packages have the metadata generated.
Good question. It is unlikely that we will add .AURINFO metadata to each package, since it would require rebuilding all source tarballs in the AUR. I agree that the transition period is a bit messy but I don't think we can do a lot better. Looking at the AUR statistics, however, it should not take too long until most (working and maintained) packages are supplemented with metadata.
Regards, R.
Regards, Lukas
On Mon, Jun 02, 2014 at 07:08:43PM +0200, Lukas Fleischer wrote:
Hi list,
The next major AUR release will come with Git integration. In this email, I am going to summarize previous ideas and make things a bit more concrete (sketch implementation details etc.)
If you are new to this topic, please read [1] and [2] first.
Our plan is to create one Git repository per package base. Each of the Git repositories contains a PKGBUILD and zero or more additionally required files, so you will no longer need to build a source tarball. There will be a pkgbuild-introspection utility to generate and write .AURINFO metadata without automatically putting that file inside the tarball. Florian had the idea to use a shared Git object database for space efficiency.
When creating a new account, users will be able to upload an SSH public key. This key will be stored in the AUR database and can also be changed using the profile edit page. Using this public key, users will be able to push to repositories they maintain using Git over SSH. For authentication, a custom script doing a lookup in the AUR user database will be used as AuthorizedKeysCommand. If a matching key is found, that script prints the corresponding authorized_keys line with a forced command that invokes a wrapper around the Git shell. That wrapper in turn does authorization by checking maintainership and then calls git-shell(1).
I am very much for this (and [1] is one of my mails from more than a year ago). I would recommend working with gitolite so that we can keep as much of the code we need maintained upstream, as gitolite has already proven its security and its efficiency. Also, it may allow other users to use gitolite with database backends like the AUR without using the AUR itself. I actually think that the space estimates that I had before (~5M per repo) would be way high with a shared object cache, especially because that could be packed very efficiently. I would recommend having a hard limit on the size of these repos, however, and something around 5M should be good to help prevent people from uploading binary bits, as those would probably explode repos and things. From what I have seen, most of the heavy lifting that we would need to do has already been done by things like cgit and gitolite, so really it would only be a matter of getting the authentication in place for the mass of users the AUR has.
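As a rough illustration of how a size limit could be enforced in a hook, the sketch below scans the output of `git ls-tree -r -l <commit>` for oversized blobs. The 5 MiB per-file threshold is only an assumption based on the figure mentioned above (which was meant per repository), and the function name is made up.

```python
MAX_FILE_SIZE = 5 * 1024 * 1024  # assumed per-file limit, in bytes


def oversized_files(ls_tree_output, limit=MAX_FILE_SIZE):
    """Return [(path, size)] for every blob larger than limit.

    Expects the text printed by `git ls-tree -r -l <commit>`, where each
    line is "<mode> <type> <object> <size>\t<path>".
    """
    too_big = []
    for line in ls_tree_output.splitlines():
        if not line:
            continue
        meta, _, path = line.partition("\t")
        mode, objtype, obj, size = meta.split()
        # Non-blob entries (e.g. submodules) report "-" as their size.
        if objtype == "blob" and int(size) > limit:
            too_big.append((path, int(size)))
    return too_big


SAMPLE = (
    "100644 blob 1111111111111111111111111111111111111111     512\tPKGBUILD\n"
    "100644 blob 2222222222222222222222222222222222222222 9000000\tfoo.bin\n"
)
```

A hook could reject the push whenever the returned list is non-empty. Summing the sizes instead would give an approximate per-tree limit.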
Using Git hooks, the .AURINFO metadata of a package is parsed when pushing and the AUR package database is updated accordingly. Hooks will also take care of checking the tree objects for huge files etc. The receive.denyNonFastForwards configuration option will be enabled to prevent users from rewriting the history.
This was the main concern I had when writing [1], as we would have to parse PKGBUILDs in some fashion. With the .AURINFO files now in use, this is much less of a concern.
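To show why parsing metadata is much simpler than parsing a PKGBUILD, here is a small sketch of a parser for "key = value" metadata. The layout it assumes (a pkgbase section followed by one section per pkgname, with repeated keys such as depends) is an assumption about the .AURINFO format, and the sample data is invented.

```python
def parse_aurinfo(text):
    """Parse 'key = value' lines into (pkgbase_info, {pkgname: info}).

    Every value is collected into a list because keys may repeat.
    """
    base = {}
    packages = {}
    current = base
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("malformed line: %r" % raw)
        key, value = key.strip(), value.strip()
        if key == "pkgname":
            # A pkgname line opens a new per-package section.
            current = packages.setdefault(value, {})
            continue
        current.setdefault(key, []).append(value)
    return base, packages


SAMPLE = """\
pkgbase = foo
\tpkgver = 1.0
\tpkgrel = 1
\tdepends = glibc
\tdepends = zlib

pkgname = foo
\tpkgdesc = Example package
"""
```

No shell code is executed at any point, which is the main advantage over sourcing a PKGBUILD on the server.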
In order to submit new packages, you will be able to generate empty Git repositories via the AUR web interface. During the transition period, all existing source tarballs will be converted to bare repositories with one initial commit whose tree equals the tarball contents.
Instead of the source tarball download link and the PKGBUILD preview, there will be a public clone URL on the package details page. In a second step, cgit will be configured to provide a web-based interface to all the repositories. Then, links to tarball snapshots and to the PKGBUILD preview will be added to the details page as a replacement for the "Download tarball" and "View PKGBUILD" links (all this is supposed to happen before the first release).
As soon as all this is set up, we can add support for multiple maintainers and related features.
This should be able to be done very similarly to how the official packages are set up, however instead of branches there would be completely separate repositories. One thing that I think should not be done at all is having a scheme like
git://.../username/package
This would force users to change their upstream URLs based on maintainership, which changes quite a lot on the AUR, but this is something that was probably already thought of. I'd suspect that we would still have to do something like
git://.../pa/package
but there may be a way around that, even.
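For illustration, the two-letter prefix layout (pa/package) could be computed as in the sketch below. The root directory and the ".git" suffix are hypothetical and not taken from any actual server setup.

```python
import posixpath


def repo_path(pkgbase, root="/srv/git"):
    """Map a package base name to a prefix-sharded repository path."""
    prefix = pkgbase[:2]  # a one-character name is its own prefix
    return posixpath.join(root, prefix, pkgbase + ".git")
```

Because the path depends only on the package name, it stays stable when the maintainer changes.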
If there are any questions or suggestions regarding this setup, please feel free to ask/reply.
I think this is an amazing idea, and I have some good plans for things like pre-commit hooks for local repos that will auto-generate the .AURINFO file for users without them even having to worry about that. If there's any way I can help, I'd be more than willing!
[1] https://mailman.archlinux.org/pipermail/aur-dev/2013-March/002411.html
[2] https://mailman.archlinux.org/pipermail/aur-dev/2014-January/002592.html
Thanks,
--
William Giokas | KaiSforza | http://kaictl.net/
GnuPG Key: 0x73CD09CF Fingerprint: F73F 50EF BBE2 9846 8306 E6B8 6902 06D8 73CD 09CF
On Wed, 04 Jun 2014 at 22:16:40, William Giokas wrote:
[...] I am very much for this (and [1] is one of my mails from more than a year ago). I would recommend working with gitolite so that we can keep as much of the code we need maintained upstream, as gitolite has already proven its security and its efficiency. Also, it may allow other users to use gitolite with database backends like the AUR without using the AUR itself.
That was our initial plan but after some investigation it turned out that using gitolite might be much more complicated and inefficient than writing a simple authorization script from scratch: we would need to keep the gitolite and the AUR database in sync etc. See [1] for an initial implementation of a ~50-line Python script that does all the work.
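The following is not the script referenced in [1], only a hedged sketch of what the authorization step in such a wrapper might look like. The `maintainers` mapping stands in for a query against the AUR database, and allowing read access to every authenticated user is an assumption.

```python
import shlex

READ_COMMANDS = {"git-upload-pack", "git-upload-archive"}
WRITE_COMMANDS = {"git-receive-pack"}


def check_access(user, ssh_command, maintainers):
    """Return (action, pkgbase) if the command is allowed, else raise.

    ssh_command is the client's request (SSH_ORIGINAL_COMMAND), e.g.
    "git-receive-pack '/foo.git'". maintainers is a hypothetical mapping
    from package base to maintainer user name.
    """
    try:
        argv = shlex.split(ssh_command)
    except ValueError:
        raise PermissionError("malformed command")
    if len(argv) != 2:
        raise PermissionError("unexpected command")
    action, path = argv
    pkgbase = path.strip("/")
    if pkgbase.endswith(".git"):
        pkgbase = pkgbase[:-4]
    # Refuse anything that could escape the repository directory.
    if not pkgbase or "/" in pkgbase or pkgbase.startswith("."):
        raise PermissionError("invalid repository path")
    if action in READ_COMMANDS:
        return action, pkgbase
    if action in WRITE_COMMANDS and maintainers.get(pkgbase) == user:
        return action, pkgbase
    raise PermissionError("permission denied")
```

After a successful check, the wrapper would hand the original command over to git-shell(1), for example via `git-shell -c`.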
[...] This would force users to change their upstream URLs based on maintainership, which changes quite a lot on the AUR, but this is something that was probably already thought of. I'd suspect that we would still have to do something like
git://.../pa/package
but there may be a way around that, even.
I don't think we need to do this since the server has been upgraded to use ext4 iirc.
If there are any questions or suggestions regarding this setup, please feel free to ask/reply.
I think this is an amazing idea, and I have some good plans for things like pre-commit hooks for local repos that will auto-generate the .AURINFO file for users without them even having to worry about that.
Great idea!
[1] http://git.cryptocrack.de/aur.git/log/?h=git-integration
On Wed, Jun 04, 2014 at 10:36:27PM +0200, Lukas Fleischer wrote:
On Wed, 04 Jun 2014 at 22:16:40, William Giokas wrote:
[...] I am very much for this (and [1] is one of my mails from more than a year ago). I would recommend working with gitolite so that we can keep as much of the code we need maintained upstream, as gitolite has already proven its security and its efficiency. Also, it may allow other users to use gitolite with database backends like the AUR without using the AUR itself.
That was our initial plan but after some investigation it turned out that using gitolite might be much more complicated and inefficient than writing a simple authorization script from scratch: we would need to keep the gitolite and the AUR database in sync etc.
See [1] for an initial implementation of a ~50-line Python script that does all the work.
Oh, nice. I'll have to look into this more. It looks like a lot of the work has already been started, too!
Thanks,
--
William Giokas | KaiSforza | http://kaictl.net/
GnuPG Key: 0x73CD09CF Fingerprint: F73F 50EF BBE2 9846 8306 E6B8 6902 06D8 73CD 09CF
2014-06-03 1:08 GMT+08:00 Lukas Fleischer <archlinux@cryptocrack.de>:
Hi list,
The next major AUR release will come with Git integration. In this email, I am going to summarize previous ideas and make things a bit more concrete (sketch implementation details etc.)
I am really glad to see that this feature is about to take shape.
If you are new to this topic, please read [1] and [2] first.
Our plan is to create one Git repository per package base. Each of the Git repositories contains a PKGBUILD and zero or more additionally required files, so you will no longer need to build a source tarball. There will be a pkgbuild-introspection utility to generate and write .AURINFO metadata without automatically putting that file inside the tarball. Florian had the idea to use a shared Git object database for space efficiency.
So this means all maintainers will have to use Git to submit packages? Any thoughts on keeping the old package submission form and performing the commit on the server? That would allow normal people to contribute packages.
When creating a new account, users will be able to upload an SSH public key. This key will be stored in the AUR database and can also be changed using the profile edit page. Using this public key, users will be able to push to repositories they maintain using Git over SSH. For authentication, a custom script doing a lookup in the AUR user database will be used as AuthorizedKeysCommand. If a matching key is found, that script prints the corresponding authorized_keys line with a forced command that invokes a wrapper around the Git shell. That wrapper in turn does authorization by checking maintainership and then calls git-shell(1).
Using Git hooks, the .AURINFO metadata of a package is parsed when pushing and the AUR package database is updated accordingly. Hooks will also take care of checking the tree objects for huge files etc. The receive.denyNonFastForwards configuration option will be enabled to prevent users from rewriting the history.
In order to submit new packages, you will be able to generate empty Git repositories via the AUR web interface. During the transition period, all existing source tarballs will be converted to bare repositories with one initial commit whose tree equals the tarball contents.
Instead of the source tarball download link and the PKGBUILD preview, there will be a public clone URL on the package details page. In a second step, cgit will be configured to provide a web-based interface to all the repositories. Then, links to tarball snapshots and to the PKGBUILD preview will be added to the details page as a replacement for the "Download tarball" and "View PKGBUILD" links (all this is supposed to happen before the first release).
As soon as all this is set up, we can add support for multiple maintainers and related features.
Any thoughts on a package history policy? Will maintainers be allowed to rewrite or alter the Git history of a package? As far as I know, Git does not forbid this kind of thing, so what if an ill-intentioned person replaces the history of a package with their own crafted history and completely wipes out the change log of the original package? What is our policy on this, or what measures do we have? From my point of view, the history of a package is an essential part of the AUR, from which we can see how the package has evolved and been perfected.
If there are any questions or suggestions regarding this setup, please feel free to ask/reply.
Regards, Lukas
[1] https://mailman.archlinux.org/pipermail/aur-dev/2013-March/002411.html [2] https://mailman.archlinux.org/pipermail/aur-dev/2014-January/002592.html
On Tuesday, June 24, 2014 03:20:08 PM 郑文辉 wrote:
Using Git hooks, the .AURINFO metadata of a package is parsed when pushing and the AUR package database is updated accordingly. Hooks will also take care of checking the tree objects for huge files etc. The receive.denyNonFastForwards configuration option will be enabled to prevent users from rewriting the history.
Any thoughts on a package history policy? Will maintainers be allowed to rewrite or alter the Git history of a package? As far as I know, Git does not forbid this kind of thing, so what if an ill-intentioned person replaces the history of a package with their own crafted history and completely wipes out the change log of the original package? What is our policy on this, or what measures do we have? From my point of view, the history of a package is an essential part of the AUR, from which we can see how the package has evolved and been perfected.
Just a quick note since the history rewrite question was answered already in a section you quoted: "The receive.denyNonFastForwards configuration option will be enabled to prevent users from rewriting the history." This was also brought up in the discussions leading to this [1] [2] so if you haven't read through them yet, I recommend it. Most edge-cases have already been considered, so the answers to your other questions may be there. -Kevin Ott [1] https://mailman.archlinux.org/pipermail/aur-dev/2013-March/002411.html [2] https://mailman.archlinux.org/pipermail/aur-dev/2014-January/002592.html
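For readers unfamiliar with the term: a push is a fast-forward when the old tip of the branch is an ancestor of the new tip, so no existing commit is dropped. Git performs this check itself when receive.denyNonFastForwards is set. The sketch below only illustrates the idea on a hand-written parent graph with invented commit names.

```python
def is_fast_forward(parents, old, new):
    """Return True if commit old is reachable from commit new.

    parents maps each commit id to the list of its parent commit ids.
    """
    stack = [new]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == old:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


# History a <- b <- c, plus a rewritten commit x that replaces b and c.
HISTORY = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
```

Updating the branch from b to c keeps all history and is accepted, while moving it from c to x would discard b and c and is therefore refused.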
On Tue, 24 Jun 2014 at 09:20:08, 郑文辉 (Techlive Zheng) wrote:
[...] So this means all maintainers will have to use Git to submit packages? Any thoughts on keeping the old package submission form and performing the commit on the server? That would allow normal people to contribute packages. [...]
I don't think the old package submission form will be supported any longer. All that package maintainers need to know is how to use git-add(1), git-commit(1) and git-push(1). Since package maintainers should have some very basic knowledge of VCS anyway, I don't think we expect too much here. Also, note that dropping all the tarball handling code allows us to clean up the AUR a lot.
Also, note that dropping all the tarball handling code allows us to clean up the AUR a lot.
+1
participants (7)
- Colin Woodbury
- Jake Champlin
- Kevin Ott
- Lukas Fleischer
- Rémy Marquis
- William Giokas
- 郑文辉 (Techlive Zheng)