[aur-dev] Git repos for AUR packages
Hi,
I think the idea of integrating Git with the AUR [1] is a very good one and should be a milestone for the 3.0.0 release. The idea is to create a Git repository per package.
Pros:
* Full history of each AUR package, even if the maintainer changes.
* Lays the foundations for supporting multiple maintainers per package.
* Makes it easier to contribute patches (see git-format-patch(1), branches and pull requests).
* cgit might do quite a lot of the work required on the front-end side. PKGBUILD previews, history view, tarball generation, Git clone support, ...
* Updating packages will be easier (`git pull` followed by `makepkg -i` instead of doing all the work from the web browser or via an AUR helper).
Cons:
* Needs more space on the AUR server. Currently, an AUR package uses ~17KiB on the official Arch Linux AUR server. This will probably increase by a factor of 10. Shouldn't be too problematic unless we get a lot of new packages or a lot of updates.
* More load on the AUR server. Especially if we no longer store tarballs but use cgit to generate them on the fly (needs to be discussed).
Migration should be easy since we can use a small shell script to convert all packages into Git repositories.
The first idea is to slightly change the package submission process to extract the whole tarball, parse the PKGBUILD and do a Git commit with the tarball content. There will be an additional text field to enter a (part of the) commit message that is used. As mentioned above, all package repositories will be accessible via cgit. The PKGBUILD preview (and maybe also the tarball download) will be replaced with a simple link to cgit.
Later, we should think about how to support git-push(1). The main issues are:
* Authentication: Virtual accounts, somehow connected to the AUR DB?
* Integration of the PKGBUILD/.AURINFO parser: Git hook?
* DoS protection: Quotas, ...
Any comments and suggestions are welcome!
Regards, Lukas
[1] https://bugs.archlinux.org/task/23010
2014/1/7 Lukas Fleischer <archlinux@cryptocrack.de>:
[...]
Yes, integrating Git into the AUR is good, really good, but if we are going to do it, we need to do it the right way. Here are some thoughts:
1. The official Git URL for an AUR package looks like this: http://aur.archlinux.org/packages/xyz.git or http://git.aur.archlinux.org/packages/xyz.git.
2. Each AUR user should be able to have their own fork of a particular AUR package with a URL like this: http://aur.archlinux.org/account/john/packages/xyz.git.
3. The official Git HEAD for an AUR package is actually just a Git reference to the HEAD of the maintainer's fork; we just need to change this reference if the maintainer changes. Each fork of a package should have its own votes, which could be used to determine the maintainer of the official AUR package. And on the official AUR package's page, all the forks and their votes should be listed for the user to choose from.
This way, the job of maintaining the AUR will be more community driven, and the burden on TUs will be much lower.
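A minimal sketch of the vote-based selection described in point 3, written in Python. The `Fork` record, its field names, and the tie-breaking rule are assumptions made for illustration; none of this reflects existing AUR code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Fork:
    """Hypothetical record for one user's fork of a package."""
    owner: str
    votes: int
    head: str  # commit ID the fork's HEAD points to


def official_fork(forks: Iterable[Fork]) -> Optional[Fork]:
    """Return the fork with the most votes, or None if there are no forks.

    Ties are broken by owner name so the result is deterministic; that
    rule is an assumption, the proposal does not specify one.
    """
    ranked = sorted(forks, key=lambda f: (-f.votes, f.owner))
    return ranked[0] if ranked else None


forks = [
    Fork("john", 12, "a1b2c3"),
    Fork("alice", 30, "d4e5f6"),
    Fork("bob", 30, "0718aa"),
]
winner = official_fork(forks)
# The "official" HEAD would simply follow the winning fork's HEAD.
official_head = winner.head if winner else None
```

Changing the maintainer then amounts to pointing the official reference at a different fork, with no history being copied.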
On Tue, 07 Jan 2014 at 13:50:42, 郑文辉 (Techlive Zheng) wrote:
2014/1/7 Lukas Fleischer <archlinux@cryptocrack.de>: [...] Here are some thoughts:
1. The official Git URL for an AUR package looks like this: http://aur.archlinux.org/packages/xyz.git or http://git.aur.archlinux.org/packages/xyz.git.
2. Each AUR user should be able to have their own fork of a particular AUR package with a URL like this: http://aur.archlinux.org/account/john/packages/xyz.git.
3. The official Git HEAD for an AUR package is actually just a Git reference to the HEAD of the maintainer's fork; we just need to change this reference if the maintainer changes. Each fork of a package should have its own votes, which could be used to determine the maintainer of the official AUR package. And on the official AUR package's page, all the forks and their votes should be listed for the user to choose from.
This way, the job of maintaining the AUR will be more community driven, and the burden on TUs will be much lower.
Maybe, we should really switch to something like Gitorious or GitLab with a few additions that allow for marking a repository as "official", a PKGBUILD/.AURINFO parser and a customized search function.
On 07.01.2014 18:57, Lukas Fleischer wrote:
On Tue, 07 Jan 2014 at 13:50:42, 郑文辉 (Techlive Zheng) wrote:
2014/1/7 Lukas Fleischer <archlinux@cryptocrack.de>: [...] Here are some thoughts:
1. The official Git URL for an AUR package looks like this: http://aur.archlinux.org/packages/xyz.git or http://git.aur.archlinux.org/packages/xyz.git.
2. Each AUR user should be able to have their own fork of a particular AUR package with a URL like this: http://aur.archlinux.org/account/john/packages/xyz.git.
3. The official Git HEAD for an AUR package is actually just a Git reference to the HEAD of the maintainer's fork; we just need to change this reference if the maintainer changes. Each fork of a package should have its own votes, which could be used to determine the maintainer of the official AUR package. And on the official AUR package's page, all the forks and their votes should be listed for the user to choose from.
This way, the job of maintaining the AUR will be more community driven, and the burden on TUs will be much lower.
That's a nice concept, but it sounds to me like it's one step too far for where AUR is at now.
Maybe, we should really switch to something like Gitorious or GitLab with a few additions that allow for marking a repository as "official", a PKGBUILD/.AURINFO parser and a customized search function.
You should give deploying Gitorious a try and you might revise that idea. In my opinion it requires way too much maintenance.
On 2014-01-07 17:35, Daniel Albers wrote:
On 07.01.2014 18:57, Lukas Fleischer wrote:
On Tue, 07 Jan 2014 at 13:50:42, 郑文辉 (Techlive Zheng) wrote:
2014/1/7 Lukas Fleischer <archlinux@cryptocrack.de>:
2. Each AUR user should be able to have their own fork of a particular AUR package with a URL like this: http://aur.archlinux.org/account/john/packages/xyz.git.
I'm actually not sure this is such a great idea. It seems like that may lead to a ton of forks of a package for stylistic choice alone. I'm not sure what could serve as a good alternative.
3. The official Git HEAD for an AUR package is actually just a Git reference to the HEAD of the maintainer's fork; we just need to change this reference if the maintainer changes. Each fork of a package should have its own votes, which could be used to determine the maintainer of the official AUR package. And on the official AUR package's page, all the forks and their votes should be listed for the user to choose from.
This is a fine idea, but the TUs should still have full control when they need it (they should be able to delete a package or its repo, change the maintainer, etc.).
This way, the job of maintaining the AUR will be more community driven, and the burden on TUs will be much lower.
That sounds fine in theory, but the TUs are tasked with maintaining the ALUR, so this change (though it will make things more community-oriented) shouldn't hinder their ability to do so.
Overall, I'd love to have the ALUR moved to be much more integrated with Git (I think this recommendation may have actually been drawn from one I made in the BBS thread) since it would make so many things so much simpler.
--
All the best,
Sam Stuewe (HalosGhost)
On 14-01-07, Lukas Fleischer wrote:
On Tue, 07 Jan 2014 at 13:50:42, 郑文辉 (Techlive Zheng) wrote:
2014/1/7 Lukas Fleischer <archlinux@cryptocrack.de>: [...] Here are some thoughts:
1. The official Git URL for an AUR package looks like this: http://aur.archlinux.org/packages/xyz.git or http://git.aur.archlinux.org/packages/xyz.git.
2. Each AUR user should be able to have their own fork of a particular AUR package with a URL like this: http://aur.archlinux.org/account/john/packages/xyz.git.
3. The official Git HEAD for an AUR package is actually just a Git reference to the HEAD of the maintainer's fork; we just need to change this reference if the maintainer changes. Each fork of a package should have its own votes, which could be used to determine the maintainer of the official AUR package. And on the official AUR package's page, all the forks and their votes should be listed for the user to choose from.
This way, the job of maintaining the AUR will be more community driven, and the burden on TUs will be much lower.
Maybe, we should really switch to something like Gitorious or GitLab with a few additions that allow for marking a repository as "official", a PKGBUILD/.AURINFO parser and a customized search function.
We need something similar to Gitorious or GitLab, but with much simpler functionality; the complex pull/merge workflow is overkill for the AUR. Actually, what I suggest, per-user PKGBUILD submission, could be done without Git. I will elaborate on it in another email.
On 14-01-07, Lukas Fleischer wrote:
[...]
Later, we should think about how to support git-push(1). The main issues are:
* Authentication: Virtual accounts, somehow connected to the AUR DB?
* Integration of the PKGBUILD/.AURINFO parser: Git hook?
* DoS protection: Quotas, ...
I don't think we should support `git-push` at all; the reasons are simple:
* Git allows overwriting history with a force push (`git push -f`). On a community PKGBUILD publishing platform, the Git history of a PKGBUILD should not be open to tampering, whether accidental or intentional; it should reflect how the PKGBUILD evolved from the start, not a version someone carefully crafted.
* Rewritten history will cause conflicts on `git pull`, which is not something we want to deal with every day.
Instead, we should stick with `src.tar.gz` tarball submission and make the Git commit on the server.
At the very least, push access should not be granted to normal users, only to TUs.
Any comments and suggestions are welcome!
Regards, Lukas
On 14-01-09, Techlive Zheng wrote:
On 14-01-07, Lukas Fleischer wrote:
[...]
I don't think we should support `git-push` at all; the reasons are simple:
* Git allows overwriting history with a force push (`git push -f`). On a community PKGBUILD publishing platform, the Git history of a PKGBUILD should not be open to tampering, whether accidental or intentional; it should reflect how the PKGBUILD evolved from the start, not a version someone carefully crafted.
* Rewritten history will cause conflicts on `git pull`, which is not something we want to deal with every day.
Instead, we should stick with `src.tar.gz` tarball submission and make the Git commit on the server.
At the very least, push access should not be granted to normal users, only to TUs.
Also, if we allow normal users to push directly with Git, sanity checks become harder. One can push anything, not just packaging files: binaries, compressed source/build tarballs, even files unrelated to Arch packaging. Such files can exist not only in the Git HEAD but can also be intentionally hidden in the history, which makes space quotas hard to enforce. We had better accept only the `src.tar.gz` tarball and control the commit process on the server ourselves, so that we can run the necessary sanity checks to ensure the files to be committed are really what they claim to be.
Any comments and suggestions are welcome!
Regards, Lukas
On Thu, 09 Jan 2014 at 12:10:27, Techlive Zheng wrote:
[...]
I don't think we should support `git-push` at all; the reasons are simple:
* Git allows overwriting history with a force push (`git push -f`). On a community PKGBUILD publishing platform, the Git history of a PKGBUILD should not be open to tampering, whether accidental or intentional; it should reflect how the PKGBUILD evolved from the start, not a version someone carefully crafted.
* Rewritten history will cause conflicts on `git pull`, which is not something we want to deal with every day.
You can reject non-fast-forwards by enabling receive.denyNonFastforwards on the server.
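With that setting, Git itself performs the check, so no custom code is needed. Purely to illustrate the rule being enforced, here is a toy Python model of a fast-forward test. The commit graph is a hypothetical dictionary mapping commit names to their parents, not real Git data.

```python
from typing import Dict, List


def is_ancestor(parents: Dict[str, List[str]], ancestor: str, commit: str) -> bool:
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def is_fast_forward(parents: Dict[str, List[str]], old: str, new: str) -> bool:
    """A ref update old -> new is a fast-forward if old is an ancestor of new."""
    return is_ancestor(parents, old, new)


# Toy history: A <- B <- C, plus B2, which rewrites B on top of A.
history = {"A": [], "B": ["A"], "C": ["B"], "B2": ["A"]}
```

Moving the branch from `B` to `C` only adds commits and passes the test; moving it from `C` to `B2` would discard `B` and `C`, which is exactly the kind of push the server would refuse.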
Instead, we should stick with `src.tar.gz` tarball submission and make the Git commit on the server.
At the very least, push access should not be granted to normal users, only to TUs.
Why? I agree that keeping the tarball submission form and doing Git commits on the server is a nice first step but providing Git push access is much more convenient...
Also, if we allow normal users to push directly with Git, sanity checks become harder. One can push anything, not just packaging files: binaries, compressed source/build tarballs, even files unrelated to Arch packaging. Such files can exist not only in the Git HEAD but can also be intentionally hidden in the history, which makes space quotas hard to enforce.
I guess this could be done in a pre-receive hook. We also need to perform these checks when committing tarball contents -- I don't think it will be much harder to do it inside a Git hook.
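As a rough illustration of what such a check might look like, here is a Python sketch of the validation logic only. The size limit, the blocked suffixes, and the function name are hypothetical; a real hook would have to gather the (path, size) pairs for every pushed commit, for example from `git ls-tree -r -l <commit>`, and exit non-zero to reject the push.

```python
from typing import Iterable, List, Tuple

MAX_FILE_BYTES = 100 * 1024  # hypothetical per-file limit
BLOCKED_SUFFIXES = (".tar.gz", ".tar.xz", ".zip")  # hypothetical policy


def check_tree(files: Iterable[Tuple[str, int]]) -> List[str]:
    """Return a list of problems; an empty list means the tree passes.

    `files` holds (path, size_in_bytes) pairs for one pushed commit.
    """
    files = list(files)
    problems = []
    if not any(path == "PKGBUILD" for path, _ in files):
        problems.append("missing top-level PKGBUILD")
    for path, size in files:
        if size > MAX_FILE_BYTES:
            problems.append(f"{path}: {size} bytes exceeds per-file limit")
        if path.endswith(BLOCKED_SUFFIXES):
            problems.append(f"{path}: archive files are not allowed")
    return problems
```

Running the same function on every new commit in a push, not just the tip, would address the concern about files hidden in the history. The tarball submission path could call it too, so both entry points share one set of rules.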
We had better accept only the `src.tar.gz` tarball and control the commit process on the server ourselves, so that we can run the necessary sanity checks to ensure the files to be committed are really what they claim to be.
Any comments and suggestions are welcome!
Regards, Lukas
All,
I think that everyone has been looking at Gitorious and GitLab, the main systems with front-ends for their users. We already have a front-end for our users; all we need is the back-end. I have given it some thought previously (if you want, look up the mail), and one of the best solutions seems to be gitolite [1]. We would need a way to manage the permission config files as AUR users sign up and upload files; however, I think that this is doable.
On Thu, Jan 09, 2014 at 07:10:27PM +0800, Techlive Zheng wrote:
On 14-01-09, Techlive Zheng wrote:
I don't think we should support `git-push` at all; the reasons are simple:
* Git allows overwriting history with a force push (`git push -f`). On a community PKGBUILD publishing platform, the Git history of a PKGBUILD should not be open to tampering, whether accidental or intentional; it should reflect how the PKGBUILD evolved from the start, not a version someone carefully crafted.
* Rewritten history will cause conflicts on `git pull`, which is not something we want to deal with every day.
Instead, we should stick with `src.tar.gz` tarball submission and make the Git commit on the server.
At the very least, push access should not be granted to normal users, only to TUs.
All of this can be disabled in any good Git management platform; I know gitolite supports it. We could use the aur-dev mailing list to request force-pushes if need be.
Also, if we allow normal users to push directly with Git, sanity checks become harder. One can push anything, not just packaging files: binaries, compressed source/build tarballs, even files unrelated to Arch packaging. Such files can exist not only in the Git HEAD but can also be intentionally hidden in the history, which makes space quotas hard to enforce.
Not really. We can use git hooks to check the size of the files that are pushed, and it won't make it any harder to control space quotas. If the repository goes over a couple of megabytes, then they can't push. End of story. Another thing that people can do is use submodules[2] in their AUR repositories if they have some kind of binary file they can't get rid of. This would keep the size on the AUR down while also allowing them to have a 4M image in their AUR package.
We had better accept only the `src.tar.gz` tarball and control the commit process on the server ourselves, so that we can run the necessary sanity checks to ensure the files to be committed are really what they claim to be.
While this is an option, I don't think that the arguments above make it the better option of the two. In fact, something that we may want to look into is keeping the legacy .src.tar.gz files around while also giving users the ability to `git push` to their repositories.
[1]: http://gitolite.com/gitolite/index.html
[2]: http://git-scm.com/book/en/Git-Tools-Submodules
Thanks,
--
William Giokas | KaiSforza | http://kaictl.net/
GnuPG Key: 0x73CD09CF
Fingerprint: F73F 50EF BBE2 9846 8306 E6B8 6902 06D8 73CD 09CF
On Tue, Jan 07, 2014 at 11:03:08AM +0100, Lukas Fleischer wrote:
Hi,
I think the idea of integrating Git with the AUR [1] is a very good one and should be a milestone for the 3.0.0 release. The idea is to create a Git repository per package.
Pros:
* Full history of each AUR package, even if the maintainer changes.
* Lays the foundations for supporting multiple maintainers per package.
* Makes it easier to contribute patches (see git-format-patch(1), branches and pull requests).
* cgit might do quite a lot of the work required on the front-end side. PKGBUILD previews, history view, tarball generation, Git clone support, ...
Just looking at the AUR and the official package pages, it looks like this would bring it much closer to the latter, possibly simplifying development in the future.
* Updating packages will be easier (`git pull` followed by `makepkg -i` instead of doing all the work from the web browser or via an AUR helper).
And people can use submodules to manage large numbers of AUR packages!
Cons:
* Needs more space on the AUR server. Currently, an AUR package uses ~17KiB on the official Arch Linux AUR server. This will probably increase by a factor of 10. Shouldn't be too problematic unless we get a lot of new packages or a lot of updates.
You could still have a tiny quota for package updates, and git will compress its contents quite well, especially considering that 99% of what will be uploaded to the AUR will be plain text files.
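To put the quoted estimate into numbers, here is a small Python calculation. The per-package size (~17 KiB) and the factor of 10 come from the mail above; the package count is a made-up figure used only to show the arithmetic, since the thread does not state one.

```python
KIB = 1024


def estimated_total_bytes(package_count: int,
                          per_package_kib: float = 17.0,
                          growth_factor: float = 10.0) -> float:
    """Estimate total storage if every package grows by `growth_factor`."""
    return package_count * per_package_kib * growth_factor * KIB


# 50_000 is a hypothetical package count, not a figure from this thread.
total = estimated_total_bytes(50_000)
total_gib = total / (1024 ** 3)
```

Under these assumptions each package would take about 170 KiB, and the hypothetical 50,000 packages would need roughly 8 GiB before any savings from Git's compression.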
* More load on the AUR server. Especially if we no longer store tarballs but use cgit to generate them on the fly (needs to be discussed).
With caching and the efficiency of cgit, I think that this will be better than expected.
Migration should be easy since we can use a small shell script to convert all packages into Git repositories.
The first idea is to slightly change the package submission process to extract the whole tarball, parse the PKGBUILD and do a Git commit with the tarball content. There will be an additional text field to enter a (part of the) commit message that is used. As mentioned above, all package repositories will be accessible via cgit. The PKGBUILD preview (and maybe also the tarball download) will be replaced with a simple link to cgit.
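The submission flow described in the quoted paragraph could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the AUR's implementation: the function names are hypothetical, PKGBUILD parsing and stripping of the tarball's top-level directory are left out, and the target directory is assumed to be an existing Git work tree with a committer identity configured.

```python
import io
import subprocess
import tarfile
from pathlib import Path, PurePosixPath
from typing import List


def safe_member_names(tar: tarfile.TarFile) -> List[str]:
    """Return member names, refusing absolute paths, '..' and special entries."""
    names = []
    for member in tar.getmembers():
        path = PurePosixPath(member.name)
        if path.is_absolute() or ".." in path.parts:
            raise ValueError(f"unsafe path in tarball: {member.name}")
        if not (member.isfile() or member.isdir()):
            raise ValueError(f"unsupported entry type: {member.name}")
        names.append(member.name)
    return names


def commit_tarball(tarball: bytes, repo: Path, message: str) -> None:
    """Extract a gzip-compressed source tarball into `repo` and commit it."""
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:gz") as tar:
        safe_member_names(tar)  # raises before anything is written
        tar.extractall(repo)
    # `git commit` exits non-zero (and check=True raises) if nothing changed.
    subprocess.run(["git", "-C", str(repo), "add", "--all"], check=True)
    subprocess.run(["git", "-C", str(repo), "commit", "-m", message], check=True)
```

The `message` argument corresponds to the extra text field for the commit message mentioned above.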
Later, we should think about how to support git-push(1). The main issues are:
* Authentication: Virtual accounts, somehow connected to the AUR DB?
SSH keys? Users should be uploading packages from a system that can use SSH keys, so this shouldn't be an inconvenience to anyone.
* Integration of the PKGBUILD/.AURINFO parser: Git hook?
* DoS protection: Quotas, ...
I'd say two quotas: one for the update size (the files being pushed that will become the new HEAD) and one for the total repo size (somewhat larger, maybe 1M). As I said in another mail, people can use submodules for non-PKGBUILD/.install files as well.
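The two limits suggested above could be checked by a small function like this Python sketch. The 1 MiB repository limit follows the "maybe 1M" in the mail; the per-update limit and all names are hypothetical, and measuring the two sizes is left to the caller.

```python
from typing import Tuple

UPDATE_LIMIT_BYTES = 256 * 1024   # hypothetical; no figure is given in the mail
REPO_LIMIT_BYTES = 1024 * 1024    # "maybe 1M" for the whole repository


def check_push(update_bytes: int, repo_bytes_after: int) -> Tuple[bool, str]:
    """Decide whether a push fits both quotas; return (allowed, reason)."""
    if update_bytes > UPDATE_LIMIT_BYTES:
        return False, "update exceeds per-push quota"
    if repo_bytes_after > REPO_LIMIT_BYTES:
        return False, "repository would exceed total quota"
    return True, "ok"
```

Keeping the two checks separate means a single oversized update is rejected even when the repository as a whole still has room.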
Thanks,
--
William Giokas | KaiSforza | http://kaictl.net/
GnuPG Key: 0x73CD09CF
Fingerprint: F73F 50EF BBE2 9846 8306 E6B8 6902 06D8 73CD 09CF