[arch-dev-public] Git for the repos
So it came up in IRC again and I'll try to sum up the discussion:
SVN checkouts tend to break, some people only use it for our repos and not anywhere else, it's slow.
We agreed on one git repo per package because you can't do partial checkouts in git and you hardly need the history of all packages anyway.
To keep track of released packages, dbscripts maintains its own (git?) meta database which contains only the package version and pacman repo of the package. The version corresponds to a tag in the package's git repo.
We can't use tags like "testing-i686" because you can't reuse tags in git.
I'd like to hear some comments about this.
-- Florian Pritz
On 25 August 2011 04:53, Florian Pritz <bluewind@xinu.at> wrote:
So it came up in IRC again and I'll try to sum up the discussion:
What is this about, actually? "Moving to Git"?
SVN checkouts tend to break, some people only use it for our repos and not anywhere else, it's slow.
I've never had checkouts "break", but yes most Subversion transactions are slow. -- GPG/PGP ID: 8AADBB10
On 24.08.2011 23:09, Ray Rashif wrote:
On 25 August 2011 04:53, Florian Pritz <bluewind@xinu.at> wrote:
So it came up in IRC again and I'll try to sum up the discussion:
What is this about, actually? "Moving to Git"?
Yeah, it's about moving our repos from svn to git. -- Florian Pritz
On 24.08.2011 22:53, Florian Pritz wrote:
So it came up in IRC again and I'll try to sum up the discussion:
SVN checkouts tend to break, some people only use it for our repos and not anywhere else, it's slow.
We agreed on one git repo per package because you can't do partial checkouts in git and you hardly need the history of all packages anyway.
Let's get some details on the layout we discussed.
1) We have two 'package repository' folders, one for the developers, one for the TUs. In there, we have one repository per pkgbase, which contains the PKGBUILD and other files. You can use all the magic and awesomeness of git in there, have several branches and whatnot. These repositories get a new tag $pkgname-$pkgrel for each release. These git repositories will not be affiliated with the package repository/db, but only have information about specific versions of packages. devtools will be responsible for creating and cloning them when needed.
2) We have 'repository repositories' (nice name, hah), one for the devs, one for the TUs (or maybe one for both). In there, we have a folder for each package db/repository. These folders will contain a file for each pkgbase that is currently in the repository. The contents of the file should be a reference to the related package repository and the current version (tag). These git repositories would only be maintained automatically by dbscripts.
The advantages of this scheme are:
1) We would use git, instead of shitty subversion.
2) The contents of all repositories will be useful.
3) The history of all repositories will be useful.
4) Everything will be gitweb-friendly.
5) Cats! Cute little cats!
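As a rough illustration of the two-level layout described above, here is a minimal Python sketch of what a 'repository repository' might hold. Nothing here was specified in the thread: the `repo=`/`tag=` file format, the `ReleaseEntry` name, the helper functions and the sample values are all assumptions made for the example. It only shows that one small file per pkgbase, placed in a folder per package db, is enough to record which tag is released where, and that moving a package between dbs is a single file move.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ReleaseEntry:
    """One file in a 'repository repository' (hypothetical model)."""
    pkgbase: str   # name of the package repository
    repo_ref: str  # reference to the package's git repository
    tag: str       # release tag inside that repository


def entry_path(db: str, pkgbase: str) -> str:
    """One folder per package db, one file per pkgbase."""
    return f"{db}/{pkgbase}"


def render(entry: ReleaseEntry) -> str:
    """Serialize an entry to the assumed two-line file format."""
    return f"repo={entry.repo_ref}\ntag={entry.tag}\n"


def parse(pkgbase: str, text: str) -> ReleaseEntry:
    """Parse the assumed file format back into an entry."""
    fields = dict(line.split("=", 1) for line in text.splitlines() if line)
    return ReleaseEntry(pkgbase, fields["repo"], fields["tag"])


def move(tree: Dict[str, str], pkgbase: str, src_db: str, dst_db: str) -> None:
    """Move a package between dbs, e.g. from testing to core.

    `tree` stands in for the working tree of the repository repository,
    mapping file paths to file contents.
    """
    tree[entry_path(dst_db, pkgbase)] = tree.pop(entry_path(src_db, pkgbase))


if __name__ == "__main__":
    # Illustrative values only.
    entry = ReleaseEntry("glibc", "packages/glibc.git", "2.14-5")
    tree = {entry_path("testing", "glibc"): render(entry)}
    move(tree, "glibc", "testing", "core")
    print(sorted(tree))
```

In this model a commit in the repository repository that moves many such files at once would be the record of a group move between dbs; whether dbscripts would actually work that way was not settled in the thread.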
On 24.08.2011 22:53, Florian Pritz wrote:
So it came up in IRC again and I'll try to sum up the discussion:
Thanks to everyone involved with pushing this. I can not wait to get rid of SVN. Doing anything but the most trivial operations is a huge PITA. A few comments below:
On Wed, Aug 24, 2011 at 11:37 PM, Thomas Bächler <thomas@archlinux.org> wrote:
1) We have two 'package repository' folders, one for the developers, one for the TUs.
In there, we have one repository per pkgbase, which contains PKGBUILD and other files. You can use all the magic and awesomeness of git in there, have several branches and whatnot. These repositories get a new tag $pkgname-$pkgrel for each release.
Was this a typo? I guess the tag should be $pkgver-$pkgrel (as $pkgname is the name of the repo, so not needed, and without $pkgver the tag is not unique).
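The uniqueness point can be checked with a small sketch. The release history below is made up, and the function names are hypothetical; the only fact relied on is the usual convention that pkgrel restarts at 1 whenever pkgver changes.

```python
from typing import List, Tuple

Release = Tuple[str, str, int]  # (pkgname, pkgver, pkgrel)


def tag_name_rel(pkgname: str, pkgver: str, pkgrel: int) -> str:
    """Tag scheme as written in the proposal: $pkgname-$pkgrel."""
    return f"{pkgname}-{pkgrel}"


def tag_ver_rel(pkgname: str, pkgver: str, pkgrel: int) -> str:
    """Tag scheme suggested in the reply: $pkgver-$pkgrel."""
    return f"{pkgver}-{pkgrel}"


def has_collision(tags: List[str]) -> bool:
    """True if any tag name would have to be reused."""
    return len(set(tags)) != len(tags)


# Made-up history: pkgrel goes back to 1 when pkgver is bumped.
releases: List[Release] = [("foo", "1.0", 1), ("foo", "1.0", 2), ("foo", "1.1", 1)]

if __name__ == "__main__":
    print(has_collision([tag_name_rel(*r) for r in releases]))
    print(has_collision([tag_ver_rel(*r) for r in releases]))
```

With this history the first scheme produces `foo-1` twice, while the second yields three distinct tags.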
2) We have 'repository repositories' (nice name, hah), one for the devs, one for the TUs (or maybe one for both). In there, we have a folder for each package db/repository. These folders will contain a file for each pkgbase that is currently in the repository. The contents of the file should be a reference to the related package repository and the current version (tag).
These git repositories would only be maintained automatically by dbscripts.
This sounds reasonable. My only additional comment is that maybe using git submodules would be useful for these "repository repositories"? If every package repository is a submodule of the repository repository (need a better name for this), then git would keep for us exactly the information you outlined above (if I understood it correctly).
The added advantage of using submodules would be that people could check out the 'repository repository' (again, needs a new name) without checking out any package repositories, and then use "git submodule update $pkgname" to get just the package repositories they want (in the same way we use "svn update $pkgname" today).
Cheers,
Tom
PS There should be ponies!
On Thu, 2011-08-25 at 00:15 +0200, Tom Gundersen wrote:
Thanks to everyone involved with pushing this. I can not wait to get rid of SVN. Doing anything but the most trivial operations is a huge PITA.
IMHO the only nice features I like from git that aren't in SVN are bisect and the possibility to work on a local copy without being online. The first part is something we don't use for packaging, the 2nd part is not a very big issue if we make our server faster. Right now we're offloading a lot of crap to a new server, so gerolde should get faster that way. Besides speeding up gerolde, there's another thing that makes it slow for SVN: SSH. Make a master login to gerolde and piggy-back your SVN+SSH connections on that, logins are much faster that way.
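The "master login" mentioned above presumably refers to OpenSSH connection multiplexing. A minimal sketch of a `~/.ssh/config` entry follows; the host name `gerolde.archlinux.org`, the socket path and the ten-minute timeout are assumptions for illustration, and `ControlPersist` needs OpenSSH 5.6 or newer.

```
# Assumed host name; adjust to however you reach the server.
Host gerolde.archlinux.org
    # Open a master connection on first use and reuse it afterwards.
    ControlMaster auto
    # Socket shared by later connections (%r = remote user, %h = host, %p = port).
    ControlPath ~/.ssh/cm-%r@%h:%p
    # Keep the master alive in the background for 10 minutes after the last session.
    ControlPersist 10m
```

With an entry like this, the first connection pays the full login cost and later svn+ssh connections to the same host reuse the existing channel.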
On Thu, Aug 25, 2011 at 1:24 AM, Jan de Groot <jan@jgc.homeip.net> wrote:
On Thu, 2011-08-25 at 00:15 +0200, Tom Gundersen wrote:
Thanks to everyone involved with pushing this. I can not wait to get rid of SVN. Doing anything but the most trivial operations is a huge PITA.
IMHO the only nice features I like from git that aren't in SVN are bisect and the possibility to work on a local copy without being online.
That, in addition to easy branching/merging, rebasing and reverting. This is all possible with SVN of course, but it becomes much more cumbersome.
The first part is something we don't use for packaging,
We don't use it because we can't. I think we should though, especially for the non-trivial packages (that contain lots of patches or other tweaks). In order for bisection to be useful, we'd need to be able to do smaller patches, and that's not really practical without local commits.
the 2nd part is not a very big issue if we make our server faster.
Local commits have nothing to do with speed, at least not for me (once I push out my patches, I'm done for the day and go to make myself a coffee). Having local commits and simple branching/merging makes it much easier to make atomic commits, even if you are not sure if you want to push them to the repos yet (you want to test the package more first).
At the moment I have the choice between making all my changes, testing the resulting package and committing with message "lots of churn". Or committing after each change, but then I'd only end up testing the package after pushing all the commits.
All that said, I think the most important point to keep in mind when comparing SVN and git is what people are used to and comfortable with. And people are going to be used to whatever upstream is using (that's after all where we get our patches, and where we do our bisection, submit our fixes, etc).
I don't know what the global statistics are like as I only maintain very few packages, but looking at the projects I have currently checked out (my own and related packages): 28 use git, 1 uses CVS (but I can't deal with that and use a git mirror) and 1 uses svn (that's us).
I don't have a full understanding of the details of the proposed git solution, so all I'm saying is that I'm positive about the idea.
Cheers,
Tom
On 25/08/11 10:49, Tom Gundersen wrote:
On Thu, Aug 25, 2011 at 1:24 AM, Jan de Groot <jan@jgc.homeip.net> wrote:
On Thu, 2011-08-25 at 00:15 +0200, Tom Gundersen wrote:
Thanks to everyone involved with pushing this. I can not wait to get rid of SVN. Doing anything but the most trivial operations is a huge PITA.
IMHO the only nice features I like from git that aren't in SVN are bisect and the possibility to work on a local copy without being online.
That, in addition to easy branching/merging, rebasing and reverting. This is all possible with SVN of course, but it becomes much more cumbersome.
The first part is something we don't use for packaging,
We don't use it because we can't. I think we should though, especially for the non-trivial packages (that contain lots of patches or other tweaks). In order for bisection to be useful, we'd need to be able to do smaller patches, and that's not really practical without local commits.
You would seriously need to bisect while packaging? If you are applying patches to packages, you should know exactly what they are doing, or at least be able to have a damn good idea which patch is causing the issue you are seeing. I say this being the maintainer of one of the most patched packages in our repos (glibc, I think there are only a couple of more patched packages...).
I also see no real need for branching/merging/rebasing while packaging either. Doing stuff on trunk has worked fine for me and as far as I can tell is exactly what trunk is for.
But in the end, if someone comes up with a solution using git that does not alter my workflow much (archco/svn update; make changes; "commitpkg"; done...), I will accept it for the primary reason that I find SVN to be slow, or at least how we use it in devtools (especially after recent changes).
Allan
On 08/25/2011 03:49 AM, Tom Gundersen wrote:
On Thu, Aug 25, 2011 at 1:24 AM, Jan de Groot <jan@jgc.homeip.net> wrote:
On Thu, 2011-08-25 at 00:15 +0200, Tom Gundersen wrote:
Thanks to everyone involved with pushing this. I can not wait to get rid of SVN. Doing anything but the most trivial operations is a huge PITA.
IMHO the only nice features I like from git that aren't in SVN are bisect and the possibility to work on a local copy without being online.
That, in addition to easy branching/merging, rebasing and reverting. This is all possible with SVN of course, but it becomes much more cumbersome.
The first part is something we don't use for packaging,
We don't use it because we can't. I think we should though, especially for the non-trivial packages (that contain lots of patches or other tweaks). In order for bisection to be useful, we'd need to be able to do smaller patches, and that's not really practical without local commits.
the 2nd part is not a very big issue if we make our server faster.
Local commits have nothing to do with speed, at least not for me (once I push out my patches, I'm done for the day and go to make myself a coffee). Having local commits and simple branching/merging makes it much easier to make atomic commits, even if you are not sure if you want to push them to the repos yet (you want to test the package more first).
Local commits, pushing patches? A typical package commit doesn't change more than 2 lines. Do you really need local commits, branching and merging for that? What I find interesting is that the people in favor of git are the ones who don't do a lot of packaging around here and currently maintain only a couple of packages. You guys describe a workflow for a coding project, and I understand how cool git is, but for packaging I don't see any reason to use it.
At the moment I have the choice between making all my changes, testing the resulting package and committing with message "lots of churn". Or committing after each change, but then I'd only end up testing the package after pushing all the commits.
You are doing testing wrong. What about committing only after everything is tested?
All that said, I think the most important point to keep in mind when comparing SVN and git is what are people used to and comfortable with. And people are going to be used to whatever upstream are using (that's after all where we get our patches, and where we do our bisection, submit our fixes, etc).
I don't know what the global statistics are like as I only maintain very few packages, but looking at the projects I have currently checked out (my own and related packages): 28 use git, 1 uses CVS (but I can't deal with that and use a git mirror) and 1 uses svn (that's us).
I don't have a full understanding of the details of the proposed git solution, so all I'm saying is that I'm positive about the idea.
Cheers,
Tom
-- Ionuț
On Wed, Aug 24, 2011 at 3:53 PM, Florian Pritz <bluewind@xinu.at> wrote:
So it came up in IRC again and I'll try to sum up the discussion:
Great. Another bitchfest on this topic! I'm going to get a bit pissy here because it isn't like git wasn't around when we moved to SVN, and we came to the conclusion then that the workflow with git was no better/worse than the one proposed by Jason Chu using SVN. Obviously git has changed since then (as has SVN), but we're putting the cart before the horse here: our (well, not my) insistence on using git to manage packages comes ahead of actually listing the damn problems we have now in an objective manner.
SVN checkouts tend to break,
Examples? Or is this just hearsay like everything else because we're on a witch hunt, and SVN looks like one? I've had a full repo checkout for 2 years that has never broken. Ever.
some people only use it for our repos and not anywhere else, it's slow.
Don't disagree with you on this point, however your amazing git 'one repo per package' idea below sounds like a nightmare for those of us that like having full checkouts and updating them regularly. Submodules or not, it's going to be disastrous.
We agreed on one git repo per package because you can't do partial checkouts in git and you hardly need the history of all packages anyway.
Oh really? I tend to like having full history and a timeline, especially when I can link it to a move from [testing] to [core] of 60 other packages. So we're throwing any sort of atomicity and ordering away?
You also realize that the full git history of current SVN weighs in at only 120 MB? This isn't all that bad, and at least there we have a bloodline of all packaging activity rather than having that scattered across 4339 git repositories.
To keep track of released packages, dbscripts maintains its own (git?) meta database which contains only the package version and pacman repo of the package. The version corresponds to a tag in the package's git repo.
We can't use tags like "testing-i686" because you can't reuse tags in git.
So how do we know what version is in testing vs what version is in core vs trunk/master? Branches?
I'd like to hear some comments about this.
My comment is this: last time I proposed a git solution[1] (yes, I did this back in 2007!) I at least made a sample repo and demo of it, which let people weigh it against the SVN option also on the table at the time. I'm not happy with change just to change, and this current "let's move to git" plan sounds half-baked, with not all repercussions thought out.
Like I said, sorry for being pissy about this. I'm glad the new blood here is trying to push change for the better, but I'm worried you are just pushing change for the sake of change, given that we've spent time researching and developing our switch from CVS and chose SVN over git at the time.
Here is the former discussion lead email: http://mailman.archlinux.org/pipermail/arch-dev-public/2007-October/001904.h...
And see the October 2007 archive, search "Killing CVS", for many more: http://mailman.archlinux.org/pipermail/arch-dev-public/2007-October/thread.h...
-Dan
[1] http://mailman.archlinux.org/pipermail/arch-dev-public/2007-October/002191.h...
On 25 August 2011 07:04, Dan McGee <dpmcgee@gmail.com> wrote:
... SNIP ... Great. Another bitchfest on this topic! I'm going to get a bit pissy here because it isn't like git wasn't around when we moved to SVN, and we came to the conclusion then that the workflow with git was no better/worse than the one proposed by Jason Chu using SVN. ... SNIP ...
I share the same sentiments bit for bit, so I am in full agreement with Dan, JGC, Allan & Ionut. I have always stressed the fact, even just very recently, to someone who was interested in Arch, that even though I use Git personally, Subversion makes the most sense for (our kind of) packaging. Aside from that, it's not about technical prowess, but about the worth of the migration work needed. What do we gain and what do we lose? How much significance do the gains have with regards to our workflow? -- GPG/PGP ID: 8AADBB10
On 08/25/2011 06:08 AM, Ray Rashif wrote:
I share the same sentiments bit for bit, so I am in full agreement with Dan, JGC, Allan & Ionut. I have always stressed the fact, even just very recently, to someone who was interested in Arch, that even though I use Git personally, Subversion makes the most sense for (our kind of) packaging.
I register my vote for keeping SVN, too; it was the right tool for this particular job when we chose it, and I think it's still the right tool for this particular job. - P
On Thu, 25 Aug 2011, 16:08:33 CEST, Paul Mattal <paul@mattal.com> wrote:
I register my vote for keeping SVN, too; it was the right tool for this particular job when we chose it, and I think it's still the right tool for this particular job.
Agree. -1 to switching to Git. Our job doesn't need a local server for the commits; we don't need to switch again.
-- Andrea
2011/8/24 Florian Pritz <bluewind@xinu.at>:
So it came up in IRC again and I'll try to sum up the discussion:
SVN checkouts tend to break, some people only use it for our repos and not anywhere else, it's slow.
We agreed on one git repo per package because you can't do partial checkouts in git and you hardly need the history of all packages anyway.
To keep track of released packages, dbscripts maintains its own (git?) meta database which contains only the package version and pacman repo of the package. The version corresponds to a tag in the package's git repo.
We can't use tags like "testing-i686" because you can't reuse tags in git.
I'd like to hear some comments about this.
-- Florian Pritz
First I need to ask some questions (understand that I don't get why this is being proposed). Why do we need git? What goal do we want to achieve? Devtools has some complications (the number of commits per package, for example), but will that change with git? How? Will git speed up our devtools or improve the workflow? Why git and not hg, darcs or another DVCS?
So in summary (questions aside), I don't get why we need to move our current scheme to something else. At the end of the day, we have to implement our tools on top of one of those systems in our own devtools, and eventually it is like trying to fit a circle into a rectangle.
So far I haven't seen any strong points, but if there is at least one, I will certainly support it.
Just my opinion, please don't take this personally. We are a team, we have the right to speak, and we must work in some kind of harmony. I bet we all want the best for the project, so we are in the same boat; don't forget that.
-- Angel Velásquez
angvp @ irc.freenode.net
Arch Linux Developer / Trusted User
Linux Counter: #359909
http://www.angvp.com
On Wed, 24 Aug 2011 22:53:55 +0200, Florian Pritz <bluewind@xinu.at> wrote:
So it came up in IRC again and I'll try to sum up the discussion:
SVN checkouts tend to break, some people only use it for our repos and not anywhere else, it's slow.
The latest devtools changes break some commits by not copying files into the repos directories. Subversion doesn't feel that fast. But it has done its job very well for a long time until somebody broke it. So please keep it and fix/revert broken commits. No need to reinvent the whole process again.
-Andy
On Thu, Aug 25, 2011 at 1:38 PM, Andreas Radke <andyrtr@archlinux.org> wrote:
On Wed, 24 Aug 2011 22:53:55 +0200, Florian Pritz <bluewind@xinu.at> wrote:
So it came up in IRC again and I'll try to sum up the discussion:
SVN checkouts tend to break, some people only use it for our repos and not anywhere else, it's slow.
Latest devtools changes break some commits not copying files into the repos directories.
There is a patch in arch-project ML to fix that. Another +1 for keeping svn.
Subversion doesn't feel that fast. But it has done its job very well for a long time until somebody broke it. So please keep it and fix/revert broken commits. No need to reinvent the whole process again.
-Andy
participants (13)
- Allan McRae
- Andrea Scarpino
- Andreas Radke
- Dan McGee
- Eric Bélanger
- Florian Pritz
- Ionut Biru
- Jan de Groot
- Paul Mattal
- Ray Rashif
- Thomas Bächler
- Tom Gundersen
- Ángel Velásquez