[pacman-dev] The great VCS packages overhaul of 2015
It is well known that the VCS package support in makepkg is subpar... The number of bugs about this is into double figures. There have also been some patches submitted attempting to use the source=() array in PKGBUILDs to specify a VCS repo and remove a lot of the repetitive crap involved in repo checkout/updating in the current PKGBUILDs.
To get the ball rolling again, I think we should pick one VCS system and flesh out what we need and what the prototype PKGBUILD would look like. Then we can move on to the other VCS systems and finally implement it. I guess by the time all the bikeshedding is finished, this will all be done by 2015 and hence the subject line. :)
I am going to start with git.
Current:
pkgver = the date the package was built
_gitroot = the url of the git repo
_gitname = the name of the directory we check out the repo into...
According to the man page, _gitname is supposed to be a branch or tag but that is all lies.
What would we like to have for a flexible git package implementation?
- url of the repo
- be able to specify the branch/tag/commit to use (and appropriate combinations)
- a decent pkgver
1) URL:
There were previous patches to the mailing list that never really got finished, but I think we were fairly happy with this syntax:
source=(git://projects.archlinux.org/pacman.git)
source=(git@@http://projects.archlinux.org/git/pacman.git)
Does it make any sense to allow the "::" syntax here? i.e.
source=(git@@dirname::http://projects.archlinux.org/git/pacman.git)
where dirname is the name of the directory it is checked out into? I am thinking we should probably do the checkout into $vcsdir=$startdir/vcs, so this would only be needed if
#1 - we supported multiple VCS checkouts in one PKGBUILD and there were two that wanted to use the same name...
#2 - another source file conflicted
#2 is readily dealt with and I am not sure we should allow #1 (see below) so I would vote to skip it until a genuine need is shown.
2) Specifying commit to work with:
I think that this is the difficult bit... the syntax with the source array is already convoluted enough, so I do not think the branch/tag/commit should be added there. So that suggests we go with assigning them to variables like _git_branch, _git_commit, _git_tag... etc.
But what if we have two git sources? For example, say pacman allowed building against an internal copy of libarchive if a folder named libarchive was found in its root directory. So:
source=(git://projects.archlinux.org/pacman.git git://github.com/libarchive/libarchive)
build() {
  cd $srcdir/pacman
  ln -s ../libarchive
  ./autogen.sh
  ...
It might seem a somewhat convoluted example, but (e.g.) gcc does allow in source tree building of many of its dependencies. The question is should we consider this outside the realms of the reasonable and state one VCS repo should be one package. I'd say 99.999% of VCS PKGBUILDs (at a lower bound) would never use two VCS sources... and the ones that do need this could do manual checking out of the non-main source within the PKGBUILD anyway.
3) pkgver
Use output of "git describe" (with a s/-/_/) and fall back to "git rev-list HEAD | wc -l" (with a trailing commit id added) if there are no tags in the repo.
PROTOTYPE:
pkgname=pacman-git
pkgver=AUTO
...
source=(git://projects.archlinux.org/users/allan/pacman.git)
_git_branch="working"
_git_commit="8c857343"
build() {
  cd $srcdir/pacman
  ./autogen.sh
  ...
}
What makepkg does:
1) goes into $vcsdir, checks for the pacman directory
 - if not present, do the git checkout
 - if present, enter and do a "git pull" unless --holdver is specified
2) enters $srcdir, and does the appropriate clone of the repo in $vcsdir to be at the required branch/tag/commit
3) starts build() etc...
Oh wow... you are still reading this? Well then, you are now qualified to comment on the proposal. Please stick to just concerns with the git usage at the moment, unless you see something monumental that we will not be able to support using other VCS systems using this approach.
Allan
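For readers following along, here is a minimal sketch of the pkgver rule from point 3, not makepkg's actual implementation. The function names are hypothetical, the sample "git describe" string in the usage note is made up, and two details are assumptions: every hyphen is replaced (pkgver may not contain hyphens), and the fallback joins the commit count and the short commit id with an underscore.

```shell
# Hypothetical helper: make "git describe" output safe for pkgver by
# replacing every hyphen with an underscore (assumed reading of s/-/_/).
describe_to_pkgver() {
    printf '%s\n' "$1" | sed 's/-/_/g'
}

# Hypothetical helper: print a pkgver for the git checkout in directory $1.
git_pkgver() {
    local repo="$1" described count commit
    if described=$(git -C "$repo" describe 2>/dev/null); then
        # A tag is reachable: use the describe output.
        describe_to_pkgver "$described"
    else
        # No usable tag: commit count plus a trailing short commit id.
        # The "_" separator is an assumption, not part of the proposal.
        count=$(git -C "$repo" rev-list HEAD | wc -l | tr -d ' ')
        commit=$(git -C "$repo" rev-parse --short HEAD)
        printf '%s_%s\n' "$count" "$commit"
    fi
}
```

For example, describe_to_pkgver v4.0.2-15-g8c85734 prints v4.0.2_15_g8c85734, which keeps the tag-relative ordering while avoiding the hyphen that separates pkgver from pkgrel.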
On 03/12/2012 09:58 PM, Allan McRae wrote:
It is well known that the VCS package support in makepkg is subpar... The number of bugs about this is into double figures. There have also been some patches submitted attempting to be able to use the source=() array in PKGBUILDs to specify a VCS repo and remove a lot of the repetitive crap involved for repo checkout/updating in the current PKGBUILDs.
To get the ball rolling again, I think we should pick one VCS system and flesh out what we need and what the prototype PKGBUILD would look like. Then we can move on to the other VCS systems and finally implement it. I guess by the time all the bikeshedding is finished, this will all be done by 2015 and hence the subject line. :)
I am going to start with git.
Current: pkgver = the date the package was built _gitroot = the url of the git repo _gitname = the name of the directory we check out the repo into...
According to the man page, _gitname is supposed to be a branch or tag but that is all lies.
What would we like to have for a flexible git package implementation? - url of the repo - be able to specify the branch/tag/commit to use (and appropriate combinations) - a decent pkgver
1) URL: There were previous patches to the mailing list that never really got finished, but I think we were fairly happy with this syntax:
source=(git://projects.archlinux.org/pacman.git) source=(git@@http://projects.archlinux.org/git/pacman.git)
Does it make any sense to allow the "::" syntax here? i.e. source=(git@@dirname::http://projects.archlinux.org/git/pacman.git)
where dirname is the name of the directory it checked out into? I am thinking we should probably do the checkout into $vcsdir=$startdir/vcs, so this would only be needed if #1 - we supported multiple VCS checkouts in one PKGBUILD and there were two that wanted to use the same name... #2 - another source file conflicted
#2 is readily dealt with and I am not sure we should allow #1 (see below) so I would vote to skip it until a genuine need is shown.
2) Specifying commit to work with: I think that this is the difficult bit... the syntax with the source array is already convoluted enough, so I do not think they should be added there. So that suggests we go with assigning them to variables like _git_branch, _git_commit, _git_tag... etc.
But what if we have two git sources? For example, say pacman allowed building against an internal copy of libarchive if a folder named libarchive was found in its root directory. So:
source=(git://projects.archlinux.org/pacman.git git://github.com/libarchive/libarchive)
refs=(master v3.14.1) ?
Non-vcs sources shouldn't need a placeholder or anything like that.
Of course, instead of sources there could be a repos=() array or something like that. This would alleviate the syntactic issue of trying to specify a repo, (optional) dir to clone into, and ref to checkout in a succinct way.
repos=([dir::]url[::ref])
build() { cd $srcdir/pacman ln -s ../libarchive ./autogen.sh ...
It might seem a somewhat convoluted example, but (e.g.) gcc does allow in source tree building of many of its dependencies. The question is should we consider this outside the realms of the reasonable and state one VCS repo should be one package. I'd say 99.999% of VCS PKGBUILDs (at a lower bound) would never use two VCS sources... and the ones that do need this could do manual checking out of the non-main source within the PKGBUILD anyway.
3) pkgver Use output of "git describe" (with a s/-/_/) and fall back to "git rev-list HEAD | wc -l" (with a trailing commit id added) if there are no tags in the repo.
What if there are no tags in the repo, and for whatever reason the PKGBUILD switches to a different branch? The number of commits just doesn't seem that stable.
Is it at all feasible to just use the build date for package ordering if it's a vcs package? This way the pkgver can be the ref that was checked out:
- the short hash
- the tag
- the branch (_short hash maybe OR only automatically bump the pkgrel)
PROTOTYPE:
pkgname=pacman-git pkgver=AUTO ... source=(git://projects.archlinux.org/users/allan/pacman.git) _git_branch="working" _git_commit="8c857343"
Are these looking forward to other VCS's though? What scenario is more than one ref needed? Makepkg would just be doing a checkout right?
build() { cd $srcdir/pacman ./autogen.sh ... }
What makepkg does: 1) goes into $vcsdir, checks for the pacman directory - if not present, do the git checkout - if present, enter and do a "git pull" unless --holdver is specified 2) enters $srcdir, and does the appropriate clone of the repo in $vcsdir to be at the required branch/tag/commit 3) starts build() etc...
Seems like there's a lot of stuff that can be shared up front, but at some point there are really two different types of VCS packages: those that should pull automatically every time the pkg is built (branch), and those that use vcs for obtaining the source, but are really meant to target a specific version (tag,commit).
In general, I'll say that I really don't like makepkg manipulating my PKGBUILDs. So it'd be cool if all of this produced sane package versions in the built package, but didn't need to touch the PKGBUILD itself.
Oh wow... you are still reading this? Well then, you are now qualified to comment on the proposal. Please stick to just concerns with the git usage at the moment, unless you see something monumental that we will not be able to support using other VCS systems using this approach.
Allan
On 13/03/12 15:36, Matthew Monaco wrote:
On 03/12/2012 09:58 PM, Allan McRae wrote:
2) Specifying commit to work with: I think that this is the difficult bit... the syntax with the source array is already convoluted enough, so I do not think they should be added there. So that suggests we go with assigning them to variables like _git_branch, _git_commit, _git_tag... etc.
But what if we have two git sources? For example, say pacman allowed building against an internal copy of libarchive if a folder named libarchive was found in its root directory. So:
source=(git://projects.archlinux.org/pacman.git git://github.com/libarchive/libarchive)
refs=(master v3.14.1) ?
Non-vcs sources shouldn't need a placeholder or anything like that.
Of course, instead of sources there could be a repos=() array or something like that. This would alleviate the syntactic issue of trying to specify a repo, (optional) dir to clone into, and ref to checkout in a succinct way.
repos=([dir::]url[::ref])
I would really prefer not to have a new array for the repo source. Part of the attraction of getting all this into the source array is that we only have to look in one place to see where the source came from.
build() { cd $srcdir/pacman ln -s ../libarchive ./autogen.sh ...
It might seem a somewhat convoluted example, but (e.g.) gcc does allow in source tree building of many of its dependencies. The question is should we consider this outside the realms of the reasonable and state one VCS repo should be one package. I'd say 99.999% of VCS PKGBUILDs (at a lower bound) would never use two VCS sources... and the ones that do need this could do manual checking out of the non-main source within the PKGBUILD anyway.
3) pkgver Use output of "git describe" (with a s/-/_/) and fall back to "git rev-list HEAD | wc -l" (with a trailing commit id added) if there are no tags in the repo.
What if there are no tags in the repo, and for whatever reason the PKGBUILD switches to a different branch? The number of commits just doesn't seem that stable.
No tags in the repo is covered by the rev-list option. I agree that these numbers are not particularly stable (with switching branches etc), but I would assume that someone providing a repo with a VCS package will not be branch hopping too often and for local packaging it really does not matter.
Is it at all feasible to just use the build date for package ordering if it's a vcs package? This way the pkgver can be the ref that was checked out:
- the short hash - the tag - the branch (_short hash maybe OR only automatically bump the pkgrel)
I would really like to move away from the build date having anything to do with package ordering as it is entirely non-deterministic about what is in a package. Also, that would require pacman to know which packages were built from a VCS style PKGBUILD and all the extra tracking to implement that...
PROTOTYPE:
pkgname=pacman-git pkgver=AUTO ... source=(git://projects.archlinux.org/users/allan/pacman.git) _git_branch="working" _git_commit="8c857343"
Are these looking forward to other VCS's though? What scenario is more than one ref needed? Makepkg would just be doing a checkout right?
This might be me not understanding some of the subtleties of git, but this is what I see as options for building from a git repo...
1) build from master HEAD
2) build from a branch HEAD
3) build from a given commit/tag
So, assuming that the git repo is checked out from upstream in $vcsdir, the command needed for each of these is:
1) git clone --branch master $vcsdir [*]
2) git clone --branch <branch> $vcsdir
3) git clone $vcsdir; git checkout <ref>
[*] this fails if the default branch upstream is not master, but does allow you to either do some work in the copy in $vcsdir or have your PKGBUILD git source point to a local copy of a git repo that you actually do work in...
Anyway, from what I can see from those three use cases is that to be flexible here, we need to be able to specify a branch or a ref (commit/tag). I'm not sure how to differentiate the two without assigning them to different variables. And while it is not necessary, I would find it useful to have the branch specified even when the commit/tag is used more for documentation than anything else.
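The three cases above could be sketched as one function driven by the _git_branch and _git_commit values from the prototype. This is an illustration only: clone_for_build, its argument order, the explicit destination directory, and the GIT override (set GIT="echo git" to print the commands rather than run them) are all hypothetical.

```shell
# Hypothetical sketch: clone the repo cached in $vcsdir into $dest at the
# requested branch head or fixed commit/tag.
# Arguments: vcsdir dest branch commit (branch and commit may be empty).
clone_for_build() {
    local vcsdir="$1" dest="$2" branch="$3" commit="$4"
    if [ -n "$commit" ]; then
        # Case 3: a fixed commit or tag; the branch is documentation only.
        ${GIT:-git} clone "$vcsdir" "$dest" &&
            ${GIT:-git} -C "$dest" checkout "$commit"
    else
        # Cases 1 and 2: head of the named branch, defaulting to master.
        ${GIT:-git} clone --branch "${branch:-master}" "$vcsdir" "$dest"
    fi
}
```

With GIT="echo git", calling clone_for_build /tmp/vcs/pacman /tmp/src/pacman working 8c857343 prints a plain clone followed by a checkout of 8c857343, and never consults the branch. The master default carries the same caveat as the [*] footnote: it fails when the repo has no master branch.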
build() { cd $srcdir/pacman ./autogen.sh ... }
What makepkg does: 1) goes into $vcsdir, checks for the pacman directory - if not present, do the git checkout - if present, enter and do a "git pull" unless --holdver is specified 2) enters $srcdir, and does the appropriate clone of the repo in $vcsdir to be at the required branch/tag/commit 3) starts build() etc...
Seems like there's a lot of stuff that can be shared up front, but at some point there are really two different types of VCS packages: those that should pull automatically every time the pkg is built (branch), and those that use vcs for obtaining the source, but are really meant to target a specific version (tag,commit).
There is quite some overlap there... I quite often have commits on my local pacman working branch that I do not want running on my live system just yet. So my pacman-git package gets built from my working branch with the last commit I am happy with optionally specified.
In general, I'll say that I really don't like makepkg manipulating my PKGBUILDs. So it'd be cool if all of this produced sane package versions in the built package, but didn't need to touch the PKGBUILD itself.
I tend to agree about makepkg not adjusting PKGBUILDs. Although, it does have the advantage that after building a package, you can "makepkg --source" and send that archive to someone and they will be able to build the exact same package as you using the --holdver flag (or at least will be able to when --holdver works...). Allan
Am 13.03.2012 09:21, schrieb Allan McRae:
Is it at all feasible to just use the build date for package ordering if it's a vcs package? This way the pkgver can be the ref that was checked out:
- the short hash - the tag - the branch (_short hash maybe OR only automatically bump the pkgrel)
I would really like to move away from the build date having anything to do with package ordering as it is entirely non-deterministic about what is in a package.
I cannot ACK that enough. Please, no build dates!
source=(git://projects.archlinux.org/users/allan/pacman.git) _git_branch="working" _git_commit="8c857343"
Are these looking forward to other VCS's though? What scenario is more than one ref needed? Makepkg would just be doing a checkout right?
This might be me not understanding some of the subtleties of git, but this is what I see as options for building from a git repo...
1) build from master HEAD 2) build from a branch HEAD 3) build from a given commit/tag
So, assuming that the git repo is checked out from upstream in $vcsdir, the command needed for each of these is:
1) git clone --branch master $vcsdir [*] 2) git clone --branch <branch> $vcsdir 3) git clone $vcsdir; git checkout <ref>
You complicate things A LOT here. Specifying the branch and commit separately is nonsense. The commit id already tells you exactly where you want to be. This is the right thing to do:
$ git clone $vcsdir
$ git checkout -b makepkg $ref
Here, $ref can be a tag name, a branch name, a commit id or an abbreviated commit id - everything git understands as a refspec. Let's just give git a single refspec and let it handle the rest.
(I just realized I might be slightly wrong here, as we refer to branch and tag names in a remote repository of a remote repository - but I am sure some magic exists here that will make it just as easy as I described, Dan probably knows.)
This is what you don't seem to understand: Once you have the commit id, the branch name has no relevance anymore. You only need the branch name when referencing the head of that branch.
On 13/03/12 20:25, Thomas Bächler wrote:
Am 13.03.2012 09:21, schrieb Allan McRae:
Is it at all feasible to just use the build date for package ordering if it's a vcs package? This way the pkgver can be the ref that was checked out:
- the short hash - the tag - the branch (_short hash maybe OR only automatically bump the pkgrel)
I would really like to move away from the build date having anything to do with package ordering as it is entirely non-deterministic about what is in a package.
I cannot ACK that enough. Please, no build dates!
source=(git://projects.archlinux.org/users/allan/pacman.git) _git_branch="working" _git_commit="8c857343"
Are these looking forward to other VCS's though? What scenario is more than one ref needed? Makepkg would just be doing a checkout right?
This might be me not understanding some of the subtleties of git, but this is what I see as options for building from a git repo...
1) build from master HEAD 2) build from a branch HEAD 3) build from a given commit/tag
So, assuming that the git repo is checked out from upstream in $vcsdir, the command needed for each of these is:
1) git clone --branch master $vcsdir [*] 2) git clone --branch <branch> $vcsdir 3) git clone $vcsdir; git checkout <ref>
You complicate things A LOT here. Specifying the branch and commit separately is nonsense. The commit id already tells you exactly where you want to be. This is the right thing to do:
Just to be clear, you do realize these are not three commands to be run one after the other but the way I was thinking about achieving each of the three usage scenarios I gave above?
$ git clone $vcsdir $ git checkout -b makepkg $ref
Here, $ref can be a tag name, a branch name, a commit id or an abbreviated commit id - everything git understands as a refspec. Let's just give git a single refspec and let it handle the rest.
(I just realized I might be slightly wrong here, as we refer to branch and tag names in a remote repository of a remote repository - but I am sure some magic exists here that will make it just as easy as I described, Dan probably knows.)
This is what you don't seem to understand: Once you have the commit id, the branch name has no relevance anymore. You only need the branch name when referencing the head of that branch.
I do understand the branch name is useless once you have a tag/commit id. That is why later in the email I said it only really serves as documentation.
What I was trying to avoid is the need to specify (e.g.) "origin/maint" as the branch. The "origin/" part just seems redundant and I was trying to remove as much redundancy across PKGBUILDs as possible here. Of course I also see that having separate methods to specify a branch and a ref/tag is redundant too... In the end there is probably no such thing as perfection here.
Allan
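One possible way around the redundant "origin/" prefix, sketched under assumptions rather than taken from any patch in this thread: try the user's ref as a remote-tracking branch first, then fall back to the name as given (tag, commit id or local branch). The function names and the lookup order are hypothetical.

```shell
# Hypothetical helper: print the names to try, in order, for a ref written
# in a PKGBUILD without the "origin/" prefix.
candidate_refs() {
    printf '%s\n' "origin/$1" "$1"
}

# Hypothetical helper: print the full commit id for ref $2 inside the
# clone in directory $1, or return non-zero if nothing matches.
resolve_ref() {
    local clone="$1" ref="$2" name
    for name in $(candidate_refs "$ref"); do
        # --verify --quiet prints the id on success and is silent on failure.
        if git -C "$clone" rev-parse --verify --quiet "$name^{commit}"; then
            return 0
        fi
    done
    return 1
}
```

With this order, "maint" resolves to the head of origin/maint, while a tag name or a commit id falls through to the second candidate, so the PKGBUILD never needs to say which kind of ref it holds.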
On 13.03.2012 12:05, Allan McRae wrote:
I do understand the branch name is useless once you have a tag/commit id. That is why later in the email I said it only really serves as documentation.
Use comments for that. -- Florian Pritz
On Tue, Mar 13, 2012 at 7:20 AM, Florian Pritz <bluewind@xinu.at> wrote:
On 13.03.2012 12:05, Allan McRae wrote:
I do understand the branch name is useless once you have a tag/commit id. That is why later in the email I said it only really serves as documentation.
Use comments for that.
So just how do you propose using a comment to describe origin/maint in the pacman repository to build my pacman-maint-git package? I don't follow this logic one bit...you clearly need to use a branch name, not some arbitrary sha1 that never moves forward with new commits... -Dan
On 13/03/12 21:23, Dan McGee wrote:
On Tue, Mar 13, 2012 at 7:20 AM, Florian Pritz <bluewind@xinu.at> wrote:
On 13.03.2012 12:05, Allan McRae wrote:
I do understand the branch name is useless once you have a tag/commit id. That is why later in the email I said it only really serves as documentation.
Use comments for that.
So just how do you propose using a comment to describe origin/maint in the pacman repository to build my pacman-maint-git package? I don't follow this logic one bit...you clearly need to use a branch name, not some arbitrary sha1 that never moves forward with new commits...
I think context is needed here... I said the branch name is only really useful as documentation _if_ we are building for a specified commit.
The three cases I see as needing covered are:
1) build from master HEAD
2) build from a branch HEAD
3) build from a given commit/tag
What we need to find is the simplest way of allowing these (and any other reasonable suggestions that arise...).
Allan
Am 13.03.2012 12:33, schrieb Allan McRae:
The three cases I see as needing covered are:
1) build from master HEAD 2) build from a branch HEAD 3) build from a given commit/tag
What we need to find is the simplest way of allowing these (and any other reasonable suggestions that arise...).
My point was that from git's perspective, those are not different cases. The only annoying thing is the origin/<ref> for branch names which we need to get around somehow.
I will throw in another suggestion which should solve some of those problems:
To get the source:
  cd $vcsdir
  git clone --mirror $url
or
  cd $vcsdir/$reponame
  git fetch
then
  cd $srcdir
  GIT_DIR=$vcsdir/$reponame git archive --format=tar --prefix=$reponame/ $refspec | bsdtar -xf -
To get the version:
  GIT_DIR=$vcsdir/$reponame git describe $refspec
Disadvantage:
- The "checkout" itself is not a git tree, so build() cannot use any git commands (some projects use git commands internally in their Makefiles, so this might be bad)
Advantage:
- Due to clone --mirror, we have all remote branch names as local branch names, so we don't need to care whether $refspec is a branch name or not. As we don't clone again, we don't lose those names.
That disadvantage is big (breaks kernel and syslinux git snapshot versioning), so I am still not convinced.
My main point is: We must find a way to make sure we (==makepkg) don't need to care about whether $refspec is a branch, tag or commit.
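The mirror-and-archive suggestion above can be sketched as three small functions. The function names, the argument order, and the rule for deriving $reponame from the URL are assumptions made for illustration. The mirror is cloned into an explicit directory so that it is named $reponame rather than git's default $reponame.git.

```shell
# Hypothetical helper: derive a directory name from a repo URL by taking
# the last path component and dropping a trailing ".git".
repo_name_from_url() {
    local name="${1##*/}"
    printf '%s\n' "${name%.git}"
}

# Create the mirror on first use, otherwise update it in place.
# Arguments: url vcsdir
sync_mirror() {
    local url="$1" vcsdir="$2" reponame
    reponame=$(repo_name_from_url "$url")
    if [ -d "$vcsdir/$reponame" ]; then
        git --git-dir="$vcsdir/$reponame" fetch
    else
        git clone --mirror "$url" "$vcsdir/$reponame"
    fi
}

# Unpack a snapshot of $refspec into $srcdir/$reponame (no .git inside).
# Arguments: vcsdir reponame refspec srcdir
export_snapshot() {
    local vcsdir="$1" reponame="$2" refspec="$3" srcdir="$4"
    git --git-dir="$vcsdir/$reponame" archive --format=tar \
        --prefix="$reponame/" "$refspec" | bsdtar -xf - -C "$srcdir"
}
```

The version would then come from git --git-dir="$vcsdir/$reponame" describe "$refspec", as in the message above. The stated disadvantage still applies: the unpacked tree is not a git repository, so build() cannot run git commands in it.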
Am 13.03.2012 04:58, schrieb Allan McRae:
To get the ball rolling again, I think we should pick one VCS system and flesh out what we need and what the prototype PKGBUILD would look like. Then we can move on to the other VCS systems and finally implement it. I guess by the time all the bikeshedding is finished, this will all be done by 2015 and hence the subject line. :)
Thank God - or Allan.
I am going to start with git.
Current: pkgver = the date the package was built _gitroot = the url of the git repo _gitname = the name of the directory we check out the repo into...
According to the man page, _gitname is supposed to be a branch or tag but that is all lies.
One shortcoming: No way to use more than one git repo.
What would we like to have for a flexible git package implementation? - url of the repo - be able to specify the branch/tag/commit to use (and appropriate combinations) - a decent pkgver
- More than one git repo per PKGBUILD (?)
1) URL: There were previous patches to the mailing list that never really got finished, but I think we were fairly happy with this syntax:
source=(git://projects.archlinux.org/pacman.git) source=(git@@http://projects.archlinux.org/git/pacman.git)
Does it make any sense to allow the "::" syntax here? i.e. source=(git@@dirname::http://projects.archlinux.org/git/pacman.git)
where dirname is the name of the directory it checked out into? I am thinking we should probably do the checkout into $vcsdir=$startdir/vcs, so this would only be needed if #1 - we supported multiple VCS checkouts in one PKGBUILD and there were two that wanted to use the same name... #2 - another source file conflicted
#2 is readily dealt with and I am not sure we should allow #1 (see below) so I would vote to skip it until a genuine need is shown.
When using tarballs, we just extract them, and we don't care if two tarballs extract into the same directory. We should make this consistent (and thus equally stupid): If we check out pacman.git, call the directory pacman/. I didn't even know the :: syntax, but: The :: syntax is for downloaded file names, not for renaming the directory we extract into. For consistency, it should not be supported here.
2) Specifying commit to work with: I think that this is the difficult bit... the syntax with the source array is already convoluted enough, so I do not think they should be added there. So that suggests we go with assigning them to variables like _git_branch, _git_commit, _git_tag... etc.
But what if we have two git sources? For example, say pacman allowed building against an internal copy of libarchive if a folder named libarchive was found in its root directory. So:
source=(git://projects.archlinux.org/pacman.git git://github.com/libarchive/libarchive)
build() { cd $srcdir/pacman ln -s ../libarchive ./autogen.sh ...
It might seem a somewhat convoluted example, but (e.g.) gcc does allow in source tree building of many of its dependencies. The question is should we consider this outside the realms of the reasonable and state one VCS repo should be one package. I'd say 99.999% of VCS PKGBUILDs (at a lower bound) would never use two VCS sources... and the ones that do need this could do manual checking out of the non-main source within the PKGBUILD anyway.
I am all for consistency here (again) - we allow multiple tarballs as source, so we should also allow multiple VCS repos as source. And this would force us to put the commit name into the URL in the source array. I don't see the real problem here, we could just have a nice syntax for git URLs. What would be the problem with $VCS:$URL@$COMMIT?
GIT:git://projects.archlinux.org/pacman.git@maint
SVN:ssh://svn.foo.com/bar@12345
It's readable and intuitive. (The @ might be a problem when you put user names in the URL, but we could just use @@ or so.)
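A minimal sketch of how a $VCS:$URL@$COMMIT entry could be split, assuming the VCS name ends at the first ":" and the commit starts after the last "@". The function name and the three-line output format are hypothetical.

```shell
# Hypothetical parser for a "$VCS:$URL@$COMMIT" source entry.
# Prints three lines: the VCS name, the URL, and the commit (may be empty).
parse_vcs_source() {
    local entry="$1" rest url ref
    rest="${entry#*:}"              # everything after the first ":"
    case $rest in
        *@*)
            url="${rest%@*}"        # up to the last "@"
            ref="${rest##*@}"       # after the last "@"
            ;;
        *)
            url="$rest"
            ref=""
            ;;
    esac
    printf '%s\n' "${entry%%:*}" "$url" "$ref"
}
```

Splitting on the last "@" is exactly where the user-name concern shows up: an entry such as GIT:ssh://git@host/repo with no commit would be misread as URL ssh://git and commit host/repo, which is why a distinct separator like @@ is suggested above.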
3) pkgver Use output of "git describe" (with a s/-/_/) and fall back to "git rev-list HEAD | wc -l" (with a trailing commit id added) if there are no tags in the repo.
'git describe' is nice as it sorts well with vercmp (if the tag names are friendly enough). With my proposal about allowing arbitrarily many VCS sources, we should just make it a convention to use the first VCS source for the version - or make it configurable in pkgver.
PROTOTYPE:
pkgname=pacman-git pkgver=AUTO ... source=(git://projects.archlinux.org/users/allan/pacman.git) _git_branch="working" _git_commit="8c857343"
build() { cd $srcdir/pacman ./autogen.sh ... }
Obviously, I don't like that, as pointed out above. I do like the pkgver=AUTO - makepkg should leave the pkgver alone if it isn't set to AUTO.
What makepkg does: 1) goes into $vcsdir, checks for the pacman directory - if not present, do the git checkout - if present, enter and do a "git pull" unless --holdver is specified
This should be a path under SRCDIR. This should also be a bare clone without working copy.
2) enters $srcdir, and does the appropriate clone of the repo in $vcsdir to be at the required branch/tag/commit 3) starts build() etc...
ACK.
Oh wow... you are still reading this?
Hell yes!
Well then, you are now qualified to comment on the proposal.
Thanks, thou shalt not be disappointed.
On 13/03/12 20:13, Thomas Bächler wrote:
What makepkg does:
1) goes into $vcsdir, checks for the pacman directory - if not present, do the git checkout - if present, enter and do a "git pull" unless --holdver is specified This should be a path under SRCDIR. This should also be a bare clone without working copy.
Query, do you mean $srcdir or $SRCDEST there?
Also, do you mean a bare (--bare) clone or a shallow (--depth 1) clone? As far as I know, a bare clone can not be updated.
I am also not convinced about shallow clones... For glibc (~20,000 commits), this saves a little less than 20% of the size. Apparently, for the entire GNOME git repos, it saves only 10%. That comes at the disadvantage of not being able to clone it into $srcdir (which uses hardlinks when possible so saves space). Well, there is a hack that allows you to do so but it could break at any time...
Allan
Am 13.03.2012 11:36, schrieb Allan McRae:
On 13/03/12 20:13, Thomas Bächler wrote:
What makepkg does:
1) goes into $vcsdir, checks for the pacman directory - if not present, do the git checkout - if present, enter and do a "git pull" unless --holdver is specified This should be a path under SRCDIR. This should also be a bare clone without working copy.
Query, do you mean $srcdir or $SRCDEST there?
Ehm, SRCDEST, sorry :)
Also, do you mean a bare (--bare) clone or a shallow (--depth 1) clone? As far as I know, a bare clone can not be updated.
Of course a bare clone can be updated (man git-fetch). Only git pull won't work. Also, regarding the problem in my other mail: git clone --mirror might be the right thing.
I am also not convinced about shallow clones... For glibc (~20,000 commits), this saves a little less than 20% of the size. Apparently, for the entire GNOME git repos, it saves only 10%. That comes at the disadvantage of not being able to clone it into $srcdir (which uses hardlinks when possible so saves space). Well, there is a hack that allows you to do so but it could break at any time...
Shallow is not useful, you might want to check out old commits or different branches.
On 13/03/12 20:39, Thomas Bächler wrote:
Am 13.03.2012 11:36, schrieb Allan McRae:
On 13/03/12 20:13, Thomas Bächler wrote:
What makepkg does:
1) goes into $vcsdir, checks for the pacman directory - if not present, do the git checkout - if present, enter and do a "git pull" unless --holdver is specified This should be a path under SRCDIR. This should also be a bare clone without working copy.
Query, do you mean $srcdir or $SRCDEST there?
Ehm, SRCDEST, sorry :)
Also, do you mean a bare (--bare) clone or a shallow (--depth 1) clone? As far as I know, a bare clone can not be updated.
Of course a bare clone can be updated (man git-fetch). Only git pull won't work. Also, regarding the problem in my other mail: git clone --mirror might be the right thing.
Oh... cool. Learnt something new today. I did not know that could be done. Allan
On 13.03.2012 04:58, Allan McRae wrote:
1) URL: There were previous patches to the mailing list that never really got finished, but I think we were fairly happy with this syntax:
source=(git://projects.archlinux.org/pacman.git) source=(git@@http://projects.archlinux.org/git/pacman.git)
Does it make any sense to allow the "::" syntax here? i.e. source=(git@@dirname::http://projects.archlinux.org/git/pacman.git)
If you want to allow multiple sources of the same project, please don't clone them into multiple directories. You'd just duplicate all objects they share (which could be many). Better cd into the one you already have and simply add a new remote. I don't like the @@ and :: syntax. Maybe "git://remotename//http://projects.archlinux.org/git/pacman.git"? That way we can simply look for "git://" and also easily figure out if there is a remote name or not. "//" should also be a safe delimiter because in the URL it would be collapsed to "/" anyway.
2) Specifying commit to work with: I think that this is the difficult bit... the syntax with the source array is already convoluted enough, so I do not think they should be added there. So that suggests we go with assigning them to variables like _git_branch, _git_commit, _git_tag... etc.
But what if we have two git sources? For example, say pacman allowed building against an internal copy of libarchive if a folder named libarchive was found in its root directory. So:
source=(git://projects.archlinux.org/pacman.git git://github.com/libarchive/libarchive)
source=("git://foo <refspec>" "git://2ndfoo/bar.git <refspec>")
I'll just assume git allows urlencoded URLs, so even if the git URL contained a space you could write it as "%20". This is simple, clear and doesn't introduce any forbidden characters in the URL. If <refspec> is omitted, we should just clone/add the remote and leave HEAD alone. If it is there, we check it out. This also means that if there are multiple git URLs with the same name, the last refspec wins.
-- Florian Pritz
On 13.03.2012 04:58, Allan McRae wrote:
1) URL: There were previous patches to the mailing list that never really got finished, but I think we were fairly happy with this syntax:
source=(git://projects.archlinux.org/pacman.git) source=(git@@http://projects.archlinux.org/git/pacman.git)
Does it make any sense to allow the "::" syntax here? i.e. source=(git@@dirname::http://projects.archlinux.org/git/pacman.git)
If you want to allow multiple sources of the same project, please don't clone them into multiple directories. You'd just duplicate all objects they share (which could be many).
On Tue, Mar 13, 2012 at 7:41 AM, Florian Pritz <bluewind@xinu.at> wrote:
Better cd into the one you already have and simply add a new remote.

I don't think this was the case anyone was talking about here. More like sources from:
git://code.example.com/project.git
git://code.example.com/project-extras.git
I don't like the @@ and :: syntax. Maybe "git://remotename//http://projects.archlinux.org/git/pacman.git"? That way we can simply look for "git://" and also easily figure out if there is a remote name or not. "//" should also be a safe delimiter because in the URL it would be collapsed to "/" anyway.

Heck no, minus -1000. We are **not** reinventing a URL syntax. This is confusing, misleading, and awful. Look at this:
git://remotename//git://projects.archlinux.org/pacman.git
I don't even know what remotename is (but if that can contain a '/', my eyes will really bleed). The double URL specifier (using the same protocol twice!!) is likely not allowable or recommended by http://tools.ietf.org/html/rfc3986. If you are trying to stick with URI/URL format, I'd much rather see git+http:// in this case, which is a lot like the 'svn+ssh://' protocol you can use, etc. And then use fragment identifiers perhaps (RFC 3986, 3.5, '#branchname') to identify the branch name. -Dan
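As a rough sketch of how a "vcs+transport" scheme with a fragment might be taken apart in bash: the function name parse_vcs_source and the variables vcs, url and fragment are assumptions made for this example, not anything makepkg defines.

```shell
#!/bin/bash
# Hypothetical parser for a "vcs+transport://host/path#fragment" entry.
# Sets the global variables vcs, url and fragment.
parse_vcs_source() {
    local src=$1
    local scheme=${src%%://*}      # text before the first "://"
    fragment=''
    if [[ $src == *'#'* ]]; then
        fragment=${src#*'#'}       # everything after the first "#"
        src=${src%%'#'*}           # everything before it
    fi
    if [[ $scheme == *+* ]]; then
        vcs=${scheme%%+*}          # "git" from "git+http"
        url=${src#*+}              # plain "http://..." URL for the transport
    else
        vcs=$scheme
        url=$src
    fi
}
```

With this, "git+http://projects.archlinux.org/git/pacman.git#maint" would be handled by git over http with "maint" as the fragment, while a plain "git://" URL is passed through untouched.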
On 13/03/12 21:53, Dan McGee wrote:
If you are trying to stick with URI/URL format, I'd much rather see git+http:// in this case, which is a lot like the 'svn+ssh://' protocol you can use, etc. And then use fragment identifiers perhaps (RFC 3986, 3.5, '#branchname') to identify the branch name.
Hmm... good idea using the fragment identifier for specifying the branch/tag/commit to checkout. Being inspired by how pypi uses these to provide checksums, could I even suggest:
<url>#branch=maint
<url>#tag=v4.0.1
<url>#commit=f42ad345
This gives full clarity to what is being specified and allows for potential cases where branches and tags have the same name, and we could provide multiple options, semicolon separated, if needed (obviously not for git, but maybe for another VCS system).
Allan
On 03/13/2012 06:15 AM, Allan McRae wrote:
Hmm... good idea using the fragment identifier for specifying the branch/tag/commit to checkout. Being inspired by how pypi uses these to provide checksums, could I even suggest:
<url>#branch=maint <url>#tag=v4.0.1 <url>#commit=f42ad345
This gives full clarity to what is being specified and allows for potential cases where branches and tags have the same name, and we could provide multiple options, semicolon separated, if needed (obviously not for git, but maybe for another VCS system).
That makes tag/commit unable to move forward automatically unless you can always specify a branch and then optionally a tag/commit. There are ways to get the children of a commit though; makepkg could move to the tip of the branch if there's only one child and complain if there's more.
On 14/03/12 01:49, Matthew Monaco wrote:
That makes tag/commit unable to move forward automatically unless you can always specify a branch and then optionally a tag/commit. There are ways to get the children of a commit though; makepkg could move to the tip of the branch if there's only one child and complain if there's more.
Everything there is optional. If you do not specify any of a branch/tag/commit, then makepkg would build from the head of the master branch. If you specify a branch, makepkg builds from the head of that branch. If you specify a commit/tag, makepkg builds from that commit/tag. Allan
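The rules just described could be sketched like this. The function name ref_from_fragment is invented for the example, and falling back to "master" follows the description above rather than anything git itself mandates.

```shell
#!/bin/bash
# Hypothetical helper: map a "key=value" fragment to the ref to build from.
ref_from_fragment() {
    local fragment=$1
    case $fragment in
        '')             echo "master" ;;                 # nothing given: head of master
        branch=*)       echo "${fragment#branch=}" ;;    # head of the named branch
        tag=*|commit=*) echo "${fragment#*=}" ;;         # exactly that tag or commit
        *)
            echo "unrecognised fragment: $fragment" >&2
            return 1
            ;;
    esac
}
```

For example, "ref_from_fragment tag=v4.0.1" prints "v4.0.1", and the result could be handed to "git checkout" in the working copy.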
On 13.03.2012 13:15, Allan McRae wrote:
<url>#branch=maint <url>#tag=v4.0.1 <url>#commit=f42ad345
This gives full clarity to what is being specified and allows for potential cases where branches and tags have the same name, and we could provide multiple options, semicolon separated, if needed (obviously not for git, but maybe for another VCS system).
Actually, this seems saner than everything I proposed. I ACK it in principle. One question: Why do you distinguish between tag and commit?
On 14/03/12 02:12, Thomas Bächler wrote:
Actually, this seems saner than everything I proposed. I ACK it in principle.
One question: Why do you distinguish between tag and commit?
Clarity only. There is obviously no actual difference in how these would be handled. Allan
Dan McGee wrote:
If you are trying to stick with URI/URL format, I'd much rather see git+http:// in this case, which is a lot like the 'svn+ssh://' protocol you can use, etc. And then use fragment identifiers perhaps (RFC 3986, 3.5, '#branchname') to identify the branch name.
Please do this, and modularize the handlers in such a way that users can create new ones without hacking/patching the makepkg source. For example, create a directory similar to /etc/profile.d with files that get sourced by makepkg. Files in this directory could contain functions with scheme prefixes and common names, e.g. for git+http you might have git_http_download and for svn+ssh you would then have svn_ssh_download.

The *_download function would be responsible for handling cloning and updating as necessary. It could accept the URL minus the fragments as first argument, the target/build directory as second argument, and then the fragments as additional arguments. Obviously that's just an example off the top of my head and you will have better ideas of how to implement it (e.g. maybe leave parsing to the function).

The real purpose is to effectively create a VCS plugin system. Aside from providing users with flexibility, it would also crowdsource the creation of new VCS handlers. They could even be provided as optdepends.

For versions, the *_download function could return one or there could be a special *_version function, but mixed sources may be a problem in that case. You could simply make it a guideline/requirement that the version be the date of the last VCS release or source modification. The version of the package would then be the version of the last release/modification of any source. That should simplify version handling across different VCS. I don't recommend using release tags as versions because there will be no way to compare them meaningfully in many (most?) cases.

Just some thoughts.

Regards,
Xyne
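A minimal sketch of the plugin idea, assuming handlers are plain bash files ending in ".sh" in some directory. The names load_vcs_handlers and vcs_download are invented for this example; only the *_download naming scheme and the argument order come from the message above.

```shell
#!/bin/bash
# Hypothetical loader: source every handler file found in a directory.
load_vcs_handlers() {
    local dir=$1 f
    for f in "$dir"/*.sh; do
        if [[ -r $f ]]; then
            source "$f"
        fi
    done
}

# Hypothetical dispatcher: "git+http://host/repo#branch=maint" is passed to
# git_http_download as <url without fragment> <target dir> <fragments...>.
vcs_download() {
    local src=$1 dest=$2
    local scheme=${src%%://*}
    local handler=${scheme//+/_}_download    # git+http -> git_http_download
    local url=${src%%'#'*}
    local -a frags=()
    if [[ $src == *'#'* ]]; then
        # several fragments may be given, separated by semicolons
        IFS=';' read -ra frags <<< "${src#*'#'}"
    fi
    if ! declare -F "$handler" >/dev/null; then
        echo "no handler for scheme '$scheme'" >&2
        return 1
    fi
    "$handler" "$url" "$dest" "${frags[@]}"
}
```

A handler file would then only need to define, say, git_http_download() and decide for itself whether to clone or update. Schemes containing characters that are not valid in bash function names would need extra mapping, which this sketch ignores.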
On 03/12/2012 09:58 PM, Allan McRae wrote:
blah
I have a patch that I've tested (a little bit) that appears to work... Is there any reason why devel_update() actually needs to do an in-place sed? I replaced it with an eval and it's working fine. The function now looks like:

    if [[ -n $newpkgver ]]; then
        if [[ $newpkgver != "$pkgver" ]]; then
            eval $(sed -e "s/^pkgver=[^ ]*/pkgver=$newpkgver/" \
                       -e "s/^pkgrel=[^ ]*/pkgrel=1/" \
                       "$BUILDFILE")
        fi
    fi
From the git history it looks like it's been in-place since the start, so I can't find any explanation if it exists.
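For illustration, here is a standalone variant of that idea which leaves the file on disk untouched. The wrapper name apply_pkgver is made up for this sketch, and it differs from the snippet above in one respect: the command substitution is quoted, so the newlines of the PKGBUILD survive into eval.

```shell
#!/bin/bash
# Hypothetical wrapper: re-evaluate a PKGBUILD with pkgver/pkgrel replaced
# in memory only; the file itself is never written to.
apply_pkgver() {
    local buildfile=$1 newpkgver=$2
    if [[ -n $newpkgver && $newpkgver != "$pkgver" ]]; then
        # Quoting "$(...)" keeps the PKGBUILD's line breaks intact for eval.
        eval "$(sed -e "s/^pkgver=[^ ]*/pkgver=$newpkgver/" \
                    -e "s/^pkgrel=[^ ]*/pkgrel=1/" \
                    "$buildfile")"
    fi
}
```

Because the substitution happens before evaluation, anything in the PKGBUILD that expands $pkgver while being sourced picks up the new value. A new version containing "/" or "&" would confuse the sed expression, so this assumes simple date-like versions.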
On 17.03.2012 20:05, Matthew Monaco wrote:
Is there any reason why devel_update() actually needs to do an in-place sed? I replaced it with an eval and it's working fine.
The function now looks like:

    if [[ -n $newpkgver ]]; then
        if [[ $newpkgver != "$pkgver" ]]; then
            eval $(sed -e "s/^pkgver=[^ ]*/pkgver=$newpkgver/" \
                       -e "s/^pkgrel=[^ ]*/pkgrel=1/" \
                       "$BUILDFILE")
        fi
    fi
As discussed somewhere else in this thread, the devel build shouldn't modify the PKGBUILD at all.
On 03/17/2012 01:26 PM, Thomas Bächler wrote:
As discussed somewhere else in this thread, the devel build shouldn't modify the PKGBUILD at all.
I know, but it does. It's a gripe of mine. What I'm saying is this appears to work *now* as a drop in change. In the meantime I'm playing with Xyne's great idea of plugins.
On 18/03/12 05:47, Matthew Monaco wrote:
In the meantime I'm playing with Xyne's great idea of plugins.
Just so you know... splitting out parts of makepkg has been discussed quite a bit already. That includes solutions for how to enable running makepkg both from within the source tree and when installed.
Allan
participants (6)
- Allan McRae
- Dan McGee
- Florian Pritz
- Matthew Monaco
- Thomas Bächler
- Xyne