[pacman-dev] [PATCH 0/2] makepkg: Allow placing local sources in subdirectories
Currently, makepkg doesn't support (actually, mishandles) sources in a subdirectory of the package directory. Though there aren't many uses for such a feature, it is useful for an otherwise source-less package of Arch- or site-specific scripts, to be installed on one or a few systems, for which it doesn't make sense to have a separate source repository. In this case, it would be nice to either move the source files in a subdirectory, or allow pulling them from somewhere else on the filesystem. Whether or not this is considered a valid usecase, the current behavior is broken (makepkg computes broken paths and attempts to operate on files that aren't there), so this probably ought to be fixed regardless. I couldn't find a test suite for makepkg (only pacman and makepkg-template), so, here is the test case: mkdir dir touch a dir/b /tmp/c cat > PKGBUILD <<'EOF' pkgname=dir-test pkgver=1 pkgrel=1 arch=('any') source=(a dir/b /tmp/c) md5sums=(SKIP SKIP SKIP) package() { cp "$srcdir"/* "$pkgdir/" ; } EOF One thing to note is that this doesn't protect against filename collisions: specifying dir1/file and dir2/file as sources will just make makepkg clobber the first file. Still, that's not new: source=(http://xxx/a http://yyy/a) are going to have the same problem. Vladimir Panteleev (2): libmakepkg: Fix handling of file paths in get_filepath libmakepkg: Use the correct path in extract_sources scripts/libmakepkg/source.sh.in | 2 +- scripts/libmakepkg/util/source.sh.in | 22 ++++++++++++++++++++-- 2 files changed, 21 insertions(+), 3 deletions(-) -- 2.17.0
With local sources containing a path component, get_filepath used to silently discard the non-filename part of the source URI, which led to uninformative errors (as well as not being able to place source files in a subdirectory). Correct this bug by explicitly adding a case for the local protocol, and appropriately resolve file paths (local or global) to an absolute one usable by extract_file. --- scripts/libmakepkg/util/source.sh.in | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/scripts/libmakepkg/util/source.sh.in b/scripts/libmakepkg/util/source.sh.in index 17e44664..5092347d 100644 --- a/scripts/libmakepkg/util/source.sh.in +++ b/scripts/libmakepkg/util/source.sh.in @@ -85,8 +85,9 @@ get_filename() { # Return the absolute filename of a source entry get_filepath() { - local file="$(get_filename "$1")" - local proto="$(get_protocol "$1")" + local netfile="$1" + local file="$(get_filename "$netfile")" + local proto="$(get_protocol "$netfile")" case $proto in bzr*|git*|hg*|svn*) @@ -98,6 +99,23 @@ get_filepath() { return 1 fi ;; + local) + if [[ "$netfile" == /* ]]; then + # absolute path + if [[ -f "$netfile" ]]; then + file="$netfile" + else + return 1 + fi + elif [[ -f "$startdir/$netfile" ]]; then + # relative path + file="$startdir/$netfile" + elif [[ -f "$SRCDEST/$netfile" ]]; then + file="$SRCDEST/$netfile" + else + return 1 + fi + ;; *) if [[ -f "$startdir/$file" ]]; then file="$startdir/$file" -- 2.17.0
When extracting sources, we need to get the full path to the file being extracted (so that we can operate on it), instead of just the filename component. Previously, the filename coincidentally resolved to the same file as the one we should be working on (due to the working directory being appropriately set), but this assumption no longer holds now that a source may reside in a different directory. Fix this fragile assumption by correctly using the full path instead. --- scripts/libmakepkg/source.sh.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/libmakepkg/source.sh.in b/scripts/libmakepkg/source.sh.in index 02bb16f5..0bdd9602 100644 --- a/scripts/libmakepkg/source.sh.in +++ b/scripts/libmakepkg/source.sh.in @@ -90,7 +90,7 @@ extract_sources() { get_all_sources_for_arch 'all_sources' for netfile in "${all_sources[@]}"; do - local file=$(get_filename "$netfile") + local file=$(get_filepath "$netfile") local proto=$(get_protocol "$netfile") case "$proto" in bzr*) -- 2.17.0
On 04/28/2018 04:31 AM, Vladimir Panteleev wrote:
Currently, makepkg doesn't support (actually, mishandles) sources in a subdirectory of the package directory. Though there aren't many uses for such a feature, it is useful for an otherwise source-less package of Arch- or site-specific scripts, to be installed on one or a few systems, for which it doesn't make sense to have a separate source repository. In this case, it would be nice to either move the source files in a subdirectory, or allow pulling them from somewhere else on the filesystem.
Whether or not this is considered a valid usecase, the current behavior is broken (makepkg computes broken paths and attempts to operate on files that aren't there), so this probably ought to be fixed regardless. Also https://bugs.archlinux.org/task/39718
I'm not sure we have a specified behavior, but I'm personally of the opinion this should be prohibited. IMHO there is no reason to hide files in subdirectories when the entire directory with the PKGBUILD is meant to be the package sources etc. If we did support it officially, we should make sure --source supports it too, and possibly decide whether we support this in the AUR as well. An argument could be made BTW, that these source paths are in fact malformed and should possibly be file://$(pwd)/ urls which are downloaded via curl according to the intent of the PKGBUILD author (these would then be downloaded from a local URI). The logical extension of that line of thought, is we should add a lint_pkgbuild check with a more informative error when no protocol is specified, but a non-filename is used. Yes, file:/// sources work fine. If it isn't specifically local:// then download_file is used, which does DLAGENTS lookup. We don't provide a "file" DLAGENTS protocol, which means we usually just silently exit that function without actually downloading anything... but nothing is stopping people from adding one in the PKGBUILD itself. curl supports downloading from file:///absolute/path/to/file ... I *really* dislike the inconsistency of permitting subdirs for local files (implying we think this is a reasonable use case) but not permitting them for things we resolve as something to download.
I couldn't find a test suite for makepkg (only pacman and makepkg-template), so, here is the test case:
mkdir dir touch a dir/b /tmp/c cat > PKGBUILD <<'EOF' pkgname=dir-test pkgver=1 pkgrel=1 arch=('any') source=(a dir/b /tmp/c) md5sums=(SKIP SKIP SKIP) package() { cp "$srcdir"/* "$pkgdir/" ; } EOF
How would this handle the second case being, instead, local://dir/b ? The proto=local semantic is used both for files that don't have a proto, and files which use the local:// proto. The local:// proto is *actually* used, believe it or not. It's how we specify that the user is expected to provide the file themselves, for e.g. AUR packages which require proprietary sources (like games hidden behind paywalls). The AUR checks the proto, to make sure that no-proto files which *are* expected to be provided together with the PKGBUILD are not missing. ... Why don't people use, for the third case, DLAGENTS+=('file::/usr/bin/curl -C - -o %o %u'); source=('file:///tmp/c') ? And what sources do we expect to have a guaranteed filesystem location, but are not supplied by a package? I consider cp during build/package to be perfectly adequate for files provided by some other package, and in fact I use this here: https://git.archlinux.org/svntogit/community.git/tree/trunk/PKGBUILD?h=packages/broadcom-wl&id=fba4c424ae9bc381f0babc61fca4643e1569de1a#n31 -- Eli Schwartz Bug Wrangler and Trusted User
On 29/04/18 13:12, Eli Schwartz wrote:
On 04/28/2018 04:31 AM, Vladimir Panteleev wrote:
Currently, makepkg doesn't support (actually, mishandles) sources in a subdirectory of the package directory. Though there aren't many uses for such a feature, it is useful for an otherwise source-less package of Arch- or site-specific scripts, to be installed on one or a few systems, for which it doesn't make sense to have a separate source repository. In this case, it would be nice to either move the source files in a subdirectory, or allow pulling them from somewhere else on the filesystem.
Whether or not this is considered a valid usecase, the current behavior is broken (makepkg computes broken paths and attempts to operate on files that aren't there), so this probably ought to be fixed regardless. Also https://bugs.archlinux.org/task/39718
I'm not sure we have a specified behavior, but I'm personally of the opinion this should be prohibited. IMHO there is no reason to hide files in subdirectories when the entire directory with the PKGBUILD is meant to be the package sources etc.
If we did support it officially, we should make sure --source supports it too, and possibly decide whether we support this in the AUR as well.
An argument could be made BTW, that these source paths are in fact malformed and should possibly be file://$(pwd)/ urls which are downloaded via curl according to the intent of the PKGBUILD author (these would then be downloaded from a local URI). The logical extension of that line of thought, is we should add a lint_pkgbuild check with a more informative error when no protocol is specified, but a non-filename is used.
Yes, file:/// sources work fine. If it isn't specifically local:// then download_file is used, which does DLAGENTS lookup. We don't provide a "file" DLAGENTS protocol, which means we usually just silently exit that function without actually downloading anything... but nothing is stopping people from adding one in the PKGBUILD itself. curl supports downloading from file:///absolute/path/to/file
...
I *really* dislike the inconsistency of permitting subdirs for local files (implying we think this is a reasonable use case) but not permitting them for things we resolve as something to download.
I couldn't find a test suite for makepkg (only pacman and makepkg-template), so, here is the test case:
mkdir dir touch a dir/b /tmp/c cat > PKGBUILD <<'EOF' pkgname=dir-test pkgver=1 pkgrel=1 arch=('any') source=(a dir/b /tmp/c) md5sums=(SKIP SKIP SKIP) package() { cp "$srcdir"/* "$pkgdir/" ; } EOF
How would this handle the second case being, instead, local://dir/b ?
The proto=local semantic is used both for files that don't have a proto, and files which use the local:// proto. The local:// proto is *actually* used, believe it or not. It's how we specify that the user is expected to provide the file themselves, for e.g. AUR packages which require proprietary sources (like games hidden behind paywalls). The AUR checks the proto, to make sure that no-proto files which *are* expected to be provided together with the PKGBUILD are not missing.
...
Why don't people use, for the third case, DLAGENTS+=('file::/usr/bin/curl -C - -o %o %u'); source=('file:///tmp/c') ?
And what sources do we expect to have a guaranteed filesystem location, but are not supplied by a package? I consider cp during build/package to be perfectly adequate for files provided by some other package, and in fact I use this here:
Firstly, we do not consider the AUR when making development decisions. Also, I am thinking you are mixing up --source and --allsource. I am happy for a file:// protocol to be added. We already use it for dealing with local packages/repos in pacman, so it should be supported for pacman. In fact, I thought this had already been done, so was surprised when it did not work! As far as the justification for source=("foo/bar") type sources, I think reasonable arguments can be made for their inclusion. For VCS sources, I think a git submodule is a valid example. Also, consider gcc - if you download a source for (e.g.) gmp into a particular directory in its sources, it will compile it and build against that. And the simple example provided by OP of config files organised in different subdirectories relative to the PKGBUILD also seems reasonable. So, sources downloaded to subdirectories is a feature I would support implementing. However, I will not consider partial solutions - or more accurately, partial fixes that look like that will need partially reverted to implement the complete solution. On the top of my head, the complete solution covers: 1) files in subdirectories of the builddir. 2) standard sources 3) VCS sources 4) Sources with foo/bar::http:// type syntax 5) --allsource Allan
On 04/29/2018 06:17 AM, Allan McRae wrote:
Firstly, we do not consider the AUR when making development decisions.
Well, yeah, I'll have to raise that separately on aur-dev based on the outcome of this discussion. (Also it is relevant at least in the sense that the nature of the AUR ensures the local:// protocol gets used in real-world cases.)
Also, I am thinking you are mixing up --source and --allsource.
No, when neither file:// nor local:// are used --source should be adding it as well, which means this applies to that too AFAICT.
I am happy for a file:// protocol to be added. We already use it for dealing with local packages/repos in pacman, so it should be supported for pacman. In fact, I thought this had already been done, so was surprised when it did not work!
Makes sense... I do wonder how many AUR packages incorrectly use file:// (no default DLAGENT handler) instead of local:// (explicitly disables DLAGENT handling) and will therefore break as a result. We'll get more technical correctness out of this, which is neat!
As far as the justification for source=("foo/bar") type sources, I think reasonable arguments can be made for their inclusion. For VCS sources, I think a git submodule is a valid example. Also, consider gcc - if you download a source for (e.g.) gmp into a particular directory in its sources, it will compile it and build against that. And the simple example provided by OP of config files organised in different subdirectories relative to the PKGBUILD also seems reasonable. So, sources downloaded to subdirectories is a feature I would support implementing.
Reasonable enough. I'd forgotten about VCS submodules, this could actually simplify handling a lot, there... But I'm not sure how we'd handle that properly. We use bare clones, so there are potential file/directory name conflicts that would not normally manifest in the git worktree (which relies on the developer layout that explicitly handles that). It's also quite ugly IMO to do bare git clones recursively into each other. BTW we've been asked for this before: https://bugs.archlinux.org/task/39718
However, I will not consider partial solutions - or more accurately, partial fixes that look like that will need partially reverted to implement the complete solution. On the top of my head, the complete solution covers:
1) files in subdirectories of the builddir. 2) standard sources 3) VCS sources 4) Sources with foo/bar::http:// type syntax 5) --allsource
I think a solution would probably look like: 1) add file:// DLAGENTS. As you pointed out, it's useful, a common URI scheme, and should be there regardless. 2) add support for proto:// downloads to subdirectories, requires pre-creating directory layout I think, plus passing the correct paths in download_foo to the DLAGENT. All DLAGENTS should natively support writing to any relative specified path, we just don't use that. 3) ensuring extract_foo handles the requested $srcdir layout. This is where FS#39718 bites us. 4) ensure source packages correctly generate. This should be sufficient to modify: ln -s "$absfile" "$srclinks/$pkgbase" to create the right destination, and mkdir -p the ${destination/*} first. -- Eli Schwartz Bug Wrangler and Trusted User
On 04/29/2018 09:26 AM, Eli Schwartz wrote: This also helps solve https://bugs.archlinux.org/task/40118 -- Eli Schwartz Bug Wrangler and Trusted User
This is a common URI scheme (in general if not in makepg) and we should provide a handler for it. We already allow its use for locally sourced git repositories, so it makes sense to not leave files out. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- etc/makepkg.conf.in | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/etc/makepkg.conf.in b/etc/makepkg.conf.in index fe3858ad..4d6ab78d 100644 --- a/etc/makepkg.conf.in +++ b/etc/makepkg.conf.in @@ -8,7 +8,8 @@ # #-- The download utilities that makepkg should use to acquire sources # Format: 'protocol::agent' -DLAGENTS=('ftp::/usr/bin/curl -gqfC - --ftp-pasv --retry 3 --retry-delay 3 -o %o %u' +DLAGENTS=('file::/usr/bin/curl -gqC - -o %o %u' + 'ftp::/usr/bin/curl -gqfC - --ftp-pasv --retry 3 --retry-delay 3 -o %o %u' 'http::/usr/bin/curl -gqb "" -fLC - --retry 3 --retry-delay 3 -o %o %u' 'https::/usr/bin/curl -gqb "" -fLC - --retry 3 --retry-delay 3 -o %o %u' 'rsync::/usr/bin/rsync --no-motd -z %u %o' -- 2.17.0
On 03/05/18 11:32, Eli Schwartz wrote:
This is a common URI scheme (in general if not in makepg) and we should provide a handler for it. We already allow its use for locally sourced git repositories, so it makes sense to not leave files out.
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- etc/makepkg.conf.in | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/etc/makepkg.conf.in b/etc/makepkg.conf.in index fe3858ad..4d6ab78d 100644 --- a/etc/makepkg.conf.in +++ b/etc/makepkg.conf.in @@ -8,7 +8,8 @@ # #-- The download utilities that makepkg should use to acquire sources # Format: 'protocol::agent' -DLAGENTS=('ftp::/usr/bin/curl -gqfC - --ftp-pasv --retry 3 --retry-delay 3 -o %o %u' +DLAGENTS=('file::/usr/bin/curl -gqC - -o %o %u'
why not use "cp"?
+ 'ftp::/usr/bin/curl -gqfC - --ftp-pasv --retry 3 --retry-delay 3 -o %o %u' 'http::/usr/bin/curl -gqb "" -fLC - --retry 3 --retry-delay 3 -o %o %u' 'https::/usr/bin/curl -gqb "" -fLC - --retry 3 --retry-delay 3 -o %o %u' 'rsync::/usr/bin/rsync --no-motd -z %u %o'
On 05/12/2018 07:17 AM, Allan McRae wrote:
On 03/05/18 11:32, Eli Schwartz wrote:
This is a common URI scheme (in general if not in makepg) and we should provide a handler for it. We already allow its use for locally sourced git repositories, so it makes sense to not leave files out.
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- etc/makepkg.conf.in | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/etc/makepkg.conf.in b/etc/makepkg.conf.in index fe3858ad..4d6ab78d 100644 --- a/etc/makepkg.conf.in +++ b/etc/makepkg.conf.in @@ -8,7 +8,8 @@ # #-- The download utilities that makepkg should use to acquire sources # Format: 'protocol::agent' -DLAGENTS=('ftp::/usr/bin/curl -gqfC - --ftp-pasv --retry 3 --retry-delay 3 -o %o %u' +DLAGENTS=('file::/usr/bin/curl -gqC - -o %o %u'
why not use "cp"?
1) no need to strip the file:/// bit by special-casing file:// like we do scp. Why make file:// filesystem-aware in ways that nothing else is anyway? 2) cp preserves attributes like the executable bit, which http urls specifically don't; cp is therefore inconsistent. 3) cp is silent, which would make it the only DLAGENTS which is by default silent. 4) it's thematically consistent with using the same thing lots of times. I guess we could use /usr/bin/cp --no-preserve=mode -v But this still leaves special-casing file:// which IMHO introduces slightly magic behavior and I don't like using something other than the actual contents of the source=() except as explicitly documented in the PKGBUILD(5) manpage for :: and + -- Eli Schwartz Bug Wrangler and Trusted User
On 13/05/18 12:53, Eli Schwartz wrote:
On 05/12/2018 07:17 AM, Allan McRae wrote:
On 03/05/18 11:32, Eli Schwartz wrote:
This is a common URI scheme (in general if not in makepg) and we should provide a handler for it. We already allow its use for locally sourced git repositories, so it makes sense to not leave files out.
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- etc/makepkg.conf.in | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/etc/makepkg.conf.in b/etc/makepkg.conf.in index fe3858ad..4d6ab78d 100644 --- a/etc/makepkg.conf.in +++ b/etc/makepkg.conf.in @@ -8,7 +8,8 @@ # #-- The download utilities that makepkg should use to acquire sources # Format: 'protocol::agent' -DLAGENTS=('ftp::/usr/bin/curl -gqfC - --ftp-pasv --retry 3 --retry-delay 3 -o %o %u' +DLAGENTS=('file::/usr/bin/curl -gqC - -o %o %u'
why not use "cp"?
1) no need to strip the file:/// bit by special-casing file:// like we do scp. Why make file:// filesystem-aware in ways that nothing else is anyway?
2) cp preserves attributes like the executable bit, which http urls specifically don't; cp is therefore inconsistent.
3) cp is silent, which would make it the only DLAGENTS which is by default silent.
4) it's thematically consistent with using the same thing lots of times.
I guess we could use /usr/bin/cp --no-preserve=mode -v
But this still leaves special-casing file:// which IMHO introduces slightly magic behavior and I don't like using something other than the actual contents of the source=() except as explicitly documented in the PKGBUILD(5) manpage for :: and +
Good enough explanation for me. A
participants (3)
-
Allan McRae
-
Eli Schwartz
-
Vladimir Panteleev