On Sat, Aug 25, 2012 at 01:36:42PM -0400, Luke Shumaker wrote:
It would support * cvs * git * svn * bzr * hg
But cvs support is yet to be written, and bzr support is half-way written.
Git is the only mode that I'm thoroughly confident in, the others need testing.
Signed-off-by: Luke Shumaker <LukeShu@sbcglobal.net> ---
Code review, as promised. There's a general lack of error checking on important commands. Things that mutate the filesystem or change directory should _always_ be checked for error. I haven't pointed it all out, but you'll get the idea. I still don't like the idea of creating a tarball out of everything, just to unpack it after its completed. From a distribution packager perspective, building from version control shouldn't be a regular thing. Yes, I do understand your decision based on Parabola, but I don't have to agree with it =P.
scripts/.gitignore | 1 + scripts/Makefile.am | 4 +- scripts/vcsget.sh.in | 294 +++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 298 insertions(+), 1 deletion(-) create mode 100644 scripts/vcsget.sh.in
diff --git a/scripts/.gitignore b/scripts/.gitignore index 9e403bf..e59551e 100644 --- a/scripts/.gitignore +++ b/scripts/.gitignore @@ -6,3 +6,4 @@ pkgdelta repo-add repo-elephant
__ '. \ '- \ / /_ .---. / | \\,.\/--.// ) | \// )/ / \ ' ^ ^ / )____.----.. 6 '.____. .___/ \._) .\/. ) '\ / _/ \/ ). ) ( /# .! | /\ / \ C// # /'-----''/ # / . 'C/ | | | | |mrf , \), .. .'OOO-'. ..'OOO'OOO-'. ..\(,
repo-remove +vcsget diff --git a/scripts/Makefile.am b/scripts/Makefile.am index 29c81aa..8e6d037 100644 --- a/scripts/Makefile.am +++ b/scripts/Makefile.am @@ -14,7 +14,8 @@ OURSCRIPTS = \ pacman-key \ pacman-optimize \ pkgdelta \ - repo-add + repo-add \ + vcsget
EXTRA_DIST = \ makepkg.sh.in \ @@ -23,6 +24,7 @@ EXTRA_DIST = \ pacman-optimize.sh.in \ pkgdelta.sh.in \ repo-add.sh.in \ + vcsget.sh.in \ $(LIBRARY)
LIBRARY = \ diff --git a/scripts/vcsget.sh.in b/scripts/vcsget.sh.in new file mode 100644 index 0000000..f8f6e4b --- /dev/null +++ b/scripts/vcsget.sh.in @@ -0,0 +1,294 @@ +#!/bin/bash +# +# vcsget - downloader agent that generates tarballs from VCS repos +# @configure_input@ +# +# Copyright (c) 2012 Luke Shumaker <lukeshu@sbcglobal.net> +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. +# + +# gettext initialization +export TEXTDOMAIN='pacman-scripts' +export TEXTDOMAINDIR='@localedir@' + +unset CDPATH + +m4_include(library/output_format.sh) + +## +# usage : in_array( $needle, $haystack ) +# return : 0 - found +# 1 - not found +## +in_array() { + local needle=$1; shift + local item + for item in "$@"; do + [[ $item = "$needle" ]] && return 0 # Found + done + return 1 # Not Found +} + +usage() { + echo "Usage: $0 <URL> <OUTPUT_TARBALL>"
You might consider stuffing $0 into a var sooner as "${0##*/}", similar to program_invocation_short_name in C. Invoking your script as ../some/other/dir/vcsget won't horribly throw off your line wrapping and it looks prettier in the trimmed version, damnit.
+ echo "" + echo "Talks with multiple version control systems (VCSs) to create a" + echo "tarball of a specific commit." + echo "" + echo "<OUTPUT_TARBALL> is a plain, uncompressed tarball. Given the same" + echo "commit, the tarball will always have the same checksum." + echo "" + echo "<URL> is the repository, and possibly commit to download and tar." + echo "The following schemes are recognized:" + echo " * cvs://" + echo " * git://" + echo " * git+PROTO://" + echo " * svn+PROTO://" + echo " * bzr://" + echo " * hg+PROTO://" + echo "" + echo "URLs to be consumed by $0 are not always in the format of the" + echo "relevent VCS program, but normalized to proper URLs. Simply, this" + echo "means:" + echo "" + echo " scheme://[authinfo@]host[:port]/path[#fragment]" + echo "" + echo "Where <authinfo> is in the format \"username[:password]\" and" + echo "<fragment> is the commit ID, branch, or tag to be returned." +} + +## +# Get the column "$2" from the space-delimited string "$1". +# This has the advantage over `set -- $1` and `arr=($1)` that it separates on +# a single space, instead of '\s+', which allows it to have empty columns. +## +get_field() { + local str=$1 + local i=$2 + echo "$str"|cut -d' ' -f $i +} + +## +# Parse a URL into parts, according to RFC 3986, "URI Generic Syntax". +# Sets the variables `scheme`, `authority`, `path`, `query` and `fragment`. +## +parse_url() { + local url=$1 + + # This regex is copied from RFC 3986 + local regex='^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?' + local parts=$(echo "$url"|sed -r "s@$regex@\2 \4 \5 \7 \9@") + + scheme=$( get_field "$parts" 1) + authority=$(get_field "$parts" 2) + path=$( get_field "$parts" 3) + query=$( get_field "$parts" 4) + fragment=$( get_field "$parts" 5) +}
As mentioned, this should all be replaced with bash regex to avoid the constant forking to sed.
+ +## +# Parse an authority from a URL. +# Sets the variables `user`, `pass`, `sock`, `host` and `port`. +# `sock` is socket, which is defined as `hostport` in the RFC. +## +parse_authority() { + local authority=$1 + + local regex='^(([^:]*)(:([^@]*))?@)?(([^:]+)(:([0-9]*))?)' + # 12 3 4 56 7 8 + # 1 = userinfo@ + # 2 = user + # 3 = :pass + # 4 = pass + # 5 = hostport + # 6 = host + # 7 = :port + # 8 = port + local parts=$(echo "$authority"|sed -r "s%$regex%\2 \4 \5 \6 \8%") + user=$(get_field "$parts" 1) + pass=$(get_field "$parts" 2) + sock=$(get_field "$parts" 3) + host=$(get_field "$parts" 4) + port=$(get_field "$parts" 5) +}
More regex. In general, stuffing code into a sed expression is frowned upon (for fear of separator clashes). In this case, you know the value of $regex ahead of time, so it's okay.
+ +download_cvs() { + # TODO + error "$(gettext 'CVS not implemented')" + exit 1 +} + +download_git() { + msg "$(gettext 'Getting source from git')" + : ${fragment:=master} # default to 'master' if not set + send_url="${scheme#git+}://${authority}${path}" + + # Where to checkout the code to + dir="$basedir/git/$send_url"
always localize your variables when possible.
+ if [[ "${dir##*.}" != git ]]; then
Random proper bash test with quoting on the wrong side ;)
+ dir="${dir}/.git" + fi + + # Download + if [ ! -d "$dir" ]; then + msg2 "$(gettext 'doing initial clone')" + mkdir -p "$dir" + git clone --bare "$send_url" "$dir" + fi + + cd "$dir" + refs=($(git show-ref|sed 's@.*/@@'|sort -u))
This is the wrong way to read a command into an array. It'll break on legit whitespace in a ref, and its subject to glob expansion. Addtionally, it breaks refs with slashes in them. Right now, using this with Allan's working-split/binary-path branch will break. I'd use awk and read for this, if you'd like to keep this avenue... IFS=$'\n' read -rd '' -a refs < \ <(git show-ref | awk '{ sub(/refs\/[^/]+\/?/, "", $2); print $2 }') I'll point out that Allan's implementation has the advantage here of forcing that the user provide the _type_ of ref as the URI fragment. In this case, we know what kind of ref that translates to and git-show-ref can do the verification for us via the --verify -q flags.
+ if in_array "$fragment" "${refs[@]}"; then + # The ref is symbolic, so it may have changed upstream + msg2 "$(gettext 'fetching updates')" + git fetch --all
that --all flag seems like you're doing more work than you need to. A repo like linux.git is an extreme case, but could add a fair bit of weight to the call. No error checking?
+ fi + + # Generate tarball + archive=(git archive --format=tar --prefix="$base/" + "$fragment" -o "$tarball")
localize this.
+ msg2 "$(gettext 'trying to generate tarball')" + if "${archive[@]}" 2>/dev/null; then + # success + msg2 "$(gettext 'success')" + exit 0
exit 0 here is redundant, unless you expect that msg2 will fail.
+ else + # the relevent commit is probably newer than we have + msg2 "$(gettext 'failure, forcing an update')" + git fetch --all + msg2 "$(gettext 'generating tarball')" + "${archive[@]}"
Don't hide the command being executed. You've picked an appropriately descriptive name, but I think it'd better to file only the args into an array. It's easier to tell what's being executed at a glance, and it looks more "natural" in terms of command execution, e.g. bsdtar "${archive_args[@]}" In addition, this is the most important part of the function, but you're not doing any error checking.
+ local s=$? + exit $?
I can only assume there's a typo somewhere in here. You're throwing away your return value from the archive command and exiting (invariably) 0.
+ fi +} + +download_svn() { + msg "$(gettext 'Getting source from Subversion')" + parse_authority "$authority" + send_url="${scheme#svn+}://${sock}${path}" + args=('--config-dir' "$basedir/svn/.config")
localize this.
+ if [ -n "$user" ]; then + args+=('--username' "$user")
I just think it's weird seeing a combination of POSIX (single brace) and non-POSIX (arrays) syntax right next to each other =P.
+ fi + if [ -n "$pass" ]; then + args+=('--password' "$pass") + fi + if [ -n "$fragment" ]; then + args+=('-r' "$fragment") + fi + + # Where to checkout the code to + dir="$basedir/svn/$send_url#$fragment"
And localize this... There's more below, I won't point them out.
+ + # Download + if [ ! -d "$dir" ]; then + mkdir -p "$dir" + svn co "${args[@]}" "$send_url" "$dir" + else + cd "$dir" + svn up "${args[@]}" + fi + + # Generate tarball + mkdir "$dir.tmp$$"
If you really need to make a temporary directory, I suggest using mktemp, which can be made to honor the user's TMPDIR var. They'll thank you. but... see below.
+ cp -r "$dir" "$dir.tmp$$/$base" + find "$dir.tmp$$/$base" -name .svn -exec rm -rf '{}' + + cd "$dir.tmp$$" + bsdtar cf "$tarball" "$base"
bsdtar's --exclude flag makes this all unnecessary.
+ rm -rf "$dir.tmp$$" +} + +download_bzr() { + # TODO finish bzr support + error "$(gettext 'Bazaar not implemented')" + exit 1 + + send_url="${url%%#*}" + if [ "$scheme" != 'bzr+ssh' ]; then + send_url="${send_url#bzr+}" + fi + + dir="$basedir/bzr/$send_url" + + # Download + if [ ! -d "$dir" ]; then + msg2 "$(gettext 'doing initial checkout')" + mkdir -p "$dir" + bzr checkout "$send_url" "$dir" + fi + + cd "$dir" + bzr update + + # Generate tarball +} + +download_hg() { + msg "$(gettext 'Getting source from Mercurial')" + send_url="${url#hg+}" + hg clone -U "$send_url" + dir="$basedir/hg/${send_url%%#*}" + + # Download + if [ ! -d "$dir" ]; then + mkdir -p "$dir" + hg clone "$send_url" "$dir" + else + cd "$dir" + hg update -r "$fragment" + fi + + # Generate tarball + mkdir "$dir.tmp$$" + cp -r "$dir" "$dir.tmp$$/$base" + rm -rf "$dir.tmp$$/$base/.hg" + cd "$dir.tmp$$" + bsdtar cf "$tarball" "$base" + rm -rf "$dir.tmp$$"
Needs more error checking... and again with the --exclude.
+} + +main() { + url=$1 + tarball=$(readlink -m "$2")
As mentioned, readlink with flags is bad.
+ + base="$(echo "$tarball"|sed -r 's@.*/(.*)\.tar(\..*)?$@\1@')"
Augh... Who can read that crap? The PE version is so much nicer.
+ basedir="${tarball%/*}/vcs-cache" + + if ( in_array '-h' "$@" || in_array '--help' "$@" ); then
The subshell isn't at all useful here except for increasing your PID count. Just get rid of it.
+ usage + exit 0 + fi + if [ "$#" -ne 2 ]; then + usage + exit 1 + fi + + msg "$(printf "$(gettext "Saving '%s' to '%s'")" "$url" "$tarball")"
I don't think you need that printf there since msg() expects a format string.
+ + parse_url "$url" + case "$scheme" in + cvs) download_cvs;; + git+*|git) download_git;; + svn+*) download_svn;; + bzr) download_bzr;; + hg+*) download_hg;; + *) + error "$(gettext 'Unable to match a VCS with scheme'): $scheme" + exit 1;; + esac +} + +main "$@" >&2
This dumps _everything_ in the script to stderr, putting you on par with gnupg. Don't be like that.
-- 1.7.12