[arch-dev-public] noarch packages vs FTP & DB

Roman Kyrylych roman.kyrylych at gmail.com
Thu Nov 29 05:54:27 EST 2007


2007/11/28, Aaron Griffin <aaronmgriffin at gmail.com>:
> On Nov 27, 2007 4:56 PM, Jason Chu <jason at archlinux.org> wrote:
> > > > * Architecture Independent Repos
> > >
> > > More correct: Architecture Independent Packages
> > > They were intended to be placed in the same repos as i686/x86_64
> > > packages, and in the same db.tar.gz
> > > They can be placed in $repo/os/any/ dir on FTP servers though (to save
> > > the space and traffic for mirrors)
> > > with $repo/os/{i686,x86_64}/$pkgname-$pkgver-$pkgrel-any.pkg.tar.gz
> > > being a symlink to ../any/$pkgname-$pkgver-$pkgrel-any.pkg.tar.gz
> > > (this is because packages should be in the same dir as .db.tar.gz for
> > > pacman to download them).
> >
> > Are you sure about that?  I thought they were going to be separate repos
> > that would be usable in x86_64 and i686.  That way we'd save bandwidth
> > (upload and download) and storage space *and* save time building packages.
>
> I could go either way here - I don't have a strong opinion, but jason
> has a point about size (bandwidth, disk space, etc).
>

They cannot be separate _repos_, because the only clean way to do that
would be to create [core-any], [extra-any] etc, which is bad.

They can/should/will be in a _separate_FTP_dir_ though (which is not the
same as a separate repo)
- in $ftpurl/$repo/os/any, but here are some issues (one of which I
didn't notice when I first proposed noarch packages):

How pacman downloads packages from a repo:
for example, on an i686 system pacman.conf contains:
    [somerepo]
    Server=ftp://someurl/somerepo/os/i686/
Pacman searches for somerepo.db.tar.gz there and for every package
listed in that db it tries to download a package _from_the_same_dir_.
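
A rough sketch of that lookup, for illustration only. This models the
behaviour described above and is not pacman's actual code; package_url()
and the package name foo-1.0-1 are made up for the example.

```python
# Sketch only: both the db and the packages are fetched relative to Server=.
def package_url(server: str, filename: str) -> str:
    """Join a Server= URL and a file name listed in (or naming) the db."""
    return server.rstrip("/") + "/" + filename


server = "ftp://someurl/somerepo/os/i686/"  # the Server= line from pacman.conf

# The db is looked up in the Server= dir ...
db_url = package_url(server, "somerepo.db.tar.gz")
# ... and so is every package listed in it, whatever its arch suffix is.
pkg_url = package_url(server, "foo-1.0-1-any.pkg.tar.gz")  # hypothetical package

print(db_url)
print(pkg_url)
```

So an -any package listed in the i686 db must be reachable under
os/i686/, either as a real file or as a symlink.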


Solution #1:

1) noarch packages are mentioned in both i686 & x86_64 .db.tar.gz

2) just create a symlink from
$repo/os/{i686,x86_64}/$pkgname-$pkgver-$pkgrel-any.pkg.tar.gz
to ../any/$pkgname-$pkgver-$pkgrel-any.pkg.tar.gz
so noarch packages will be stored only in one place on ftp - $repo/os/any,
but will be downloaded in the same way as other (arch-specific)
packages for user's architecture.
Note that there is no $repo.db.tar.gz in $ftpurl/$repo/os/any.

3) mirrors should sync with symlinks (I believe most of them already
do this after the current->core move).

4) if someone wants to create a mirror for i686 packages only (for example)
- he/she can simplify it by using rsync's option to store the symlinked
file in place of the symlink.
So the user will get *-{any,i686}.pkg.tar.gz files mixed in a single
~/mirror-for-my-i686-box/$repo dir.

This doesn't require any change to pacman or pacman.conf, just
ensuring that the mirror gets the $repo/os/any dir too (which is not
a problem since most mirrors do a full FTP rsync).
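
A minimal sketch of steps 2 and 4 on a throwaway directory tree. This is
not the actual db script; link_any_packages(), copy_arch_only() and the
package foo-1.0-1 are hypothetical names used only for the example. The
copy step imitates what rsync does with --copy-links (-L).

```python
import os
import shutil
import tempfile


def link_any_packages(repo_root, arches=("i686", "x86_64")):
    """Symlink every os/any/*-any.pkg.tar.gz into each os/<arch>/ dir."""
    any_dir = os.path.join(repo_root, "os", "any")
    for name in sorted(os.listdir(any_dir)):
        if not name.endswith("-any.pkg.tar.gz"):
            continue
        for arch in arches:
            link = os.path.join(repo_root, "os", arch, name)
            if not os.path.lexists(link):
                # Relative target, so the link stays valid on mirrors.
                os.symlink(os.path.join("..", "any", name), link)


def copy_arch_only(repo_root, arch, dest):
    """Copy one arch dir, storing the real file in place of each symlink."""
    src = os.path.join(repo_root, "os", arch)
    os.makedirs(dest, exist_ok=True)
    for name in os.listdir(src):
        # shutil.copy follows symlinks by default.
        shutil.copy(os.path.join(src, name), os.path.join(dest, name))


# Demo: one noarch package stored once, visible from both arch dirs.
root = tempfile.mkdtemp()
for d in ("any", "i686", "x86_64"):
    os.makedirs(os.path.join(root, "os", d))
pkg = "foo-1.0-1-any.pkg.tar.gz"
with open(os.path.join(root, "os", "any", pkg), "w") as f:
    f.write("dummy")

link_any_packages(root)
mirror = os.path.join(root, "mirror-for-my-i686-box")
copy_arch_only(root, "i686", mirror)
```

The links are relative (../any/...), so they keep working no matter
where a mirror places the repo tree.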


Solution #2:

1) Pacman 3.1 already has only one /etc/pacman.d/mirrorlist file for
all repos with contents like:
  Server = ftp://ftp.archlinux.org/$repo/os/i686
and in pacman.conf there is:
  [core]
  Include=/etc/pacman.d/mirrorlist
Pacman replaces '$repo' with 'core' automatically during a sync operation.

2) Add a similar support for $arch:
  Server = ftp://ftp.archlinux.org/$repo/os/$arch
On an i686 machine pacman should replace '$arch' with 'i686' and then with 'any'.
This way we will have 3 different .db.tar.gz files per repo on FTP, including
$repo/os/any/$repo.db.tar.gz
And there will be no symlinks in the $repo/os/{i686,x86_64} dirs.

3) For cross-distro usage it's better not to hardcode $arch=($CARCH
'any') but make it configurable (at least at build time).

4) Modify pacman's output so it produces a single progress meter for a
repo, not 2 for each 'i686' and 'any' db file.

5) extract 'i686' and 'any' dbs to a single /var/lib/pacman/sync/$repo
but then pacman has to guess from which ftp dir to download each package
(taking -prefix into account again? or store arch info in db?)
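
To make steps 2 and 5 more concrete, here is a rough sketch of the
proposed $arch expansion and of merging the two dbs while remembering
which dir each package came from. Nothing like this exists in pacman
today; expand_servers(), merge_dbs() and the package names are
assumptions made up for the example.

```python
def expand_servers(template, repo, arches=("i686", "any")):
    """Return one server URL per arch, in lookup order ($CARCH, then 'any')."""
    return [template.replace("$repo", repo).replace("$arch", a) for a in arches]


def merge_dbs(dbs):
    """Merge per-arch package lists into one mapping {file name: arch dir}.

    dbs is a list of (arch, [file names]); storing the arch next to each
    entry is one way to know later which ftp dir to download from.
    """
    merged = {}
    for arch, files in dbs:
        for name in files:
            merged.setdefault(name, arch)
    return merged


urls = expand_servers("ftp://ftp.archlinux.org/$repo/os/$arch", "core")
print(urls)

merged = merge_dbs([
    ("i686", ["bar-2.0-1-i686.pkg.tar.gz"]),  # hypothetical packages
    ("any", ["foo-1.0-1-any.pkg.tar.gz"]),
])
print(merged)
```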


The difference between 2 solutions is:
#1 - still one db per repo per arch, still simple repo-syncing and
package downloading
(no modifications in pacman & pacman.conf needed)
FTP structure:
  $ftpurl/$repo/os/i686/ :
    $repo.db.tar.gz
    $pkgname-$pkgver-$pkgrel-i686.pkg.tar.gz
    $pkgname-$pkgver-$pkgrel-any.pkg.tar.gz ->
        ../any/$pkgname-$pkgver-$pkgrel-any.pkg.tar.gz

  $ftpurl/$repo/os/x86_64/ :
    $repo.db.tar.gz
    $pkgname-$pkgver-$pkgrel-x86_64.pkg.tar.gz
    $pkgname-$pkgver-$pkgrel-any.pkg.tar.gz ->
        ../any/$pkgname-$pkgver-$pkgrel-any.pkg.tar.gz

  $ftpurl/$repo/os/any/ :
    $pkgname-$pkgver-$pkgrel-any.pkg.tar.gz


#2 - 2 dbs per repo per arch, pacman has to download/unpack/merge 2
dbs and download packages from 2 dirs.
FTP structure:
  $ftpurl/$repo/os/i686/ :
    $repo.db.tar.gz
    $pkgname-$pkgver-$pkgrel-i686.pkg.tar.gz

  $ftpurl/$repo/os/x86_64/ :
    $repo.db.tar.gz
    $pkgname-$pkgver-$pkgrel-x86_64.pkg.tar.gz

  $ftpurl/$repo/os/any/ :
    $repo.db.tar.gz
    $pkgname-$pkgver-$pkgrel-any.pkg.tar.gz
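
With the #2 layout the client has to pick the right dir for every
package. One possible way, sketched below, is to read the arch from the
end of the file name. This assumes the $pkgname-$pkgver-$pkgrel-$arch
naming shown above and that $pkgver, $pkgrel and $arch contain no '-';
package_arch() and package_dir() are hypothetical helpers, not pacman
functions.

```python
SUFFIX = ".pkg.tar.gz"


def package_arch(filename):
    """Return the arch part of $pkgname-$pkgver-$pkgrel-$arch.pkg.tar.gz."""
    if not filename.endswith(SUFFIX):
        raise ValueError("not a package file: " + filename)
    stem = filename[: -len(SUFFIX)]
    # Split from the right: $pkgname itself may contain '-'.
    pkgname, pkgver, pkgrel, arch = stem.rsplit("-", 3)
    return arch


def package_dir(ftpurl, repo, filename):
    """Directory a package would be downloaded from in layout #2."""
    return "%s/%s/os/%s/" % (ftpurl, repo, package_arch(filename))


# Hypothetical package names, for illustration only.
print(package_dir("ftp://someurl", "extra", "foo-bar-1.0-1-any.pkg.tar.gz"))
print(package_dir("ftp://someurl", "extra", "baz-2.1-3-x86_64.pkg.tar.gz"))
```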


While the FTP structure in #2 seems cleaner, the implementation is
complex and brings more pain than goodness.
So I strongly prefer #1 and am going to implement it in our db scripts
and devtools/aurtools as soon as time permits.

-- 
Roman Kyrylych (Роман Кирилич)

