On Wed, 22 Dec 2010 11:01:26 -0600 Aaron Griffin <aaronmgriffin@gmail.com> wrote:
On Sat, Dec 18, 2010 at 11:26 PM, Dan McGee <dpmcgee@gmail.com> wrote:
On Sat, Dec 18, 2010 at 11:20 PM, Allan McRae <allan@archlinux.org> wrote:
On 19/12/10 04:59, Dieter Plaetinck wrote:
Last time, the topic of moving the nilfs/btrfs utilities into core generated some discussion, but no outcome (and not even all of my questions were answered).
(http://mailman.archlinux.org/pipermail/arch-dev-public/2010-December/018656....)
So I'll try again, and I'll try to be more clear.
Question 1: if you install Arch Linux on a foo filesystem, should you install foo-utils as well, so that at least your system doesn't break when it needs to run fsck.foo during boot? IMHO the answer is yes.
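To illustrate, a hypothetical /etc/fstab line (the device and filesystem name are made up): the non-zero pass value in the last field makes the boot scripts run a filesystem check, and fsck dispatches that to fsck.foo, which only exists if foo-utils is installed.

    # <device>  <mountpoint>  <type>  <options>  <dump>  <pass>
    /dev/sda1   /             foo     defaults   0       1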
Yes.
Question 2: what exactly is the definition of the core repo? Pierre tells me "The idea of core was to provide a minimal set of packages that are needed by nearly all users to set up a base system." If that is true, what separates base from core? Because this definition looks an awful lot like the definition of base to me.
AFAIK core is about bundling all the packages which are critical (for the purpose of getting the system to run) for some users, depending on their particular setup. This is why we include the core repo on our core install images: we say "with this repo, users can install what they need in order to make their system boot and run fine". This is also in line with what I see on https://wiki.archlinux.org/index.php/Official_Repositories ("will provide you with a fully functional base system").
This is the problem:
- We want to allow users to use new filesystems during installation. Currently this includes nilfs2 and btrfs (marked as experimental in aif).
- On the other hand, we apparently don't allow infrequently used packages in core.
Even if I waited and only supported these filesystems once they are declared stable, their package usage would still be low. It would be pretty backwards if I had to wait for users to configure their systems with these filesystems manually, so that package usage goes up, before I can include them on the installation media. Currently, dosfstools, nilfs-utils and btrfs-progs-unstable are not in core and cannot be installed in a networkless install with the core images.
So, how do we solve this? I see two outcomes:
- We add the packages mentioned above to core, and loosen up the signoff requirements for packages with low usage (because "commonly used" and "needed for a base system" are very different things).
- We do not add these packages to core, and add something about "must have high usage" to the wiki page I mentioned above. This also means I would need to include a small repo with "needed for base systems, but with low usage" packages on the core install images, to allow users to install them, and I would need to adapt aif because it then needs to use additional repositories to install required-to-boot packages. A rough sketch of what such a repo entry could look like follows below.
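To be concrete about what outcome 2 would mean in practice, here is a rough sketch (the repo name and path are made up, not an existing repo) of the kind of entry the install environment's pacman.conf would need, pointing at a repo shipped on the install media itself so that a networkless install still works:

    # hypothetical extra repo served straight from the install media
    [core-addons]
    Server = file:///repo/core-addons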
I vote for outcome 1.
Basically I don't care what the outcome is here. But note that we can probably safely assume that reiserfs is more commonly used than both of the filesystems you mention, and I have yet to receive a signoff for reiserfsprogs (posted 2010-12-04...). So moving the packages for nilfs/btrfs to [core] will only serve to delay updates... Then again, I see no reason why these packages should not be in [core].
My solution would be more like your #2 suggestion. Instead of a "core" CD, make an "install" CD which carries whatever set of packages you decide on. This would be largely based on [core], but you could then include packages from [extra] too. You could also drop some [core] packages from the CD. For example, there is no point in putting gcc-{fortran,objc,ada} on the install CD at all. They are only in [core] because split packages cannot go across repos. So get rid of them and gain 40 MB more space for your dual-arch ISO.
I'm of the opposite opinion on this one; I think #1 makes more sense. We're looking at it backwards: "this package can't go in core because it is hard to get signoffs" vs. "perhaps we can adjust our signoff policy for lower-usage core packages". Filesystem utilities, at least to me, seem like the definition of what core should contain. Reiser/JFS/XFS/BTRFS/NILFS, while not the top filesystems in use on Linux, deserve to be supported "out of the box" by us.
For a minor bit of history, "core" is only named the way it is because it linguistically makes sense alongside a repository named "extra". core was never meant to be anything more or less than what "current" used to be - a repo full of software that the developers hand-picked as important.
That doesn't answer the question, though. Nowadays, core is treated as if it were the mission-critical parts of a running system. That's a good definition, but it also doesn't help answer the filesystem util question. Are they critical to a running system? Not exactly... once I have an installed filesystem, I don't need them anymore (they're a good *suggestion*, but not *critical*).
So what do we gain by having the filesystem utils in core? Signoffs, and in theory a more stable installer. But it already appears that signoffs for the more esoteric filesystem utils are hard to get. In my eyes, that makes the benefit of gaining signoffs moot.
It seems this question is really about the following: should things needed during install time be part of core? Or rather, is installation an important enough part of Arch Linux to require the (theoretical) added stability of the signoff process?
The signoff procedure is a means to a goal (providing some stability/trustworthiness indication for packages with high usage); putting miscellaneous low-usage packages in core just to get the advantage of a free strict signoff procedure will not work, is backwards, and is not what I'm suggesting. Like I said, I wouldn't want to push these packages into core without changing the signoff policy; that would make the job of the package maintainer needlessly tough, maybe even plain undoable. In other words, for outcome 1 I would make the signoff policy less strict for packages with low usage.

But this brings up the question: if we use such a definition for core (where we say "strict signoff policy for all packages except some"), what point is there in defining a core repository anyway? Why do we even have separate repositories? Strictly speaking, we could do just fine with 2 repositories (one for all current packages, one for all testing packages).

There are some criteria valid enough to warrant *categorizing* packages (which doesn't necessarily mean putting them in different repositories). In our current setup, for example:
- gcc is in core; this means it's frequently used, so there are certain quality indications (maintained by an Arch dev, signoff procedure, etc).
- claws-mail is in extra, so no strict signoff procedure, but still sufficient usage to be an officially supported and maintained package.
- A community package has even less usage, but since it's maintained by a trusted user there is at least some level of trustworthiness, and I can still use the bugtracker.
- An AUR "package": very low usage, no quality guarantees, no official support.
- How about multilib? That indicates merely a technical property of the packages, but says nothing about quality (signoff procedure? maintained by devs or by TUs?).

=> The way we are using repos is merely to indicate some properties of a package (who maintains it, which level of support, which quality indication, etc), but those properties are by themselves no reason to physically separate packages from each other (although that's how we've been doing it historically). Even worse, they are often cross-cutting concerns, making it a bit awkward to classify packages this way (see the multilib example above).

Just thinking out loud: why not put packages more together in the same repo, but introduce some new variables per package to indicate the concerns which separate them? For example:
- maintainer-type: dev / tu / user
- support: high (signoffs) / medium (no signoffs) / none (aur)
(A rough sketch of what this could look like is at the end of this mail.)
This would also trivialize some things which are currently non-trivial (like moving a package between repos because a dev maintains it instead of a TU, or vice versa). It would also make it easier to add new "categorizers" (both new variable names and new possible values) with very little impact; basically, more flexibility. The only disadvantage I can see with such an approach is that if users want, say, only packages which are currently in core and extra, they would also get the metadata (from pacman -Sy) for packages they are not interested in. But this seems like a small tradeoff to me.

Which raises the question: what *are* valid concerns to *physically* separate packages by? AFAIK all mirrors have no problem replicating and serving all repos, so that's not it. Unofficial repositories and mirrors could still function pretty much like before, so that's not it either.
As far as I can see, there are two reasons to physically separate packages into different repos:
- space constraints on release media
- testing packages vs stable ones, because the packages have the same name.
I realise I just suggested some pretty radical changes, but in my mind they make a lot of sense. I hope I didn't miss some obvious downsides (please point them out); I will (and I hope you do too) spend some nights pondering this.

Dieter
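To make the per-package variables idea above a bit more concrete, here is a rough sketch of what a sync-db desc entry could carry; the %MAINTAINERTYPE% and %SUPPORT% fields are invented for illustration and are not part of pacman's actual database format:

    %NAME%
    nilfs-utils

    %MAINTAINERTYPE%
    dev

    %SUPPORT%
    medium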