Re: [arch-dev-public] Fwd: Re: Fwd: Perl packaging guidelines.

10 Feb 2010

      Forwarding again

---------------------------------------

Hi Firmicus,

Thanks for forwarding my previous message. Here's the follow-up.

Firmicus wrote:
...
...
...
...
Two follow-up questions:
 Do you at least support the idea of renaming CPAN packages with
 non-standard names to their standard names?
Example?
...
...
...
...
If not, what about including the standard name in the provides array?
Makes sense.
glib-perl ->  perl-glib
gtk2-perl ->  perl-gtk2
perlio-* ->  perl-perlio-*

These are just a few examples from my system. I'm sure that there are
others. At least we agree that the "proper" name should be provided if
it doesn't match the pkgname.
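
To make the naming rule concrete, here is a minimal sketch of how the "proper" name could be derived and added to provides only when the pkgname differs. The function names are hypothetical, and the distribution name "Glib" for glib-perl is my assumption for illustration; this is not pacpan's actual code.

```python
def standard_pkgname(distribution):
    """Map a CPAN distribution name to a lowercase, 'perl-' prefixed pkgname."""
    return "perl-" + distribution.lower()


def extra_provides(pkgname, distribution):
    """Return the standard name to add to provides, or None if pkgname already matches."""
    standard = standard_pkgname(distribution)
    return None if pkgname == standard else standard


# Hypothetical usage: a package with a non-standard name gets the standard
# name in its provides array; a package that already follows the rule does not.
print(extra_provides("glib-perl", "Glib"))
print(extra_provides("perl-catalyst-runtime", "Catalyst-Runtime"))
```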
...
...
...
...
If pacpan didn't include the individual modules in the arrays but
 instead mapped them all to their distribution, how would you see the
 matter then?
Not sure I understand...
pacpan maps all modules to their distributions. If I removed modules
from the provides|depends|makedepends|optdepends arrays and replaced
them with their corresponding distributions, would you support the
recommendation of using pacpan as a starting point for packaging CPAN
packages for pacman?

This would still provide the advantage of using upstream metadata to
create the default PKGBUILDS (including depends, makedepends,
optdepends and provides). I think specifying modules explicitly is the
better way, but this would still be better than nothing.
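
A rough sketch of that fallback, assuming a module-to-distribution map like the one pacpan builds. The map contents and the function name here are made up for illustration; the real database is much larger and this is not pacpan's implementation.

```python
def to_distribution_packages(modules, module_to_dist):
    """Replace module-level dependencies with their distributions' package names.

    Returns (packages, unresolved): de-duplicated package names in first-seen
    order, plus any modules missing from the map so a packager can review them.
    """
    packages, unresolved = [], []
    for module in modules:
        dist = module_to_dist.get(module)
        if dist is None:
            unresolved.append(module)
            continue
        pkg = "perl-" + dist.lower()
        if pkg not in packages:
            packages.append(pkg)
    return packages, unresolved


# Illustrative map only (assumed entries, not the real database).
MODULE_TO_DIST = {
    "DateTime::TimeZone": "DateTime-TimeZone",
    "DateTime::TimeZone::UTC": "DateTime-TimeZone",
    "Catalyst": "Catalyst-Runtime",
}

deps, missing = to_distribution_packages(
    ["DateTime::TimeZone", "DateTime::TimeZone::UTC", "Catalyst", "No::Such::Module"],
    MODULE_TO_DIST,
)
print(deps, missing)
```

Two modules from the same distribution collapse into a single entry, which is what keeps the arrays short in this variant.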

Ignore this for now actually. I don't want to split the discussion into
one about renaming existing packages. That's probably worth its own
thread and will just mix up the arguments for and against pacpan.
...
Back to the issue of the provides array: I do see the advantage. My
 point is that in many cases, including ALL modules in there leads to
 things like this:
 http://aur.archlinux.org/packages/perl-kiokudb/perl-kiokudb/PKGBUILD
 which looks insane to me! And do we really want the PKGBUILD of
 perl-datetime-timezone to provide all modules listed here:
 http://search.cpan.org/~drolsky/DateTime-TimeZone-1.10/ ?
In comparison, the traditional approach is cleaner, for instance here:
 http://aur.archlinux.org/packages/perl-catalyst-runtime/perl-catalyst-runtim...
 (for which pacpan would have created a very long string of modules in
 the provides array).
I don't really see the problem with this. If you had to enter that by
hand then yes, it would be tedious, but generating such arrays
programmatically eliminates that issue. To me this is like arguing
against the inclusion of all locales in locale.gen because it makes the
list too long and nobody uses most of them. I just don't see that as a
real argument.
...
Note that in the above two examples all dependencies are versioned,
 which makes Allan fume :) More often than not, this introduces needless
 problems, so yes, Allan is right on this. OTOH there are many
 "bleeding-edge" modules on CPAN that do require very recent versions of
 other modules to work properly, so there would be a clear downside in
 getting rid of them...
There are enough distributions/modules in CPAN that specify
specific versions of modules that I feel their inclusion in the
provides array is justified. If pacman's provides array isn't meant to
list the packages and their versions provided by a package, then what
is it meant for?

The only real argument I can see here is one of overhead, but isn't
most of the overhead in opening the file? How much more is there from
reading a few extra lines?
...
Perhaps I am just being too conservative... I do understand Xyne's
 point: if the module Catalyst::Foo::Bar in the distribution
 Catalyst-FooStuff which is packaged as perl-catalyst-foostuff were
 eventually to become part of perl-catalyst-runtime, then having it in
 the provides array would indeed be of some help. But in real life such
 situations occur very rarely. And this extra metadata in the PKGBUILD is
 convenient mainly for the ideal situation where everything is fully
 automated, which is not very realistic, as errors also creep into the CPAN
 metadata and human beings still have to fix those things manually. The
 human packager should review the generated PKGBUILDs anyway.
That's why I've used the term "starting point" throughout this
discussion. I know that CPAN's metadata is not always correct and that
the PKGBUILDs created using that metadata thus require intervention
sometimes. Most of the time though the PKGBUILD is fine, and in the few
cases when it isn't, it usually only requires a few tweaks. It very
rarely fails miserably (in some cases when it does, it even includes a
prominent warning in the generated PKGBUILD).

That said, the provides array never fails because it's generated from
the database that I create, which in turn is generated by mapping
modules to common source files and thus distributions. Thus in the
event that the PKGBUILD is incomplete, the packager doesn't have to
manually enter the provides. The missing information is the information
that he/she would have had to enter without pacpan anyway.
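
As a sketch of what generating the provides array from such a database could look like: the naming scheme for module entries (lowercase, "::" replaced by "-", "perl-" prefix) and the version numbers are assumptions for illustration, and the distribution is the hypothetical Catalyst-FooStuff from the example above. This is not the actual pacpan code.

```python
def provides_array(distribution, module_versions):
    """Format a PKGBUILD provides line from a distribution's modules.

    module_versions maps module name -> version string ('' if unversioned).
    The entry matching the package's own name is skipped, because pacman
    already treats the pkgname as provided.
    """
    pkgname = "perl-" + distribution.lower()
    entries = []
    for module, version in sorted(module_versions.items()):
        name = "perl-" + module.replace("::", "-").lower()
        if name == pkgname:
            continue
        entries.append(f"'{name}={version}'" if version else f"'{name}'")
    return "provides=(" + " ".join(entries) + ")"


# Hypothetical distribution and versions, following the example in this thread.
line = provides_array(
    "Catalyst-FooStuff",
    {"Catalyst::FooStuff": "0.01", "Catalyst::Foo::Bar": "0.01"},
)
print(line)
```

Because the line is derived entirely from the module-to-distribution data, it does not depend on the parts of the CPAN metadata that sometimes need manual fixing.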

So far these are the pros and cons that I see of the proposed approach:

PROS:
*) robust and explicit dependency resolution
*) follows CPAN's convention (they specify modules as dependencies, not
distributions)
*) uses pacman's built-in functionality
*) enables pacman's searches to find modules
*) automated (intervention will still be required occasionally)

CONS:
*) slight increase in overhead for pacman (negligible?)
*) PKGBUILD arrays will be huge in some cases

Regards,
Xyne