[Forwarding this from Xyne]
This is an area where I have pretty solid experience, as a perl dev, arch user, and maintainer of several perl packages in extra and community. I tend to agree with Allan here. We should only package and take into consideration what CPAN calls a distribution, i.e. a tarball containing a bunch of related perl modules. The mapping between modules and distributions is available in a plain text database on each CPAN mirror, and can be looked up by the common tools used to install CPAN stuff from the command line (cpan and cpanp). I think it is quite a BAD idea to put all module names in the provides array, as this can easily yield hundreds of elements without any obvious advantage I can think of. Our poor lil Pacman has better things to do than take that overload into account.
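For reference, a minimal sketch of that module-to-distribution lookup. It assumes the plain text database follows the layout of CPAN's 02packages.details.txt (a header block, a blank line, then one "module, version, tarball path" row per module). The sample data, `parse_index` and `dist_name` are hypothetical names used only for illustration; this is not how cpan, cpanp or pacpan actually do it.

```python
import re

# Hypothetical sample in the assumed index layout:
# header block, blank line, then "module  version  path/to/dist.tar.gz".
SAMPLE_INDEX = """\
File:         02packages.details.txt
Description:  Package names found in directory $CPAN/authors/id/

Foo::Bar           1.02  A/AU/AUTHOR/Foo-Bar-1.02.tar.gz
Foo::Bar::Util     0.50  A/AU/AUTHOR/Foo-Bar-1.02.tar.gz
Baz                undef B/BO/BOB/Baz-0.1.tar.gz
"""


def dist_name(path):
    """Derive a distribution name from a tarball path (illustrative heuristic)."""
    filename = path.rsplit("/", 1)[-1]
    # Drop the archive extension, then a trailing "-<version>" suffix.
    stem = re.sub(r"\.(tar\.gz|tar\.bz2|tgz|zip)$", "", filename)
    return re.sub(r"-v?\d[\d._]*$", "", stem)


def parse_index(text):
    """Return {module name: distribution name} for every row in the index."""
    _header, _, body = text.partition("\n\n")
    mapping = {}
    for line in body.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a module row
        module, _version, path = fields
        mapping[module] = dist_name(path)
    return mapping


if __name__ == "__main__":
    for module, dist in sorted(parse_index(SAMPLE_INDEX).items()):
        print(f"{module} -> {dist}")
```

With a mapping like this, several modules collapse onto one distribution, which is what would be packaged under the distribution-only approach.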
Is the overhead even significant? How expensive are PROVIDES lookups? (Coincidentally, I've thought for a while now that there should be a single PROVIDES list for the local database to avoid opening hundreds of files unnecessarily, even if that is quite fast.) What about when distributions change their module roster? This has happened before and will happen again. I see robust dependency resolution as an "obvious advantage". If any important modules ever get moved around (e.g. subsumed into the base distribution from a popular module), then all packages which depend on that module would need to update their depends array. The fact that META.yml files for distributions on CPAN specify modules and not distributions shows that the dependencies are actually the modules themselves. Ignoring upstream convention and Pacman's built-in capabilities to shave a little bit of time off a PROVIDES lookup doesn't seem right to me, especially when you factor in the potential dependency breakage, however unlikely that is to affect large distributions.
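To make the point concrete, here is a toy model of provides-based resolution with a single combined PROVIDES index. It is only a sketch under stated assumptions: the package names are made up, `build_provides_index` and `resolve` are hypothetical helpers, and none of this reflects Pacman's actual implementation.

```python
def build_provides_index(packages):
    """Build one reverse index: provided name -> set of packages supplying it.

    `packages` maps a package name to its provides list. Each package
    also satisfies a dependency on its own name.
    """
    index = {}
    for name, provides in packages.items():
        for target in [name, *provides]:
            index.setdefault(target, set()).add(name)
    return index


def resolve(dep, index):
    """Return the sorted list of packages that satisfy `dep`."""
    return sorted(index.get(dep, ()))


# Hypothetical repository state before a module is moved upstream.
before = {
    "perl": [],
    "perl-foo-bar": ["perl-foo-bar-util"],
}

# Same repository after the module has been subsumed into the base package.
after = {
    "perl": ["perl-foo-bar-util"],
    "perl-foo-bar": [],
}

if __name__ == "__main__":
    dep = "perl-foo-bar-util"  # what a dependent package lists in depends
    print(resolve(dep, build_provides_index(before)))
    print(resolve(dep, build_provides_index(after)))
```

A package depending on the module name resolves in both states without touching its depends array; only the provider changes. A package depending on the distribution name "perl-foo-bar" would have to be updated after the move.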
That said, I do acknowledge Xyne's effort! I wrote a very similar tool years ago, which is still available in AUR (perl-cpanplus-pacman), but to which I have alas not dedicated as much effort and commitment as Xyne did with pacpan. My own approach has however inspired another project, called perl-cpanplus-dist-arch (also in AUR), which IMHO is superior to both my own cpan4pacman and Xyne's pacpan. (That said, I still use cpan4pacman, together with a few helper shell scripts and the devtools, to maintain my own local repository of 550 CPAN packages, all of which I keep up to date with relative ease.)
I've looked at perl-cpanplus-dist-arch while rewriting the pacpan backend. I found the entire CPANPLUS backend to be overkill for creating Pacman packages. Pacpan uses the same files and gets the same results, but without the CPAN shell and other bells and whistles that have no significance for Pacman packaging. That said, if there are particular features that perl-cpanplus-dist-arch has that pacpan lacks, let me know which and I will consider adding them. If that project does show itself to be superior for Pacman packaging then I will probably switch the backend over to it, but so far the current backend works as expected and is fairly simple, with no external dependencies.

Two follow-up questions: Do you at least support the idea of renaming CPAN packages with non-standard names to their standard names? If not, what about including the standard name in the provides array? If pacpan didn't include the individual modules in the arrays but instead mapped them all to their distribution, how would you see the matter then?

Regards,
Xyne