[aur-general] pkgstats and unused [community] packages

Xyne xyne at archlinux.ca
Tue Oct 26 10:36:29 EDT 2010

On 2010-10-26 08:20 +0800 (43:2)
Ray Rashif wrote:

> On 26 October 2010 07:48, Christopher Brannon <chris at the-brannons.com> wrote:
> > Allan McRae <allan at archlinux.org> writes:
> >
> >> I'd say only remove the packages that are orphans.
> >
> > Here's the list of [community] orphans with less than 1% usage, according
> > to pkgstats:
> > http://paste.xinu.at/98f5
> OK that's a good list, and all of those can be moved IMO.
> I take back part of what I mentioned earlier. There are indeed some
> packages that I believe no one uses. The best way to handle this is to
> selectively remove each package that we still want to keep from the
> wiki list. I've added a filter list, so remove from there (and not the
> original). Wiki diffs would tell us what has been removed (and by
> whom).
> Set up a timeframe along with an official discussion period for this,
> i.e how long we have until the filter list is final. And then the
> voting, if needed.

I can see the point of removing orphans, but I still think that using pkgstats
as a metric is a bad idea for everything else. Casual users, i.e. those who are
not actively involved on the forum or IRC, won't even be aware of pkgstats.
Really, who installs a distro and actively looks for a way to submit user data?
And please don't try to tell me that the only users who matter are the ones who
form the core community.

Then you have the paranoid who won't submit anything, even if they're a small
group. Ultimately pkgstats only reflects the usage of a small group of people
with possibly skewed interests. (There should be a few statisticians around, so
it would be interesting to hear their analysis of this... let's face it, most
people fail at interpreting statistical data and ultimately do so with a
bias that supports their own agenda... *cough*politicians*cough*.)*

Several of those packages are niche packages too (e.g. python-sympy, vtk,
avogadro), but ones that are important within their niche. If they are actively
maintained then I see no reason to remove them even if they are not commonly
used by the subset of users who submit stats.

As it stands, I would support removal of the orphaned packages listed above but
not the list based on pkgstats alone. We need a better usage metric for repo

Personally I think it would be better to implement a simple online vote and
inform users that a package is a candidate for removal in a post_upgrade or
post_install message. Users could then vote to keep the package and if it
passes a threshold (e.g. 10, as required by AUR), then it does not get removed.

Also, consider that a package can be moved to [community] if it gets 10 votes on
the AUR. 10 votes out of thousands of users is less than 1%, maybe even less
than 0.1% depending on how many AUR users there actually are.
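
To put rough numbers on that, here is a quick sketch of the arithmetic. The
user counts are made-up placeholders, since I don't know how many AUR users
there actually are; only the 10-vote threshold comes from the AUR rule above.

```python
# Share of the user base represented by the 10-vote AUR threshold.
# The user counts below are hypothetical; only the threshold is real.

AUR_VOTE_THRESHOLD = 10


def vote_share(votes: int, users: int) -> float:
    """Return the votes as a percentage of the total number of users."""
    return 100.0 * votes / users


for users in (2_000, 10_000, 50_000):  # assumed user-base sizes
    share = vote_share(AUR_VOTE_THRESHOLD, users)
    print(f"{users:>6} users: {share:.3f}% needed to reach [community]")
```

With anything over 1,000 users the threshold is already below 1%, and it drops
below 0.1% once there are more than 10,000.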


* pkgstats also uses hashed IPs to form unique IDs. Multiple users behind a
  single IP would only count as 1 in that case. What if that single IP
  represents an entire institution with hundreds of installations?
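
  A minimal sketch of that undercounting, assuming the unique ID is nothing
  more than a hash of the submitting IP. SHA-256, the function names and the
  addresses (taken from the documentation ranges) are all my own stand-ins,
  not the actual pkgstats implementation.

```python
import hashlib


def submitter_id(ip: str) -> str:
    # Assumption: SHA-256 stands in for whatever hash pkgstats really uses.
    return hashlib.sha256(ip.encode("utf-8")).hexdigest()


def count_unique_submitters(ips) -> int:
    # Identical IPs hash to identical IDs, so they collapse into one entry.
    return len({submitter_id(ip) for ip in ips})


# Hypothetical data: 300 machines behind one institutional address,
# plus three home users who each have their own address.
institution = ["203.0.113.7"] * 300
home_users = ["198.51.100.1", "198.51.100.2", "198.51.100.3"]
installations = institution + home_users

print(len(installations), "installations,",
      count_unique_submitters(installations), "counted")
```

  In this made-up case 303 installations are counted as 4, so whatever the
  institution runs barely registers in the percentages.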

p.s. Removing these packages indiscriminately will herald the apocalypse and the
end of tacos as we know them.
