[aur-general] Official discussion period - Rules governing packages entering [community]

Ivy Foster joyfulgirl at archlinux.us
Fri Dec 5 21:10:47 EST 2008


Hi, all,

I generally just lurk on this list (as I'm not a TU), but figured
I might throw in a couple of cents to this discussion, since there's
some talk about statistics, and I have a strong statistical
background. Oh, and as it turns out, "a couple of cents" ends up being
something like a dollar. Sorry for the long post, but I'm trying to be
thorough.

Some of the points that have already been made: (Yes, everybody hates
a recap. Deal.)

- The current votes system is insufficient as a valid means of
  determining how many people *really* use a package, because not all
  users vote.

- pkgstats hasn't been around long enough to have generated as much
  information as folks would like. (And, again, not everybody has
  installed/used it).

- A system for reliably working out what packages have a sufficiently
  large user base in Arch to justify inclusion in [community] would be
  a very nice thing to have.

My thought is basically that the best way to get decent usage stats
for a given package is to implement a pkg-download counter (yes,
I know this has been suggested). Unfortunately, this raises some
technical issues. Particularly (read: off the top of my head), we'd
need a mechanism to account for fringe cases such as people who, say,
accidentally remove or break a PKGBUILD and re-download five minutes
later, and the (probably/hopefully much more rare) people who would
repeatedly download a PKGBUILD in order to artificially inflate its
usage statistics. These could probably be taken care of with, say, an
IP-address recorder (but that of course raises privacy issues of its
own) or requiring users to log in before downloading (which I *don't*
think is a good idea; it has wa-a-a-a-ay too many obvious drawbacks).
It would also be useful not to double-count people who are merely
updating to a new version, and to have some means of keeping track of
people who have *removed* packages.
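
To make the de-duplication idea concrete, here is a minimal sketch in
Python. Everything in it is an assumption for illustration: the
DownloadCounter class, the salt, and the choice to count each (client,
package) pair only once are hypothetical, not anything the AUR
actually does. It stores a salted hash instead of the raw IP address,
which softens the privacy issue but doesn't make it go away.

```python
import hashlib
from collections import defaultdict


class DownloadCounter:
    """Hypothetical counter: each client counts at most once per package."""

    def __init__(self, salt):
        self.salt = salt               # assumed server-side secret
        self.seen = set()              # (client_hash, pkg) pairs already counted
        self.counts = defaultdict(int)

    def _anonymize(self, ip):
        # Keep a salted hash rather than the raw address.
        return hashlib.sha256((self.salt + ip).encode("utf-8")).hexdigest()

    def record(self, ip, pkg):
        """Count a download; return True only if it was newly counted."""
        key = (self._anonymize(ip), pkg)
        if key in self.seen:
            # Re-download, version update, or inflation attempt from
            # the same address: ignore it.
            return False
        self.seen.add(key)
        self.counts[pkg] += 1
        return True
```

Counting unique clients covers the accidental re-download, the
repeated-download cheat from a single address, and plain version
updates. It can't see removals, and it undercounts people behind
a shared address.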

Pkgstats is a good start in the direction of keeping reliable
statistics. Some ideas for making that more prevalent are:

- Including a new, well-publicized (so people know they can turn it
  off) function in pacman that would (optionally, default=yes)
  automagically send in installed/removed pkgs (and I suppose
  a complete list the first time it's run) to something like Debian's
  popularity-contest server.

- Supplementing pkgstats data, where lacking, with data from Debian's
  popularity-contest server.

- Adding a cronjob for pkgstats (again, enabled by default).

- Adding a feature to pkgstats that would also *save* the full list it
  has sent in (preferably to a compressed file, but whatever), and
  then next time it's run, check against the saved list--packages that
  have been removed are removed from this master list, and packages
  that have been added are, well, added to the list. Something like
  a diff would then be sent in.

- Adding pkgstats to the install CD as an optional, but useful,
  package.
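
The save-and-diff idea in the list above could look roughly like this
in Python. It's a sketch under assumptions, not how pkgstats works:
the function names, the gzip-compressed JSON file, and the shape of
the report are all made up for illustration, and the actual sending
is left out.

```python
import gzip
import json
import os


def load_saved(path):
    """Return the previously sent package set, or None on the first run."""
    if not os.path.exists(path):
        return None
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return set(json.load(fh))


def save_list(path, packages):
    """Store the package list as compressed JSON."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(sorted(packages), fh)


def build_report(path, current):
    """Build what would be sent: a full list first, then only a diff."""
    current = set(current)
    previous = load_saved(path)
    if previous is None:
        report = {"full": sorted(current)}
    else:
        report = {
            "added": sorted(current - previous),
            "removed": sorted(previous - current),
        }
    # A real client should probably save only after a successful
    # submission; this sketch saves unconditionally.
    save_list(path, current)
    return report
```

On a real system the current list could come from something like
`pacman -Qq`; here it's just passed in, so the sketch doesn't depend
on pacman being installed.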

So. That's all I can think of on this for now, but if folks would like
statistics advice, I'm on this and a couple of other Arch lists, and
feel free to CC me on the email or whatever just to get my attention.
If someone has specific questions about statistics or about something
I said here, ask! And I'd be glad to help with implementing...whatever
gets decided.

Thanks for reading this far!

Ivy

P.S.: For what it's worth, I agree that the actual vote should be
tabled until there are one or more solid, researched proposals (which
needn't take more than a week or two!). (And not to downplay the
actual suggestion in question!) And I'd be glad to help out with that,
too, though again I'm not a TU. (-:

-- 
                If I Ever Become An Evil Villainess...
46. If an advisor says to me "My liege, the heroine is but one person.
    What can one person possibly do?" I will reply "This," and kill
    the advisor.