[aur-general] Proposed rules for packages entering [community]
Ondřej Kučera
ondrej.kucera at centrum.cz
Thu Dec 4 19:37:17 EST 2008
Hello,
> We have mirrors. Almost 100 of them. Feel free to contact them all,
> have them write code to count downloads which then sends the stats to
> us, and then we can implement this.
>
> What you suggest is absolutely not feasible at all.
That's too bad, I wanted to suggest counting downloads too (because I
believe that the number of downloads of a particular version of a
package would after a while correlate quite well with the number of
users who actually use, i.e. upgrade, this package - it should more or
less solve the problem that was mentioned of people trying a package and
removing it quickly afterwards).
Anyway, I've been meaning to contribute some ideas on the topic for
at least four days (since I read the first IRC log on Sunday),
unfortunately my job hasn't allowed it this week. I just wanted to do
some thinking out loud about both methods (voting/pkgstats) for both
packages already in community and those that might get there in the
future from a regular user's point of view (also with regards to
privacy/paranoia matters).
(1) pkgstats
The obvious problem with accuracy is that not everybody will use it (or
use it even from time to time to update their "contribution" to the
statistics). Some people don't know about it, some people won't be
bothered, some might be concerned about privacy. Even though an IP
address is not necessarily an identifier of a person, it is still "good
enough" information. I actually more or less trust the Arch devs that
really only a hash of the IP is stored together with the package list,
but I can hardly be sure, and there are much more paranoid users out
there than myself.
(Their problem doesn't have to be only with privacy itself - when
someone knows the packages you use and even the exact versions, it makes
it so much easier to target some kind of attack on the system.)
On the other hand it can be nicely used to promote a package that is in
unsupported. "Do you use this package? Do you want to see it in
community? Have you run pkgstats on your system then?" It would be nice
to see the statistics in AUR frontend, one could see how far the package
is from the magic number that makes the package a good candidate for
community (whatever the number will be).
As for pruning community as it is now (if it is still an issue, I'm
not quite sure anymore), how about this: pick a reasonable percentage
(it doesn't have to be the same number as the one for new packages
entering community, it can be lower) by whatever criteria (number of
packages to prune, number of MB to save, ...), create a list of all the
packages with usage below this number and create lists of these packages
grouped by their maintainers. Then send the individual maintainer-lists
to the maintainers with a note that they should consider whether or not
these particular packages are really good material for community. At
the same time put the list of all those packages on the web, announce
its existence in the latest news and tell people that if they see a
package/packages they use and haven't yet run pkgstats, they should
probably do it now, otherwise the package might be removed from
community. Then wait for some time and look at the change in statistics
(maybe there will be some, maybe there won't).
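The list-making part of this could be sketched roughly as follows in
Python. Everything here is made up for illustration: the function name,
the sample data and the 1% threshold are assumptions, and nothing
reflects how pkgstats or the AUR actually store their data.

```python
from collections import defaultdict


def prune_candidates(usage, maintainers, total_submissions, threshold_pct):
    """Group community packages whose usage is below threshold_pct by maintainer.

    usage             -- hypothetical mapping: package name -> number of
                         pkgstats submissions that list the package
    maintainers       -- hypothetical mapping: package name -> maintainer,
                         covering every package in community
    total_submissions -- total number of pkgstats submissions
    threshold_pct     -- usage percentage below which a package is listed
    """
    by_maintainer = defaultdict(list)
    for pkg, maintainer in maintainers.items():
        # A package nobody reported counts as zero usage.
        count = usage.get(pkg, 0)
        pct = 100.0 * count / total_submissions
        if pct < threshold_pct:
            by_maintainer[maintainer].append(pkg)
    # One sorted list per maintainer, ready to be mailed out.
    return {m: sorted(pkgs) for m, pkgs in by_maintainer.items()}


# Made-up sample data.
usage = {"foo": 3, "bar": 250, "qux": 40}
maintainers = {"foo": "alice", "bar": "bob", "baz": "alice", "qux": "bob"}

lists = prune_candidates(usage, maintainers,
                         total_submissions=1000, threshold_pct=1.0)

# The combined list for the web page is just the union of all the lists.
all_candidates = sorted(p for pkgs in lists.values() for p in pkgs)
```

Running the same thing again after the waiting period and comparing the
two results would show whether the announcement changed anything.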
(2) votes
Again, not everybody uses it. Especially since voting means that you
have to have an AUR account. Today everybody has tons of accounts at
different internet services, ideally one should have as many passwords
as possible, and people don't like to create yet another account (I know
I don't). Frankly, if I hadn't needed those about 15 packages I now
maintain in unsupported (because I hadn't found them there), I wouldn't
have created an AUR account either.
There's another problem with accuracy. Even users who have an account
and vote don't vote for every single package they use. In particular,
many people (myself included) have probably never voted for packages
already in community. This makes the system usable for handling the
transition unsupported -> community but not the other way round. That,
too, could be helped by a similar approach to the one above - find the
packages with the fewest votes, list them and urge people to vote for
packages on this list if they use them and want to see them stay in
community in the future.
The problem is that this way the privacy concerns will be even bigger.
Right now if someone looked up which packages I voted for, it wouldn't
give them much of an idea which packages I actually use (because I only
voted for packages in unsupported and only for those that I had a reason
to believe that my vote might help push them to community). After
applying the above suggestion, anyone who gained access to AUR data
would know more or less all the community packages that a certain
nickname uses (which is much worse than knowing that this list of
packages is used by someone with this hash of an IP address - which is
the information pkgstats provides). Moreover, each nickname is
associated with an e-mail
which is then more or less associated with a particular person. Of
course, the e-mail can be fake (or completely or almost unused), on the
other hand if you also want to maintain some packages in unsupported,
you want to have a valid e-mail, so, if you're paranoid, you'd probably
have to have two AUR accounts - one connected to you for maintaining
packages and the other one as "anonymous" as possible just for voting.
Conclusion
Unfortunately, I don't have a solution. Both systems can be made more
accurate (and useful for pointing fingers at packages that really aren't
all that much used) but at the price of some amount of privacy or even
security. I still think that the best solution would be counting
downloads, because it would be quite accurate and also quite anonymous
(definitely more than pkgstats or voting) but sadly it's not an option.
I hope I haven't wasted too much of the time of those who have read it
all. If so, then I apologize :-), but I felt that since I spent some
time thinking about these matters on my way to work and back this week,
I should share the thoughts.
Ondřej
--
Cheers,
Ondřej Kučera