[aur-general] My thoughts on the [Community] policy discussion

Sun Nov 30 14:34:29 EST 2008

On Sun, Nov 30, 2008 at 4:00 AM, Timm Preetz <timm at preetz.us> wrote:
> On Sat, 2008-11-29 at 23:38 -0800, Drew Frank wrote:
>> 2. Both vote counts and pkgstats statistics, as they are currently
>> generated, are flawed metrics of package popularity.
>>
>> 3. Nearly all of the controversy over various [Community] admission
>> policies stems from the fact that different people place different
>> levels of trust in the information provided by votes and pkgstats.
>>
>> I feel like I'm a bit late to the party, but unless there is a
>> critical need to act now my recommendation would be to first set up a
>> system whereby accurate usage statistics are obtained and then proceed
>> to develop policy based on accurate, interpretable statistics.
>
> How do you want to measure the usage correctly?

Good question :).  Unfortunately, I can't say I have the answer, but
with so many clever folks here I'm confident we could figure something
out.

> Then we would need a community-layer to pacman that notifies AUR whenever
> a package from community is installed or deleted.

Seems reasonable.

> But that won't happen due to privacy concerns.

I admit, my experience dealing with this issue is quite limited.  How
exactly would this violate user privacy?  Could we not anonymize the
data before storing it?

There are other potential solutions I've seen discussed, such as
asking everyone to run pkgstats every x months.  This also sounds
plausible, but has the drawback of being less up-to-date and relying,
like voting, on users to do as they're asked.  It could be
pre-installed as a cron job, but some folks like to disable cron.
Still, I think it has potential.

The "pacman community-layer" idea has the advantage that no extra work
is required on the part of users.  Setting aside the privacy issue
(which I don't yet fully understand), it seems to me like the best
option.  In some sense, it would be just an automated, anonymous,
forced "vote"/"unvote" mechanism.
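
To make the "vote"/"unvote" analogy concrete, here is a rough sketch of
what the receiving side might look like.  Everything in it is an
assumption made for illustration: the UsageTally class, the record()
method, and the "install"/"remove" event names are hypothetical and are
not part of pacman, pkgstats, or the AUR.  The only point is that the
server can keep per-package counts without storing anything about who
sent the event.

```python
from collections import Counter


class UsageTally:
    """Hypothetical server-side tally that stores only per-package counts."""

    def __init__(self):
        # Maps package name -> number of current installs reported.
        self.installed = Counter()

    def record(self, event, package):
        # No user, host, or IP information is accepted or stored here.
        if event == "install":
            self.installed[package] += 1
        elif event == "remove":
            # Clamp at zero in case a remove arrives without a prior install.
            self.installed[package] = max(0, self.installed[package] - 1)
        else:
            raise ValueError(f"unknown event: {event!r}")

    def count(self, package):
        return self.installed[package]


tally = UsageTally()
tally.record("install", "foo")
tally.record("install", "foo")
tally.record("remove", "foo")
print(tally.count("foo"))
```

This says nothing about how the events would be transported, and a real
system would still have to deal with duplicate or forged reports, which
is where I suspect most of the difficulty lies.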

Again, I am mostly just trying to direct attention toward what seems
to me like a very solvable problem.  People have talked about creating
a "more optimal" admission policy -- it seems to me that any policy
based on imperfect information will have issues, and thus the first
question we need to answer is the one you brought up: how can we
measure usage correctly?  I worry that a system that relies too much
on incorrect usage information will necessitate manual corrections and
special cases, and cause headaches down the road.

Drew