On Sun, Nov 30, 2008 at 4:00 AM, Timm Preetz <timm@preetz.us> wrote:
On Sat, 2008-11-29 at 23:38 -0800, Drew Frank wrote:
2. Both vote counts and pkgstats statistics, as they are currently generated, are flawed metrics of package popularity.
3. Nearly all of the controversy over various [Community] admission policies stems from the fact that different people place different levels of trust in the information provided by votes and pkgstats.
I feel like I'm a bit late to the party, but unless there is a critical need to act now, my recommendation would be to first set up a system that collects accurate usage statistics, and only then develop policy based on accurate, interpretable statistics.
How do you want to measure the usage correctly?
Good question :). Unfortunately, I can't say I have the answer, but with so many clever folks here I'm confident we could figure something out.
Then we would need a community-layer for pacman that notifies AUR whenever a package from community is installed or deleted.
Seems reasonable.
But that won't happen due to privacy concerns.
I admit, my experience dealing with this issue is quite limited. How exactly would this violate user privacy? Could we not anonymize the data before storing it?

There are other potential solutions I've seen discussed, such as asking everyone to run pkgstats every x months. This also sounds plausible, but has the drawback of being less up-to-date and relying, like voting, on users to do as they're asked. It could be pre-installed as a cron job, but some folks like to disable cron. Still, I think it has potential.

The "pacman community-layer" idea has the advantage that no extra work is required on the part of users, and to me (besides the privacy issue which I don't yet fully understand) seems like the best option. In some sense, it would be just an automated, anonymous, forced "vote"/"unvote" mechanism.

Again, I am mostly just trying to direct attention toward what seems to me like a very solvable problem. People have talked about creating a "more optimal" admission policy -- it seems to me that any policy based on imperfect information will have issues, and thus the first question we need to answer is the one you brought up: how can we measure usage correctly? I worry that a system that relies too much on incorrect usage information will necessitate manual corrections and special-cases and cause headaches down the road.

Drew