On Fri, Sep 10, 2010 at 10:52 AM, Pierre Schmitz <pierre@archlinux.de> wrote:
Hi all,
two years ago (http://www.archlinux.org/news/419/) I created a stupid script to get some stats about package usage from our users.
I made some improvements. The client now submits the mirror used by pacman. On the server side, statistics about the country (looked up by geoip) are stored and, more importantly, all data are stored with a timestamp. This way we should be able to see usage trends etc. It would also be possible to have users submit data regularly without messing up all stats.
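To illustrate why the timestamp matters, here is a minimal sketch of how repeated submissions could be reduced to one entry per submitter per month before counting. This is not the actual server code: the record layout, the `submitter` key (whatever the server uses to recognize a returning client) and the function name `monthly_package_counts` are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical submission records: (submitter key, UTC timestamp, installed packages).
# The submitter key stands in for whatever identifies a returning client.
submissions = [
    ("a", datetime(2010, 8, 3, tzinfo=timezone.utc), {"pacman", "vim"}),
    ("a", datetime(2010, 8, 20, tzinfo=timezone.utc), {"pacman", "vim", "git"}),
    ("b", datetime(2010, 8, 9, tzinfo=timezone.utc), {"pacman"}),
    ("a", datetime(2010, 9, 1, tzinfo=timezone.utc), {"pacman", "git"}),
]


def monthly_package_counts(records):
    """Count packages per month, using only each submitter's latest entry in that month."""
    latest = {}  # (month, submitter) -> (timestamp, packages)
    for submitter, ts, packages in records:
        key = (ts.strftime("%Y-%m"), submitter)
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, packages)

    counts = {}  # month -> Counter mapping package name to number of submitters
    for (month, _submitter), (_ts, packages) in latest.items():
        counts.setdefault(month, Counter()).update(packages)
    return counts


counts = monthly_package_counts(submissions)
for month in sorted(counts):
    print(month, dict(counts[month]))
```

Because submitter "a" reports twice in August, only the later report is counted for that month, so regular submissions add data points to the trend without inflating any single month.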
For now, please have a look at the client itself. Check the output of "-s" and make sure you are using 2.0-2, especially if you use a password in your mirrorlist. ;-) The results are at https://www.archlinux.de/?page=PackageStatistics and are still quite rough: the page needs a lot of optimization and only shows the overall stats.
Note that my goal is to collect data that are actually useful to us and not just nice to see. If you have any suggestions or ideas, let me know. I think once this is working properly we should make another announcement to get helpful stats for a package cleanup or some mirroring improvements.
I dream of a day when this stuff can all be in the main Arch website for everyone to see. Any interest in learning Django, Pierre? :)
If anyone out there (yes, you users) knows Django and is looking to contribute to Arch or just hone your Django skills, I would be glad to chat with you about implementing a lot of these types of things in the main site. I've done quite a bit of work lately, but I am only one person. Architecture differences are now implemented in the main site, and I am working to get a mirror status report in there as well. If we started thinking about this as well, that would really add some value to the site.
I like the updates you've made. I wonder if it is worth trying to do something a little more sophisticated than just IP address for determining if someone is a repeat customer. The two cases where IP alone breaks down are "my IP is dynamic and changes if I sneeze (thank you crappy DSL connection)" and "we have 200 machines behind one IP address".
-Dan