[arch-dev-public] [objectives] Package triage on current + extra (archstats)

Dan McGee dpmcgee at gmail.com
Mon May 7 12:13:16 EDT 2007


On 5/6/07, Simo Leone <simo at archlinux.org> wrote:
> On Sun, May 06, 2007 at 08:28:53PM -0400, Dan McGee wrote:
> > I know we weren't supposed to rush into the objectives here before
> > settling on the goals, but I think this is an important one to
> > examine, especially with a change in the repository structure
> > happening in the coming months.
> >
> > How many of the devs are using archstats, let alone users? After
> > looking at it a bit more today, I realized this could be a great asset
> > to finding areas where devs should be spending their time with regard
> > to package maintenance. It would also help us out greatly when it
> > comes to determining which packages should no longer be maintained by
> > us and dropped back to the TU level and/or unsupported.
> >
> > However, a few things need work:
> > 1. The current website
> > <http://www.archlinux.org/~simo/archstats/index.php> is in dire need
> > of an overhaul. Using the old website theme is the least of our
> > worries; things like the package listing are at this point rather
> > unusable and only suck 223 MB of memory in Firefox once fully loaded.
> > I have a lot of ideas for this: enable a breakdown by pkgname only
> > (so it actually looks like people are running kernel26), limit number
> > of results on the page unless someone actually selects to see them
> > all, etc.
> > 2. The archstats database. It contains several one-time system
> > updates, and several systems that haven't updated since 2005 or
> > earlier. This is clearly junk data, and to make archstats useful we
> > should probably just start fresh, and find a way to cut down on
> > spurious commits, which leads into...
> > 3. The archstats program itself. In the last week, I've had no
> > problems with it, but have had problems before (some of the spurious
> > commits above were definitely my fault). Configuration should probably
> > be editable in a conf file (the current /etc file has a big fat
> > warning saying not to edit it by hand, which does not seem like the
> > Arch Way).
> > Setting it up as a cron job is straightforward, but if we want people
> > to use it we should probably think of a way to make it even easier.
> >
> > Comments on any of this? I know it's yet another project idea to be
> > thrown out there, but this one could prove very helpful and in the
> > long run vastly reduce dev maintenance of packages when we realize very
> > few users are actually using a package and it should be maintained
> > elsewhere.
> >
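
To make the pkgname-only breakdown from item 1 concrete, here is a rough
Python sketch. The input shape (one list of (pkgname, pkgver) pairs per
system), the function name, and the default limit are all made up for
illustration; none of it is taken from the archstats code.

```python
from collections import Counter


def count_by_pkgname(systems, limit=50):
    """Count how many systems have each package installed, ignoring pkgver.

    systems: iterable of iterables of (pkgname, pkgver) pairs, one per
             system (a hypothetical shape, not the real archstats schema).
    limit:   maximum number of rows to return; None returns everything.
    """
    counts = Counter()
    for packages in systems:
        # A set keeps one system from counting twice for the same pkgname.
        counts.update({name for name, _version in packages})
    # Sort by count (highest first), then by name so the order is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked if limit is None else ranked[:limit]
```

With versions folded together, two systems on different kernel26 releases
both count toward kernel26, and the page only renders `limit` rows unless
someone explicitly asks for all of them.
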
>
> Actually I have an entirely rewritten archstats lying on my hard drive;
> all it needs is someone to slap a pretty web interface on it. It's
> probably not very well done, it was me playing with django a bit, but
> nonetheless it didn't take very long to put together.
>
> As for ideas, what I've been wanting to do is have a way for pacman to
> have some form of "hooks": something where a command gets run after an
> Syu or an S or an R. In this case, that command would be whatever is
> required for archstats to update the package list. This would make it
> really easy for people to just "set and forget" archstats, and we could
> get very good and up to date stats that way.
>
> Also, I sort of inherited the archstats project from eric when he left,
> and I haven't really touched it at all, besides culling some old data
> once in a while (haven't done that in a few months though).
>
> It's another one of those projects I said I'd work on but haven't gotten
> around to... I won't bother making promises I might not keep but school
> does end soon, I've got nothing but free time in a few weeks...
> hopefully I can get my butt in gear and at least get the ball rolling or
> something.
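
Whatever form the hook ends up taking, the command it runs mostly needs
the installed package list. A minimal Python sketch of that step, assuming
only that `pacman -Q` prints one "name version" pair per line; the function
names are hypothetical, and how the list is submitted to archstats is left
out on purpose since that part is not settled.

```python
import subprocess


def parse_pacman_query(output):
    """Turn `pacman -Q` output into a list of (name, version) pairs."""
    packages = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or unexpected lines
            packages.append((parts[0], parts[1]))
    return packages


def installed_packages():
    """Ask pacman for the installed packages (only works on an Arch box)."""
    result = subprocess.run(
        ["pacman", "-Q"], capture_output=True, text=True, check=True
    )
    return parse_pacman_query(result.stdout)
```

Keeping the parsing separate from the pacman call means it can be checked
against a canned string without touching a real system.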

Simo already knows this, but here is what I started last night while
procrastinating on studying for my exams:
<http://code.toofishes.net/gitweb.cgi?p=archstats.git;a=summary>

I've managed to remove about 500 lines of code by moving repetition into
functions (there is still more to be done, but that was 1/6 of the code).
I also bypassed the MD5sum check completely, which shows how little it is
worth as client verification. Simo and I were trying to think of a better
way to do client verification (Jürgen, any ideas?), and we came up with
nothing.

Obviously we could get poisoned data using archstats, but at the same
time, some data seems better than no data, especially if we can patrol
it.
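
As a rough idea of what patrolling could look like for the junk records
mentioned earlier (one-time updates, systems silent since 2005), here is a
Python sketch. The record keys and both thresholds are assumptions picked
for illustration, not the real archstats schema, and this does nothing
against deliberately forged submissions.

```python
from datetime import date, timedelta


def keep_active_systems(systems, today, max_age_days=180, min_updates=2):
    """Drop records that look stale or like one-off submissions.

    systems: list of dicts with hypothetical keys 'id', 'last_update'
             (a datetime.date) and 'update_count' (an int).
    """
    cutoff = today - timedelta(days=max_age_days)
    return [
        s for s in systems
        # Keep only systems seen recently AND seen more than once.
        if s["last_update"] >= cutoff and s["update_count"] >= min_updates
    ]
```

Run periodically, something like this would replace the occasional manual
culling of old data.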

-Dan

