[arch-dev-public] Problem with web dashboard: massive orphaning of packages

Fri Sep 12 16:57:54 EDT 2008

On Fri, Sep 12, 2008 at 3:23 PM, Dusty Phillips <buchuki at gmail.com> wrote:
> 2008/9/12 Eric Belanger <belanger at astro.umontreal.ca>:
>> Hi,
>>
>> I don't know if you remember, but a while ago a huge part of extra i686 (IIRC
>> it was all packages from L to Z) was orphaned and erroneously showed up as
>> recently updated on the web site.  This just happened again with packages in
>> extra x86_64. I don't know what could have caused that, but it's very annoying
>> as we have to readopt all our packages.
>
> Fuck.
>
> I remember Judd telling me not to swear at users, but it's OK to swear
> at scripts, right?
>
> This has to be happening in reporead.py.  Fucking reporead.py. To the
> best of my knowledge, no other script updates the web database in
> any way; am I wrong?
>
>
> The actual db_update script splits the packages into those that are in
> the database and those that are not and processes them separately.
> Packages that are not currently in the database get added as orphans
> because apparently it's hard to interrogate the maintainer from the
> db.tar.gz. At first, I assumed that it was doing an add when it should
> be doing an update, which would add new packages with an orphan
> maintainer. But this doesn't appear to be the case, because there are
> currently no duplicate x86_64 packages (that aren't in testing).
>
> My second, more likely hypothesis is a race condition. I don't know how
> the db scripts update exactly, but I suspect reporead is reading a
> db.tar.gz file that is either broken or not yet fully uploaded. It
> sees this broken db file and drops all the packages in the web
> interface that are not in that file. Then x minutes later (crontab),
> it runs again on a proper db and sees the missing packages again. It
> adds them to the database and sets the maintainer to orphan.
>
> Are such broken dbs possible/likely/happening? If it's a race
> condition, we need to put a lock on the database (maybe dbtools does
> this already) so that reporead isn't accessing it at the same time as
> dbtools. If it's just that the database sometimes gets broken when it
> is updated, well... that just needs to be fixed.
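
The failure sequence hypothesised in the quoted message can be modelled
in a few lines. This is only a toy sketch, not the real reporead.py: the
function name sync_web_db, the ORPHAN marker, and the package and
maintainer names are all made up for illustration.

```python
ORPHAN = None  # hypothetical marker for "no maintainer"


def sync_web_db(web_db, repo_packages):
    """Toy model of the reconcile step (not the real reporead.py).

    web_db maps package name -> maintainer; repo_packages is the set of
    package names found in the db.tar.gz that was just read.
    """
    # Drop packages that are missing from the repo db file.
    for name in list(web_db):
        if name not in repo_packages:
            del web_db[name]
    # Add packages the web database does not know about. The maintainer
    # cannot be read from the db file, so they start out orphaned.
    for name in repo_packages:
        if name not in web_db:
            web_db[name] = ORPHAN
    return web_db


# Made-up web database contents.
web_db = {"abs": "alice", "links": "bob", "zsh": "carol"}
full_repo = {"abs", "links", "zsh"}
partial_repo = {"abs"}  # a broken db that lists only some packages

sync_web_db(web_db, partial_repo)  # one cron run sees the broken db
sync_web_db(web_db, full_repo)     # the next run sees the complete db

orphaned = sorted(name for name, who in web_db.items() if who is ORPHAN)
print(orphaned)  # ['links', 'zsh']
```

In this model a single run against an incomplete db is enough to lose
the maintainer of every package missing from it, even though the next
run restores the packages themselves.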

This would be a hell of a race condition. To make a database, we first
unzip it to a temp location, make our changes and updates, and then
rezip it. Thus reporead.py would have to open the db while it is being
zipped, which is a very short period of time, but I guess it is
theoretically possible.
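
Even a short window like this could be closed with the lock suggested in
the quoted message. Below is a minimal sketch using an advisory flock,
assuming a POSIX system and that both the writer and the reader agree to
take the lock. The helper repo_lock and the lock file name are
hypothetical; nothing here is taken from dbtools or reporead.py.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def repo_lock(lock_path, exclusive):
    """Hold an advisory flock on lock_path for the duration of the block."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, mode)  # blocks until the lock is granted
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Hypothetical lock file kept next to the repo db.
lock_path = os.path.join(tempfile.mkdtemp(), "extra.db.lck")

# The script that rewrites the db would take the lock exclusively...
with repo_lock(lock_path, exclusive=True):
    pass  # unpack, modify, and repack the db here

# ...and the reader would take it shared before opening the db.
with repo_lock(lock_path, exclusive=False):
    pass  # read the db here
```

The lock is advisory: it only helps if every script that touches the db
takes it, and it does nothing for a db that is broken for other reasons.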

Without looking at the repo-add code, I don't know if we do this now,
but we probably should:
1. unzip the db to a temp location
2. make changes
3. rezip it to db.tar.gz.new
4. move old db to db.tar.gz.old
5. move new db to db.tar.gz

This would make the "db replacement" portion atomic in the sense that
we would never have a partial DB; we would only have a short period of
time where no db existed in that location. If really necessary we
could avoid even this by copying the old db to one with the old
extension instead of moving it.
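
The steps above, in the variant that copies the old db instead of moving
it, might be sketched as follows. This is an illustration and not the
repo-add code: rebuild_db and its modify callback are hypothetical names,
and it assumes a POSIX filesystem where renaming within one directory
replaces the target in a single step.

```python
import os
import shutil
import tarfile
import tempfile


def rebuild_db(db_path, modify):
    """Rebuild db_path so that readers never see a half-written archive.

    modify is a hypothetical callback that edits the unpacked tree in place.
    """
    new_path = db_path + ".new"
    old_path = db_path + ".old"
    with tempfile.TemporaryDirectory() as workdir:
        # 1. unpack the current db to a temp location
        with tarfile.open(db_path, "r:gz") as tar:
            tar.extractall(workdir)
        # 2. make changes
        modify(workdir)
        # 3. repack to db.tar.gz.new, next to the live db
        with tarfile.open(new_path, "w:gz") as tar:
            for entry in sorted(os.listdir(workdir)):
                tar.add(os.path.join(workdir, entry), arcname=entry)
    # 4. keep a copy of the old db (copy, not move, so the live name
    #    never disappears)
    shutil.copy2(db_path, old_path)
    # 5. rename the new db over the live name
    os.replace(new_path, db_path)
```

A reader that opens db.tar.gz at any point gets either the complete old
archive or the complete new one, since the partial file only ever exists
under the .new name.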

-Dan