[arch-devops] [arch-dev-public] Uploading old packages on archive.org (Was: Archive cleanup)

Baptiste Jonglez baptiste at bitsofnetworks.org
Sun Jan 27 21:37:23 UTC 2019


On 24-01-19, Florian Pritz wrote:
> > What about uploading to archive.org as soon as we archive packages on orion?
> > 
> >   https://github.com/archlinux/archivetools/blob/master/archive.sh
> 
> While we still use this archive.sh script, dbscripts has recently also
> been extended to populate the archive continuously. So uploading could be
> integrated there with a queue file and a background job that performs
> the upload.
> 
> Alternatively the uploader could be kept standalone and just adapted to
> run more often and to maintain its own database/list to know which
> packages have already been successfully uploaded and which haven't. I'll
> call this "state database". Then we could run it every hour or so via a
> systemd timer and it could upload all new and all failed packages. One
> thing I'd want to have in this context is that the uploader should exit
> with an error to let the systemd service fail if a package fails to
> upload multiple times. I think I'd actually prefer this to be standalone
> for simplicity.
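
A minimal sketch of the failure handling described above, assuming a
hypothetical upload() callable and a hypothetical threshold of three
attempts (neither comes from the existing uploader). The returned status
would be passed to sys.exit() so that the systemd service is marked as
failed once a package has failed too many times:

```python
import sys
from typing import Callable, Dict, Iterable

MAX_FAILURES = 3  # hypothetical threshold, not taken from any existing tool


def run_once(
    pending: Iterable[str],
    failures: Dict[str, int],
    upload: Callable[[str], None],
    max_failures: int = MAX_FAILURES,
) -> int:
    """Try each pending package once and return a process exit status.

    `failures` maps a package path to its number of failed attempts and is
    updated in place. The status is non-zero as soon as at least one
    package has reached `max_failures` failed attempts.
    """
    exhausted = []
    for pkg in pending:
        try:
            upload(pkg)
        except Exception as exc:  # any upload error counts as one failure
            failures[pkg] = failures.get(pkg, 0) + 1
            print(f"upload failed for {pkg}: {exc}", file=sys.stderr)
            if failures[pkg] >= max_failures:
                exhausted.append(pkg)
        else:
            # A successful upload clears the failure counter.
            failures.pop(pkg, None)
    return 1 if exhausted else 0
```

Packages below the threshold are simply retried on the next timer run, so
a transient error does not make the service fail.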

There is one argument against a standalone tool: each time it runs, it
will need to scan the whole filesystem hierarchy to detect new packages,
which can be quite slow.

One solution is to have dbscripts build a queue of new packages to upload,
but then the upload tool would not be completely standalone (it's
basically your first solution above).

A simpler but less robust way would be to scan only the current year
(along with the previous year for a while).
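
As a rough illustration of this restricted scan, here is a sketch that
assumes a hypothetical layout with one top-level directory per year
(<root>/<year>/...) and package files named *.pkg.tar.*; the real archive
layout on orion may differ:

```python
import datetime
from pathlib import Path
from typing import Iterator, Optional


def recent_packages(
    root: Path, today: Optional[datetime.date] = None
) -> Iterator[Path]:
    """Yield package files found under the previous and current year only.

    Assumes one directory per year directly below `root` (an assumption
    made for this sketch). Detached signatures (*.sig) are skipped.
    """
    today = today or datetime.date.today()
    for year in (today.year - 1, today.year):
        year_dir = root / str(year)
        if not year_dir.is_dir():
            continue  # e.g. no directory for that year yet
        yield from sorted(
            p
            for p in year_dir.rglob("*.pkg.tar.*")
            if not p.name.endswith(".sig")
        )
```

The weakness is visible in the code: anything added under an older year
directory is never seen, which is why this is less robust than a queue.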

Other than this issue, it indeed looks like a good idea to clearly
separate this tool from the dbscripts.

> > In any case, we need a retry mechanism to cope with the case where the
> > upload fails.
> 
> This could use the state database I mentioned above. As for the
> implementation of such a database, I'd suggest sqlite instead of rolling
> your own text based list or whatever.  It's fast and simple, but you get
> all the fancy stuff, like transactions, for free. You also don't have to
> deal with recovering the database if the script crashes. sqlite just
> rolls back uncommitted transactions for you.
> 
> Would you be interested in adapting the uploader like this and making it
> an automated service? If you're interested I can help with the
> deployment part and provide feedback on the scripting side. If you want,
> we can also discuss this on IRC.
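
For reference, the sqlite state database suggested above could look
roughly like the following sketch. The table name, columns and helper
functions are hypothetical and only meant to show how transactions keep
the state consistent if the script crashes in the middle of an update:

```python
import sqlite3
from typing import Iterable, List

# Hypothetical schema: one row per package file that was ever attempted.
SCHEMA = """
CREATE TABLE IF NOT EXISTS uploads (
    path     TEXT PRIMARY KEY,
    uploaded INTEGER NOT NULL DEFAULT 0,
    failures INTEGER NOT NULL DEFAULT 0
)
"""


def open_state(db_path: str) -> sqlite3.Connection:
    """Open (and create if needed) the state database."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def record_result(conn: sqlite3.Connection, path: str, ok: bool) -> None:
    """Record one upload attempt atomically."""
    # Used as a context manager, the connection commits on success and
    # rolls back if an exception is raised inside the block.
    with conn:
        conn.execute("INSERT OR IGNORE INTO uploads (path) VALUES (?)", (path,))
        if ok:
            conn.execute("UPDATE uploads SET uploaded = 1 WHERE path = ?", (path,))
        else:
            conn.execute(
                "UPDATE uploads SET failures = failures + 1 WHERE path = ?",
                (path,),
            )


def pending(conn: sqlite3.Connection, known_paths: Iterable[str]) -> List[str]:
    """Return the paths that are new or not yet successfully uploaded."""
    done = {
        row[0]
        for row in conn.execute("SELECT path FROM uploads WHERE uploaded = 1")
    }
    return [p for p in known_paths if p not in done]
```

With this shape, "all new and all failed packages" is simply every known
path that does not have uploaded = 1, and the failures column gives the
count needed to decide when the service should exit with an error.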

I don't have a lot of time to work on this at the moment, but I'll see
what I can do.

How urgent is the cleanup on orion?  Is it ok to do it in a few weeks/months?

> PS: I've whitelisted you on the arch-devops ML so that your replies also
> get archived.

Ok, thanks!

Baptiste