[arch-dev-public] Uploading old packages on archive.org (Was: Archive cleanup)

Baptiste Jonglez baptiste at bitsofnetworks.org
Fri Jun 1 17:24:47 UTC 2018


Here is the progress on the upload of old packages to archive.org.
I uploaded a few packages to test if my script works:


There is one identifier for each package, and then all versions of the
package + their signatures are contained under this identifier (see "Show
all" on the right).
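
As an illustration of this one-identifier-per-package layout, here is a
minimal sketch of how files could be grouped before upload.  It assumes the
usual pkgname-pkgver-pkgrel-arch.pkg.tar.* file naming, with ".sig" appended
for signatures; the function names are hypothetical and this is not
necessarily how my script does it.

```python
from collections import defaultdict


def package_name(filename: str) -> str:
    """Return pkgname from 'pkgname-pkgver-pkgrel-arch.pkg.tar.*[.sig]'."""
    # Drop the extension (and the optional .sig suffix that follows it).
    base = filename.split(".pkg.tar", 1)[0]
    # pkgver, pkgrel and arch never contain "-", so split from the right.
    name, _pkgver, _pkgrel, _arch = base.rsplit("-", 3)
    return name


def group_by_package(filenames):
    """Map each package name to all of its versions and signatures."""
    groups = defaultdict(list)
    for filename in filenames:
        groups[package_name(filename)].append(filename)
    return dict(groups)
```

Each key of the resulting dictionary would then correspond to one
identifier, and its list to the files stored under that identifier.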

Now, to finish this, I have a few questions:

- does the devops team have a place to store passwords?  I would like to
  create an "Arch Linux" account, so that I'm not the only one to have
  access.  I also need an email address for the account, maybe something
  like internetarchive at archlinux.org or just the devops mailing list address?

- is it OK to upload ~2 TB from orion?  Is the server on a limited data

- I'm still waiting on a final confirmation from archive.org, whether they
  are OK with this amount of data.

The upload process itself is latency-bound: it takes 5-10 seconds to
upload a file, whatever its size.  For packages from 2013 to 2015 there
are 250k files to upload; I estimate it will take a few days if I run 32
upload threads in parallel.
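
For reference, the arithmetic behind such an estimate can be sketched as
follows.  This is only an idealized lower bound: it assumes the 32 threads
scale perfectly and that there are no retries, rate limiting or pauses, so
a real run would likely take noticeably longer.  The function name is
hypothetical.

```python
def estimated_hours(n_files: int, seconds_per_file: float, n_threads: int) -> float:
    """Idealized wall-clock time, assuming perfect scaling across threads."""
    return n_files * seconds_per_file / n_threads / 3600


# Figures from the message: 250k files, 5-10 s per file, 32 threads.
for latency in (5, 10):
    hours = estimated_hours(250_000, latency, 32)
    print(f"{latency} s/file: about {hours:.1f} hours")
```

Since the time per file does not depend on its size, the total is driven
by the number of files and the number of threads, not by the ~2 TB volume.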

By the way, we could even keep the year/month/day symlink hierarchy on
orion for old packages, and redirect downloads to archive.org.  There is
just a small issue with packages that have "+", "@" or "." in their name,
because these characters are not allowed in archive.org identifiers (see
the second and third examples above, where my script replaced the "@" with
"_").
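
A minimal sketch of what the name mapping and the redirect target could
look like is given below.  Several things here are assumptions: only the
"@" to "_" replacement is what my script is stated to do, so treating "+"
and "." the same way is a guess, using the sanitized package name directly
as the identifier is hypothetical, and the function names are made up for
the example.  The https://archive.org/download/<identifier>/<file> URL
layout is the standard archive.org download path.

```python
import re
from urllib.parse import quote

# Characters reported as not allowed in identifiers (per this message).
FORBIDDEN = re.compile(r"[+@.]")


def to_identifier(pkgname: str) -> str:
    """Replace each forbidden character with "_" (assumed for "+" and ".")."""
    return FORBIDDEN.sub("_", pkgname)


def download_url(pkgname: str, filename: str) -> str:
    """Build a redirect target; the file name itself is kept, URL-encoded."""
    return f"https://archive.org/download/{to_identifier(pkgname)}/{quote(filename)}"
```

Note that such a mapping is not reversible and could collide: a package
named "a+b" and one named "a_b" would end up with the same identifier, so
a real redirect rule would need to account for that.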
