[arch-dev-public] Large packages in repositories
dpmcgee at gmail.com
Tue Aug 17 10:39:14 EDT 2010
On Tue, Aug 17, 2010 at 9:28 AM, Thomas Bächler <thomas at archlinux.org> wrote:
> On 17.08.2010 16:12, Dan McGee wrote:
>> Hey guys,
>> A package went in so big today that it made reporead blow up on my
>> local database due to the installed size being > 2GB:
>> File "/usr/lib/python2.6/site-packages/django/db/backends/postgresql_psycopg2/base.py",
>> line 44, in execute
>> return self.cursor.execute(query, args)
>> django.db.utils.DatabaseError: integer out of range
> You should be able to fix that, right?
Yes, and it doesn't blow up on MySQL so not a huge rush. I was just
showing the error for the curious.
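
For the curious, a minimal sketch of why a size above 2 GB trips this error, assuming the column is a plain 32-bit signed integer (which is what PostgreSQL's "integer" type is). The helper name fits_int32 is made up for illustration and is not part of reporead or Django:

```python
# Assumption: installed_size is stored in a 32-bit signed integer column,
# the range of PostgreSQL's "integer" type.
INT32_MIN = -2**31       # -2147483648
INT32_MAX = 2**31 - 1    #  2147483647


def fits_int32(size_in_bytes):
    """Return True if the value can be stored in a 32-bit signed integer."""
    return INT32_MIN <= size_in_bytes <= INT32_MAX


two_gib = 2 * 1024**3    # 2147483648, exactly one past INT32_MAX

print(fits_int32(891768832))   # a size well under 2 GiB
print(fits_int32(two_gib))     # a size at 2 GiB
```

Any installed size of 2 GiB or more falls outside that range, so a 64-bit column type such as PostgreSQL's "bigint" would be one way to hold it.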
>> I'm wondering if we need to be more careful when it comes to these big
>> packages entering our repositories. This one is especially suspect as
>> of its 71096 files (and 71094 in the other architecture), a ton of
>> them are things like *.py, *.pyc, *.html, or *.png files. This is ripe
>> for splitting into a -data package (or not including some of this
>> junk, if it is that, at all).
> We partly discussed this on aur-general, and making sage smaller is a
> bit of a long-term task, if even possible. Anyway, dbscripts lack
> support for having split packages with one package being
> architecture-independent, so even splitting these data files away won't
> be easy.
That's news to me:
archweb=# select p.id, pkgname, a.name, r.name, installed_size,
compressed_size from packages p left join arches a on a.id = p.arch_id
left join repos r on r.id = p.repo_id where pkgname like 'nexuiz%' or
pkgname like 'vdrift%' or pkgname like 'sauerbraten%' or pkgname like
'openarena%' or pkgname like 'flightgear%' order by compressed_size
  id   |     pkgname      |  name  |   name    | installed_size |
-------+------------------+--------+-----------+----------------+
  9562 | nexuiz-data      | any    | Community |      891768832 |
 16811 | vdrift-data      | any    | Community |      593498112 |
 15925 | sauerbraten-data | any    | Community |      538304512 |
 16882 | openarena-data   | any    | Community |      345866240 |
 13406 | flightgear-data  | any    | Community |      572440576 |
  5490 | nexuiz           | x86_64 | Community |        6164480 |
  4879 | flightgear       | x86_64 | Community |       10555392 |
   711 | flightgear       | i686   | Community |       10272768 |
  1260 | nexuiz           | i686   | Community |        5562368 |
  6033 | sauerbraten      | x86_64 | Community |        3518464 |
  1822 | sauerbraten      | i686   | Community |        3420160 |
 16830 | vdrift           | x86_64 | Community |        2994176 |
 16812 | vdrift           | i686   | Community |        2899968 |
 16900 | openarena        | x86_64 | Community |        1937408 |
 16886 | openarena        | i686   | Community |        1593344 |
  4880 | flightgear-atlas | x86_64 | Community |         901120 |
   712 | flightgear-atlas | i686   | Community |         811008 |
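
To put a rough number on what the existing "any" -data packages save, here is a small sketch that sums the installed_size values listed above. It assumes the data would otherwise be shipped once per architecture (i686 and x86_64), and it uses installed size only as a proxy, since the compressed sizes that actually matter for mirrors are not shown here:

```python
# installed_size values (bytes) of the arch-independent -data packages,
# copied from the query output above.
data_packages = {
    "nexuiz-data": 891768832,
    "vdrift-data": 593498112,
    "sauerbraten-data": 538304512,
    "openarena-data": 345866240,
    "flightgear-data": 572440576,
}

# Assumption: without the split, each architecture would carry its own copy.
ARCHES = 2  # i686 and x86_64

shared_once = sum(data_packages.values())
duplicated = shared_once * ARCHES
saved = duplicated - shared_once

print("shipped once as 'any':", shared_once, "bytes")
print("if duplicated per arch:", duplicated, "bytes")
print("avoided duplication:   ", saved, "bytes")
```

By this estimate, the five -data packages together avoid duplicating a little under 3 GB of installed data.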
>> tl;dr: I think we need some standards with these huge packages, and
>> people need to be a lot more cognizant as to how big they are. We have
>> lost more than one mirror due to complaints over needed space and
>> stuff like this doesn't help.
> If a mirror cannot cope with a few GB, then it should be dropped anyway.
> Our repos will get bigger, one way or the other.
It isn't "a few GB"; it is one package taking up 1.5 GB between the
two architectures. That is, to me, a bit out of control, considering we
used to not even ship info pages to save package size.
I'm not saying "OMG take it out of the repos", but we need to at least
not let 10 more of these in without some serious thought as to what we
intend to package and distribute.