[arch-dev-public] Large packages in repositories
Dan McGee
dpmcgee at gmail.com
Tue Aug 17 10:39:14 EDT 2010
On Tue, Aug 17, 2010 at 9:28 AM, Thomas Bächler <thomas at archlinux.org> wrote:
> On 17.08.2010 16:12, Dan McGee wrote:
>> Hey guys,
>>
>> A package went in so big today that it made reporead blow up on my
>> local database due to the installed size being > 2GB:
>> http://www.archlinux.org/packages/community/i686/sage-mathematics/
>> http://www.archlinux.org/packages/community/x86_64/sage-mathematics/
>>
>> File "/usr/lib/python2.6/site-packages/django/db/backends/postgresql_psycopg2/base.py",
>> line 44, in execute
>> return self.cursor.execute(query, args)
>> django.db.utils.DatabaseError: integer out of range
>
> You should be able to fix that, right?
Yes, and it doesn't blow up on MySQL so not a huge rush. I was just
showing the error for the curious.
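
A minimal sketch of why a size above 2 GB produces "integer out of range": PostgreSQL's integer type is a signed 32-bit value, while bigint is signed 64-bit. The helper names and the example size below are hypothetical, chosen only for illustration; this is not the actual archweb code or the actual fix.

```python
# PostgreSQL "integer" is signed 32-bit; "bigint" is signed 64-bit.
INT4_MIN, INT4_MAX = -2**31, 2**31 - 1
INT8_MIN, INT8_MAX = -2**63, 2**63 - 1


def fits_integer(value):
    """True if value can be stored in a 32-bit signed integer column."""
    return INT4_MIN <= value <= INT4_MAX


def fits_bigint(value):
    """True if value can be stored in a 64-bit signed integer column."""
    return INT8_MIN <= value <= INT8_MAX


# Hypothetical installed size in bytes, picked only because it exceeds 2 GiB.
hypothetical_size = 3 * 1024**3

if __name__ == "__main__":
    print("fits integer:", fits_integer(hypothetical_size))
    print("fits bigint: ", fits_bigint(hypothetical_size))
```

Under these assumptions, widening the column to a 64-bit type would be one way to make such a row insertable.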
>> I'm wondering if we need to be more careful when it comes to these big
>> packages entering our repositories. This one is especially suspect as
>> of its 71096 files (and 71094 in the other architecture), a ton of
>> them are things like *.py, *.pyc, *.html, or *.png files. This is ripe
>> for splitting into a -data package (or not including some of this
>> junk, if it is that, at all).
>
> We partly discussed this on aur-general, and making sage smaller is a
> bit of a long-term task, if even possible. Anyway, dbscripts lack
> support for having split packages with one package being
> architecture-independent, so even splitting these data files away won't
> be easy.
That's news to me:
archweb=# select p.id, pkgname, a.name, r.name, installed_size, compressed_size
    from packages p
    left join arches a on a.id = p.arch_id
    left join repos r on r.id = p.repo_id
    where pkgname like 'nexuiz%' or pkgname like 'vdrift%'
       or pkgname like 'sauerbraten%' or pkgname like 'openarena%'
       or pkgname like 'flightgear%'
    order by compressed_size desc;
  id   |     pkgname      |  name  |   name    | installed_size | compressed_size
-------+------------------+--------+-----------+----------------+-----------------
  9562 | nexuiz-data      | any    | Community |      891768832 |       882807981
 16811 | vdrift-data      | any    | Community |      593498112 |       523473616
 15925 | sauerbraten-data | any    | Community |      538304512 |       443680072
 16882 | openarena-data   | any    | Community |      345866240 |       333849916
 13406 | flightgear-data  | any    | Community |      572440576 |       317831046
  5490 | nexuiz           | x86_64 | Community |        6164480 |         2757201
  4879 | flightgear       | x86_64 | Community |       10555392 |         2579360
   711 | flightgear       | i686   | Community |       10272768 |         2548000
  1260 | nexuiz           | i686   | Community |        5562368 |         2314704
  6033 | sauerbraten      | x86_64 | Community |        3518464 |         1142808
  1822 | sauerbraten      | i686   | Community |        3420160 |         1017648
 16830 | vdrift           | x86_64 | Community |        2994176 |          820512
 16812 | vdrift           | i686   | Community |        2899968 |          787368
 16900 | openarena        | x86_64 | Community |        1937408 |          601684
 16886 | openarena        | i686   | Community |        1593344 |          493280
  4880 | flightgear-atlas | x86_64 | Community |         901120 |          354938
   712 | flightgear-atlas | i686   | Community |         811008 |          329020
(17 rows)
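
As a rough sketch of what the "any" split buys a mirror, the snippet below sums the compressed sizes of the five -data packages from the query output and compares storing them once against storing a copy per architecture. The assumption that an unsplit package would carry a full copy of the data in each architecture is mine, and the function name is hypothetical.

```python
# Compressed sizes in bytes of the arch-independent -data packages,
# copied from the query output above.
DATA_PACKAGES = {
    "nexuiz-data": 882807981,
    "vdrift-data": 523473616,
    "sauerbraten-data": 443680072,
    "openarena-data": 333849916,
    "flightgear-data": 317831046,
}

ARCHES = ("i686", "x86_64")


def mirror_bytes(split):
    """Bytes a mirror stores for the data files.

    split=True:  one 'any' package shared by every architecture.
    split=False: assumed case where each architecture's package
                 bundles its own full copy of the data.
    """
    total = sum(DATA_PACKAGES.values())
    return total if split else total * len(ARCHES)


if __name__ == "__main__":
    saved = mirror_bytes(split=False) - mirror_bytes(split=True)
    print("stored once:     ", mirror_bytes(split=True))
    print("stored per arch: ", mirror_bytes(split=False))
    print("saved by split:  ", saved)
```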
>> tl;dr: I think we need some standards with these huge packages, and
>> people need to be a lot more cognizant as to how big they are. We have
>> lost more than one mirror due to complaints over needed space and
>> stuff like this doesn't help.
>
> If a mirror cannot cope with a few GB, then it should be dropped anyway.
> Our repos will get bigger, one way or the other.
It isn't "a few GB"; it is one package taking up 1.5 GB between the
two architectures. That is, to me, a bit out of control, considering we
used to not even ship info pages to save package size.
I'm not saying "OMG take it out of the repos", but we need to at least
not let 10 more of these in without some serious thought as to what we
intend to package and distribute.
-Dan