[arch-dev-public] Large packages in repositories

Dan McGee dpmcgee at gmail.com
Tue Aug 17 10:39:14 EDT 2010


On Tue, Aug 17, 2010 at 9:28 AM, Thomas Bächler <thomas at archlinux.org> wrote:
> On 17.08.2010 16:12, Dan McGee wrote:
>> Hey guys,
>>
>> A package went in so big today that it made reporead blow up on my
>> local database due to the installed size being > 2GB:
>> http://www.archlinux.org/packages/community/i686/sage-mathematics/
>> http://www.archlinux.org/packages/community/x86_64/sage-mathematics/
>>
>> File "/usr/lib/python2.6/site-packages/django/db/backends/postgresql_psycopg2/base.py",
>> line 44, in execute
>>     return self.cursor.execute(query, args)
>> django.db.utils.DatabaseError: integer out of range
>
> You should be able to fix that, right?

Yes, and it doesn't blow up on MySQL so not a huge rush. I was just
showing the error for the curious.
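
For the curious, a minimal sketch of why PostgreSQL rejects the value,
assuming the installed_size column is a plain `integer` (signed 32-bit);
the actual column type is not shown in this thread. The helper name and
the 2 GiB sample value are made up for illustration, and why MySQL
accepted the same insert is not covered here.

```python
# PostgreSQL's `integer` is a signed 32-bit type; `bigint` is signed 64-bit.
PG_INTEGER_MIN = -2**31
PG_INTEGER_MAX = 2**31 - 1   # 2147483647
PG_BIGINT_MAX = 2**63 - 1


def fits_pg_integer(value):
    """Return True if value can be stored in a PostgreSQL `integer` column."""
    return PG_INTEGER_MIN <= value <= PG_INTEGER_MAX


# Installed size of nexuiz-data, taken from the archweb query in this mail.
nexuiz_data_installed = 891768832

# Hypothetical installed size of exactly 2 GiB; the real sage-mathematics
# figure is not quoted in this thread.
two_gib = 2 * 1024**3

print(fits_pg_integer(nexuiz_data_installed))  # fits in 32 bits
print(fits_pg_integer(two_gib))                # does not fit in 32 bits
print(two_gib <= PG_BIGINT_MAX)                # fits in a 64-bit column
```

Under that assumption, widening the column to a 64-bit type would be one
way to make the error go away.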

>> I'm wondering if we need to be more careful when it comes to these big
>> packages entering our repositories. This one is especially suspect as
>> of its 71096 files (and 71094 in the other architecture), a ton of
>> them are things like *.py, *.pyc, *.html, or *.png files. This is ripe
>> for splitting into a -data package (or not including some of this
>> junk, if it is that, at all).
>
> We partly discussed this on aur-general, and making sage smaller is a
> bit of a long-term task, if even possible. Anyway, dbscripts lack
> support for having split packages with one package being
> architecture-independent, so even splitting these data files away won't
> be easy.

That's news to me:

archweb=# select p.id, pkgname, a.name, r.name, installed_size, compressed_size
          from packages p
          left join arches a on a.id = p.arch_id
          left join repos r on r.id = p.repo_id
          where pkgname like 'nexuiz%' or pkgname like 'vdrift%'
             or pkgname like 'sauerbraten%' or pkgname like 'openarena%'
             or pkgname like 'flightgear%'
          order by compressed_size desc;
  id   |     pkgname      |  name  |   name    | installed_size | compressed_size
-------+------------------+--------+-----------+----------------+-----------------
  9562 | nexuiz-data      | any    | Community |      891768832 |       882807981
 16811 | vdrift-data      | any    | Community |      593498112 |       523473616
 15925 | sauerbraten-data | any    | Community |      538304512 |       443680072
 16882 | openarena-data   | any    | Community |      345866240 |       333849916
 13406 | flightgear-data  | any    | Community |      572440576 |       317831046
  5490 | nexuiz           | x86_64 | Community |        6164480 |         2757201
  4879 | flightgear       | x86_64 | Community |       10555392 |         2579360
   711 | flightgear       | i686   | Community |       10272768 |         2548000
  1260 | nexuiz           | i686   | Community |        5562368 |         2314704
  6033 | sauerbraten      | x86_64 | Community |        3518464 |         1142808
  1822 | sauerbraten      | i686   | Community |        3420160 |         1017648
 16830 | vdrift           | x86_64 | Community |        2994176 |          820512
 16812 | vdrift           | i686   | Community |        2899968 |          787368
 16900 | openarena        | x86_64 | Community |        1937408 |          601684
 16886 | openarena        | i686   | Community |        1593344 |          493280
  4880 | flightgear-atlas | x86_64 | Community |         901120 |          354938
   712 | flightgear-atlas | i686   | Community |         811008 |          329020
(17 rows)
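
A rough tally of the rows above, as a sketch only: the compressed sizes
are copied from the query output, while the variable and function names
are made up here and are not part of archweb. It assumes an `any`
package is stored once on a mirror, whereas an architecture-specific
package is stored once per architecture.

```python
# (pkgname, arch, compressed_size in bytes), copied from the query output.
rows = [
    ("nexuiz-data", "any", 882807981),
    ("vdrift-data", "any", 523473616),
    ("sauerbraten-data", "any", 443680072),
    ("openarena-data", "any", 333849916),
    ("flightgear-data", "any", 317831046),
    ("nexuiz", "x86_64", 2757201),
    ("flightgear", "x86_64", 2579360),
    ("flightgear", "i686", 2548000),
    ("nexuiz", "i686", 2314704),
    ("sauerbraten", "x86_64", 1142808),
    ("sauerbraten", "i686", 1017648),
    ("vdrift", "x86_64", 820512),
    ("vdrift", "i686", 787368),
    ("openarena", "x86_64", 601684),
    ("openarena", "i686", 493280),
    ("flightgear-atlas", "x86_64", 354938),
    ("flightgear-atlas", "i686", 329020),
]


def total_compressed(rows, arch_independent):
    """Sum compressed sizes of either the `any` rows or the per-arch rows."""
    return sum(
        size for _, arch, size in rows
        if (arch == "any") == arch_independent
    )


data_total = total_compressed(rows, arch_independent=True)
arch_total = total_compressed(rows, arch_independent=False)

print("arch=any packages:      %d bytes" % data_total)
print("per-arch packages:      %d bytes" % arch_total)

# Assumption: without the split, the data would ship inside both the i686
# and the x86_64 package, so the mirror would carry it a second time.
print("saved by splitting out: %d bytes" % data_total)
```

For these five games nearly all of the bytes sit in the `any` packages,
which is the argument for splitting data out in the first place.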

>> tl;dr: I think we need some standards with these huge packages, and
>> people need to be a lot more cognizant as to how big they are. We have
>> lost more than one mirror due to complaints over needed space and
>> stuff like this doesn't help.
>
> If a mirror cannot cope with a few GB, then it should be dropped anyway.
> Our repos will get bigger, one way or the other.

It isn't "a few GB"; it is one package taking up 1.5 GB between the
two architectures. That is, to me, a bit out of control, considering we
used to not even ship info pages to save package size.

I'm not saying "OMG take it out of the repos", but we need to at least
not let 10 more of these in without some serious thought as to what we
intend to package and distribute.

-Dan

