Re: [arch-general] [arch-dev-public] Large packages in repositories
On 17.08.2010 16:28, Thomas Bächler wrote:
On 17.08.2010 16:12, Dan McGee wrote:
tl;dr: I think we need some standards with these huge packages, and people need to be a lot more cognizant as to how big they are. We have lost more than one mirror due to complaints over needed space and stuff like this doesn't help.
If a mirror cannot cope with a few GB, then it should be dropped anyway. Our repos will get bigger, one way or the other.
I share this opinion. The Arch repos are hardly large, and even if we added 50 GB to them they still wouldn't be that large. I know comparisons with other distributions are probably not a good idea on this list, but they do help to get a general understanding of where we stand and what "large" really means.
Debian - 428GB (http://www.debian.org/mirror/size)
Fedora - 653GB (http://download.fedora.redhat.com/pub/DIRECTORY_SIZES.txt)
Ubuntu - 421GB (https://wiki.ubuntu.com/Mirrors)
openSUSE - 500GB+ (http://en.opensuse.org/openSUSE:Mirror_infrastructure)
Thankfully, we don't keep around old releases of packages or ISOs. Our mirrors will never have to cope with the amount of data that other distributions make them cope with. Since they are already mirroring 2TB+ worth of data from other distros, I think they can easily squeeze in 50GB of Arch, or more.
I'd really like to resolve this problem in the course of this discussion, since it has been brought up every time big packages are pending (mostly games). Personally, I think we don't need a policy or anything on this. Something very odd would have to happen for our repos to grow too much for our mirrors to handle. This is Arch, let's keep it simple and unbureaucratic.
I'm not saying "let's throw all that big shit into there", but if there are potential packages that would improve the user experience, their size should not be the determining factor in their inclusion.
-- Sven-Hendrik
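The "2TB+" figure in the message above can be checked with a quick calculation. The following is a minimal sketch that only uses the four sizes quoted in the message; the function names are made up for illustration, and it assumes (as the message does) a mirror that carries all four distributions.

```python
# Repository sizes in GB, as quoted in the message above (August 2010).
mirror_sizes_gb = {
    "Debian": 428,
    "Fedora": 653,
    "Ubuntu": 421,
    "openSUSE": 500,  # quoted as "500GB+", so this is a lower bound
}


def total_gb(sizes):
    """Combined size in GB of all listed repositories."""
    return sum(sizes.values())


def share_percent(extra_gb, sizes):
    """Extra data expressed as a percentage of the combined total."""
    return 100.0 * extra_gb / total_gb(sizes)


total = total_gb(mirror_sizes_gb)
print(f"Combined size of the four distros: {total} GB")
print(f"50 GB of Arch relative to that: {share_percent(50, mirror_sizes_gb):.1f}%")
```

This only covers disk space on a mirror that already hosts all four distributions; it says nothing about bandwidth or about mirrors that carry Arch alone.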
On Wed, 18 Aug 2010 09:19:09 +0200, Sven-Hendrik Haase <sh@lutzhaase.com> wrote:
[...]
You need to keep in mind that it's not just disk space that might cause problems here; traffic and especially bandwidth do too. E.g. our main server has about 10mbit/d of bandwidth, including mirroring, the website, etc.
-- Pierre Schmitz, https://users.archlinux.de/~pierre
On 18 August 2010 16:37, Pierre Schmitz <pierre@archlinux.de> wrote:
[...]
And it all comes down to paying the bills to better cope with the load. So the final authority on this can only be those who handle the financial matters. Other than that, size is not an issue. Big distros have good funding, so for them size really is not an issue.
-- GPG/PGP ID: B42DDCAD
On 18 August 2010 16:37, Pierre Schmitz <pierre@archlinux.de> wrote:
[...]
One important issue which hasn't been raised is that the bigger a package is, the longer it takes to sync that one package. Today's sync took me 53 minutes; the previous highest time I can remember is about 20 minutes, and that was with the recent push of KDE 4.5 (beta?) into testing. I'm lucky in that I don't suffer any issues yet, but I can imagine that for others with smaller mirrors and lower bandwidth, everything else on the server is slowed during this period because of this one instance of rsync.
Not only that, but some of us have contracts to abide by which restrict the use of long-running processes beyond reasonable use. So while multiple syncs can be done as a work-around, it'd be nice to at least discuss some sort of policy here, even if it's just that a notification should be sent to arch-mirrors when exceptionally large packages are about to go into the repos. I'm aware this is a one-off, but a few years ago I'd have said the same thing about large games being a one-off.
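To make the point about sync duration concrete, here is a rough sketch of how transfer time scales with package size on a bandwidth-limited link. The function name and the sizes and link speed in the example are hypothetical values chosen for illustration, not figures from this thread. The estimate also ignores protocol overhead, rsync delta savings, and competing traffic.

```python
def transfer_minutes(size_gb, link_mbit_per_s):
    """Minutes needed to move size_gb at a sustained link_mbit_per_s.

    Uses decimal units (1 GB = 8000 Mbit) and assumes the link is
    fully available to this one transfer.
    """
    if link_mbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    seconds = size_gb * 8000 / link_mbit_per_s
    return seconds / 60


# Hypothetical example values, not measurements from the thread.
for size_gb in (0.5, 1, 5):
    minutes = transfer_minutes(size_gb, 10)
    print(f"{size_gb} GB at 10 Mbit/s: about {minutes:.0f} min")
```

Because the relationship is linear, a single very large package can dominate an otherwise short sync run on a slow link.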
participants (4)
- Nathan Wayde
- Pierre Schmitz
- Ray Rashif
- Sven-Hendrik Haase