poking around, i ran into this, in case it is of interest: sorting the list before compressing makes the gzip output about 19% smaller.
----
% zcat packages.gz | grep -v '#' | gzip | wc
1365 8020 351564
% zcat packages.gz | grep -v '#' | sort | gzip | wc
1022 6173 284473
----
(only the last column -- bytes -- of the wc(1) output is meaningful.)
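
to repeat the comparison on another file, here is a minimal sketch. the function name gzsizes is made up for illustration, and it uses gzip -dc in place of zcat (same effect, but on some systems zcat only looks for .Z files). wc -c prints just the byte count, so there are no other columns to ignore.

```shell
# gzsizes FILE.gz  (hypothetical helper, not part of any tool)
# prints the gzip'd size in bytes of the non-comment lines,
# first in their original order, then sorted.
gzsizes() {
    # gzip -dc decompresses to stdout, like zcat
    unsorted=$(gzip -dc "$1" | grep -v '#' | gzip | wc -c | tr -d ' ')
    sorted=$(gzip -dc "$1" | grep -v '#' | sort | gzip | wc -c | tr -d ' ')
    echo "unsorted: $unsorted"
    echo "sorted: $sorted"
}
```

usage would be `gzsizes packages.gz`. how much sorting helps (if at all) depends on the data; the numbers above are for this one file only.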
cheers.