[pacman-dev] Pacman database size study

Allan McRae allan at archlinux.org
Wed Jan 22 22:02:35 UTC 2020


On 23/1/20 2:03 am, Anatol Pomozov wrote:
> Hello
> 
> On Wed, Jan 22, 2020 at 2:23 AM Allan McRae <allan at archlinux.org> wrote:
>>
>> On 22/1/20 6:54 pm, Anatol Pomozov wrote:
>>> The first experiment is to parse db tarfile using the script and then
>>> write it back to a file:
>>>   uncompressed size is 17757184 that is equal to original sample
>>>   'zstd -19' compressed size is 4366994 that is 1.0084540990896713
>>> times better than original sample
>>>
>>> Tar *entries* content is identical to the original file. Uncompressed
>>> size is exactly the same. Compressed (zstd -19) size is 0.8% better.
>>> It comes from the fact that my script sets neither the entries'
>>> user/group values nor their modification times. I am not sure if
>>> this information is actually used by pacman. Modification times
>>> contain a lot of entropy, which the compressor does not like.
>>
>> tl;dr
>>
>> "original"      4366994
>> no md5          4188019
>> no pgp          1160912
>> no md5+pgp      1021667
>>
>>
>> But do any of these numbers stand if you keep the tar file?
> 
> I do not fully understand your question here. plainXXX+uncompressed is
> a TAR file that matches the current db format.
> 

Oops...  Did not look far enough down your supplied files.  I downloaded
db.original from your link, which is not the original, and thought your
numbers were based off that.
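
For reference, the tar metadata effect mentioned in the quoted text (same
uncompressed size, slightly smaller compressed size once mtime and
user/group are left unset) can be reproduced with a small sketch. It is
only an illustration under stated assumptions: the entry names and desc
contents are made up, the mtimes are random stand-ins, and zlib is used
in place of 'zstd -19' so that only the Python standard library is needed.
It is not the script used for the numbers above.

```python
import io
import random
import tarfile
import zlib


def build_db(normalize: bool, entries: int = 200) -> bytes:
    """Build an uncompressed tar in memory that loosely mimics a sync db.

    Entry names and contents are hypothetical placeholders.
    """
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for i in range(entries):
            data = f"%NAME%\npkg{i}\n\n%VERSION%\n1.0-1\n".encode()
            info = tarfile.TarInfo(name=f"pkg{i}-1.0-1/desc")
            info.size = len(data)
            if normalize:
                # No timestamp and no owner information in the header.
                info.mtime = 0
                info.uid = info.gid = 0
                info.uname = info.gname = ""
            else:
                # Per-entry timestamps are the high-entropy part.
                info.mtime = rng.randrange(1_500_000_000, 1_580_000_000)
                info.uid = info.gid = 1000
                info.uname = info.gname = "builder"
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def compressed_size(raw: bytes) -> int:
    """Size after compression (zlib level 9 as a stand-in for zstd -19)."""
    return len(zlib.compress(raw, 9))


with_meta = build_db(normalize=False)
without_meta = build_db(normalize=True)

print("uncompressed:", len(with_meta), len(without_meta))
print("compressed:  ", compressed_size(with_meta), compressed_size(without_meta))
```

The header fields are fixed width, so the uncompressed sizes match, and
only the compressed sizes differ.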

A

