On Sat, 8 Dec 2007 21:36:54 -0500 Nathan Jones <nathanj@insightbb.com> wrote:
A few suggestions:
1. Make sure to test with packages that contain hyphens, like 'gcc-libs'. Your regular expression does not work well with those packages.

OK. I tried using ' - ' instead of '-', but in the end I opted for '@' as the separator. I hope this will also be useful when I do the search in C code.
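To illustrate why '@' is a safe separator for hyphenated names, here is a minimal C sketch. It assumes a hypothetical index line of the form `<name>@<byte offset>` (for example `gcc-libs@1024`); the real index layout in the patch may differ, and `struct index_entry` and `parse_index_line()` are names made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index line layout: "<name>@<byte offset>", e.g. "gcc-libs@1024". */
struct index_entry {
    char name[256];
    long offset;
};

/* Parses one index line. Returns 0 on success, -1 if the line is malformed. */
int parse_index_line(const char *line, struct index_entry *out)
{
    assert(line != NULL && out != NULL);

    /* Split at the last '@', so hyphens in the name are never a separator. */
    const char *sep = strrchr(line, '@');
    if (sep == NULL || sep == line)
        return -1;

    size_t name_len = (size_t)(sep - line);
    if (name_len >= sizeof out->name)
        return -1;

    /* The number must be followed by end of string or a newline. */
    char *end;
    long offset = strtol(sep + 1, &end, 10);
    if (end == sep + 1 || (*end != '\0' && *end != '\n') || offset < 0)
        return -1;

    memcpy(out->name, line, name_len);
    out->name[name_len] = '\0';
    out->offset = offset;
    return 0;
}
```

Since '@' should not appear in a package name, no regular expression is needed to tell the name from the number.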
2. Store the actual byte offsets in the index file rather than (or in addition to) the line numbers. It is easier to seek to a position than a line number; see the man page for fseek.

Right. It now stores the byte offset as well.
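A short C sketch of the lookup that a stored byte offset makes possible: seek straight to the record instead of counting lines. `read_line_at()` is a hypothetical helper written for this example, not code from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the line that starts at byte `offset` of `db` into `buf`.
 * Returns 0 on success, -1 if the seek or the read fails. */
int read_line_at(FILE *db, long offset, char *buf, size_t buf_size)
{
    assert(db != NULL && buf != NULL && buf_size > 0);

    /* Jump directly to the record; no earlier lines are read. */
    if (fseek(db, offset, SEEK_SET) != 0)
        return -1;
    if (fgets(buf, (int)buf_size, db) == NULL)
        return -1;
    return 0;
}
```

With a line number, the reader would have to scan every preceding line; with an offset, the cost of the lookup does not depend on where the record sits in the file.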
3. You call writeIndexEntry() n times (n = # of pkgs), and each call reads in the entire huge database file. Change it so that it is only read once. Once you do this, you should find that the tot_lines being passed to the script is unnecessary. Pseudocode:
[cut]

Done. Patch attached, but it is based on the previous patch. Would you like a patch based directly on master? If so, tell me and I will rebuild it :D
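For reference, the "read the database only once" idea from suggestion 3 could look roughly like this in C. This is only a sketch under assumptions: the database layout (each record starting with a `%NAME%` line, followed by the package name on the next line) and the `name@offset` index format are hypothetical, and `build_index()` is not a function from the attached patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed (hypothetical) database layout: every record starts with a line
 * "%NAME%", and the package name is on the line right after it.
 * Lines are assumed to be shorter than the 1024-byte buffer.
 * Writes one "name@offset" line per record and returns the record count. */
int build_index(FILE *db, FILE *index)
{
    assert(db != NULL && index != NULL);

    char line[1024];
    int entries = 0;
    int expect_name = 0;
    long record_start = 0;

    for (;;) {
        /* Offset of the line we are about to read. */
        long line_start = ftell(db);
        if (line_start < 0 || fgets(line, sizeof line, db) == NULL)
            break;

        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */

        if (expect_name) {
            fprintf(index, "%s@%ld\n", line, record_start);
            entries++;
            expect_name = 0;
        } else if (strcmp(line, "%NAME%") == 0) {
            record_start = line_start;
            expect_name = 1;
        }
    }
    return entries;
}
```

The whole file is read in a single pass, so the total line count never has to be known in advance.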
I am interested in seeing what the performance differences would be between this, the current backend, and a tar backend (FS#8586), so keep it up and good luck.
Thanks!

-- 
JJDaNiMoTh - ArchLinux Trusted User