[pacman-dev] proof of concept code with bsd db4
dpmcgee at gmail.com
Sun Nov 8 22:50:25 EST 2009
On Sat, Oct 31, 2009 at 11:37 AM, solsTiCe d'Hiver
<solstice.dhiver at gmail.com> wrote:
> i wanted to make a new bsd db4 back-end for alpm. but i never reached my
> goal. and will not
> all i have is a proof of concept code that use bsd db4 api to store
> pmpkg_t and wanted to share it with anyone (interested ?)
> i have coded 3 utilities:
> - one that converts pacman's db into a bsd db4 file for each repo
> - one that reads that new db format to perform query as pacman does
> - one that converts directly a tarball db (taken from a sync mirror)
> into a bsd db4 file
> if this proves useful for someone, great.
> More info at http://pagesperso-orange.fr/solstice.dhiver/alpmdb4.html
> and in the README of
Nice work on actually doing something here and sharing the code!
Thanks, as it might just make some wheels turn for some other people
here on the list.
I grabbed your code and took it for a spin. I liked that you included
a README; I didn't have much trouble at all getting it running. I even
found a real hotspot in readdb: add_sorted is a killer in a tight loop,
and it makes a lot more sense to do all your adds first, followed by a
single alpm_list_msort().
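To make the add_sorted point concrete, here is a rough sketch of the two
approaches. It does not use the real libalpm list API: node_t,
list_add_sorted, list_push and list_msort are hypothetical stand-ins
written just for this illustration. Sorted insertion walks the list on
every add (quadratic overall), while pushing everything and sorting once
is O(n log n).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a package list node; not the libalpm type. */
typedef struct node {
    const char *name;
    struct node *next;
} node_t;

static node_t *node_new(const char *name, node_t *next)
{
    node_t *n = malloc(sizeof *n);
    assert(n != NULL);
    n->name = name;
    n->next = next;
    return n;
}

/* Slow path: walk the list to find the slot on every add, O(n) each. */
node_t *list_add_sorted(node_t *head, const char *name)
{
    node_t **p = &head;
    while (*p && strcmp((*p)->name, name) <= 0)
        p = &(*p)->next;
    *p = node_new(name, *p);
    return head;
}

/* Fast path, step 1: push at the head in O(1), ignoring order. */
node_t *list_push(node_t *head, const char *name)
{
    return node_new(name, head);
}

/* Merge two already-sorted lists into one sorted list. */
static node_t *merge(node_t *a, node_t *b)
{
    node_t *head = NULL, **tail = &head;
    while (a && b) {
        if (strcmp(a->name, b->name) <= 0) { *tail = a; a = a->next; }
        else { *tail = b; b = b->next; }
        tail = &(*tail)->next;
    }
    *tail = a ? a : b;
    return head;
}

/* Fast path, step 2: a single merge sort over the whole list. */
node_t *list_msort(node_t *head)
{
    if (!head || !head->next)
        return head;
    /* Split the list in half with slow/fast pointers. */
    node_t *slow = head, *fast = head->next;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }
    node_t *right = slow->next;
    slow->next = NULL;
    return merge(list_msort(head), list_msort(right));
}

/* Returns 1 if both lists hold the same names in the same order. */
int list_equal(const node_t *a, const node_t *b)
{
    while (a && b) {
        if (strcmp(a->name, b->name) != 0)
            return 0;
        a = a->next;
        b = b->next;
    }
    return a == NULL && b == NULL;
}
```

Both paths end up with the same ordered list; the difference only shows
up in how the cost grows with the number of packages in a repo.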
For others on the list who haven't looked at it yet:
* On raw speed alone, this wins. Of course, pacman does a lot more (this
isn't parsing conf files, reading mirrorlists, etc.), but a "-Ss pacman"
search took 0.083 seconds with this code vs 0.282 seconds with pacman
(in the hot cache case, of course).
* For those who aren't familiar, BDB stores key/value pairs. The database
layout could probably be simplified a bit: the attributes we don't use
all that often, or that we only look up and never search by, could be
packed together into a single key/value pair.
* It didn't take all that much code to do this. That is encouraging.
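As a rough illustration of the packing idea in the second point, the
sketch below stores several rarely-used attributes as consecutive
NUL-terminated strings in one buffer, which could then be written as the
value of a single key. It deliberately does not call the BDB API;
pack_fields and unpack_field are hypothetical helpers, and the field
choice is only an example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Pack `count` strings into `buf` as consecutive NUL-terminated fields.
 * Returns the number of bytes written, or 0 if `buf` is too small.
 */
size_t pack_fields(char *buf, size_t cap,
                   const char *const *fields, size_t count)
{
    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(fields[i]) + 1; /* include the NUL */
        if (len > cap - used)
            return 0;
        memcpy(buf + used, fields[i], len);
        used += len;
    }
    return used;
}

/*
 * Return a pointer to field number `idx` inside a packed buffer of
 * `size` bytes, or NULL if there is no such field or the buffer is
 * not properly terminated.
 */
const char *unpack_field(const char *buf, size_t size, size_t idx)
{
    size_t off = 0;
    while (off < size) {
        const char *end = memchr(buf + off, '\0', size - off);
        if (!end)
            return NULL; /* malformed: missing terminator */
        if (idx == 0)
            return buf + off;
        idx--;
        off = (size_t)(end - buf) + 1;
    }
    return NULL;
}
```

The trade-off is that a packed value can only be fetched by its key and
then unpacked; anything we want to search on would still need its own
key/value pair.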
What do people think about non-file-system-based backends? There are
several options we could think about:
* BSD DB4, similar to what was done here (fast and pretty simple)
* SQLite, which might give us a bit more flexibility for querying/lookup
* Direct tarfile parsing each time, no conversion needed but likely
The biggest reason always raised in the past against non-file backends
was corruption. If you get a corrupted localdb or something you can't
recover from, you are in a bad place. With files, you have the lowest
barrier to recovery. With a more binary format, it is a lot trickier.