Hey. I got interested in this last week and started breaking libalpm apart, trying to fit in an sqlite-ish implementation. The code was new to me and my only concern was to get something working as fast as possible, so the result is nasty. Basically, I first commented out treename from struct __pmdb_t so the compiler would tell me all (or most of) the places where the old db is used, and then either disabled those functions or reimplemented them with sqlite. The main additions are in be_sqlite.c (renamed from be_files.c), where _alpm_db_open opens the sqlite db, and in db.c, where _alpm_db_search executes a simple SELECT * FROM packages WHERE name LIKE "%foo%" and populates the return list.
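To make the search step concrete, here is a minimal sketch of that kind of query, written in Python with the stdlib sqlite3 module rather than in libalpm's C. The packages table layout (name, version, description) and the function names are assumptions for illustration only, not the schema or code of the actual patch. Like the query above, it only matches on the package name.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory db with a hypothetical 'packages' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE packages ("
        "name TEXT PRIMARY KEY, version TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO packages VALUES (?, ?, ?)",
        [
            ("gcc", "4.2.2-1", "The GNU Compiler Collection"),
            ("gcc-libs", "4.2.2-1", "Runtime libraries shipped by GCC"),
            ("glibc", "2.7-1", "GNU C Library"),
            ("zlib", "1.2.3-3", "Compression library"),
        ],
    )
    conn.commit()
    return conn


def search_packages(conn, needle):
    """Return (name, version, description) rows whose name contains needle."""
    # Bind the pattern as a parameter instead of pasting it into the SQL text,
    # so a search term containing quotes cannot break the statement.
    pattern = "%" + needle + "%"
    cur = conn.execute(
        "SELECT name, version, description FROM packages "
        "WHERE name LIKE ? ORDER BY name",
        (pattern,),
    )
    return cur.fetchall()


def demo():
    conn = make_demo_db()
    for name, version, description in search_packages(conn, "gcc"):
        print(name, version)
        print("    " + description)
```

Calling demo() lists the two gcc entries. In C the same shape would be a prepared statement with a bound parameter, stepped row by row while appending to the result list.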
So, I implemented about 40% of pacman -Ss. If someone cares about timings (and you probably shouldn't, since my version doesn't do quite the same thing), here they are:
Huh, this "sqlite backend idea" is quite popular nowadays. Imho several people are working on an implementation, so I suggest co-operation ;-)
(running pacman -Ss g three times after a reboot)
pacman-3.1: 41.866s, 0.765s, 0.762s
mutilated-pacman-with-sqlite: 1.036s, 0.131s, 0.133s
The first (worst) run of pacman-3.1 is probably slower here than it usually would be, since my /var/ was 99% full at the time :)
Anyway. The timing is not the most important issue, I think. libalpm has a lot of code that is merely there because C sucks for things like string and directory manipulation. And we need to do a lot of that. My humble guess is that a proper implementation of libalpm done with sqlite could be at least 50% smaller with a more understandable codebase.
If we want to do this, then how? Some options off the top of my head:
1) for the parts that deal with the db, start from scratch. With the talent you guys have, shouldn't be a problem? Libalpm isn't very large...
2) for the development phase, consider sqlite to be a cache for the filedb, and gradually move each piece of code to the other side. This way, the legacy code would weigh us down a bit, but the change might be more sustainable.
3) Just hack in the functionality somehow, anyhow.
4) Refactor alpm to support different backends and implement whatever backend du jour.
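As a rough illustration of option 2, the sketch below treats sqlite purely as a cache that can be rebuilt from the file db at any time, so the files remain the source of truth. It is Python rather than C, and it assumes the file db is a directory of per-package subdirectories, each holding a desc file made of %FIELD% headers followed by value lines and a blank line. The table layout and all function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def parse_desc(text):
    """Parse '%FIELD%' sections into a dict of field name -> list of values."""
    fields, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None                      # a blank line ends a section
        elif len(line) > 2 and line.startswith("%") and line.endswith("%"):
            current = line.strip("%")           # start of a new section
            fields[current] = []
        elif current is not None:
            fields[current].append(line)        # a value in the open section
    return fields


def first(fields, key):
    """Return the first value of a field, or an empty string if absent."""
    values = fields.get(key, [])
    return values[0] if values else ""


def rebuild_cache(db_dir, conn):
    """Repopulate the sqlite cache from the file db; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages ("
        "name TEXT PRIMARY KEY, version TEXT, description TEXT)"
    )
    count = 0
    with conn:  # one transaction: the cache is replaced fully or not at all
        conn.execute("DELETE FROM packages")
        for entry in sorted(os.listdir(db_dir)):
            desc_path = os.path.join(db_dir, entry, "desc")
            if not os.path.isfile(desc_path):
                continue                        # skip anything without a desc
            with open(desc_path, encoding="utf-8") as fh:
                fields = parse_desc(fh.read())
            conn.execute(
                "INSERT INTO packages VALUES (?, ?, ?)",
                (first(fields, "NAME"), first(fields, "VERSION"),
                 first(fields, "DESC")),
            )
            count += 1
    return count


def write_demo_filedb(root):
    """Create a tiny fake file db under root, for demonstration only."""
    demo = {
        "gcc-4.2.2-1": "%NAME%\ngcc\n\n%VERSION%\n4.2.2-1\n\n%DESC%\nThe GNU Compiler Collection\n",
        "zlib-1.2.3-3": "%NAME%\nzlib\n\n%VERSION%\n1.2.3-3\n\n%DESC%\nCompression library\n",
    }
    for dirname, text in demo.items():
        os.makedirs(os.path.join(root, dirname))
        with open(os.path.join(root, dirname, "desc"), "w", encoding="utf-8") as fh:
            fh.write(text)
```

With this split, read-only code paths such as searching could move to sqlite first, while everything that writes keeps using the file db and simply triggers a rebuild afterwards.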
Ideas, praise, flames welcome. Code available on request.
First of all, I appreciate your work/attempt to make pacman better. Well, I'm pretty sure that sqlite 1. is faster and 2. reduces the codebase (finding replacements, checking for provisions, groups etc. can be reduced to simple SQL queries), but we haven't got reassuring answers to our "database corruption" fear. So I would like to ask you to convince us that sqlite is safe (personally, I have very limited sql[ite] knowledge at the moment). Please understand that for most of the pacman devels/contributors "stability" is more important than speed [obviously, corrupted localdb == unusable system]. That's why I'd guess this idea won't be accepted until we see proof that the new db back-end is as safe as the old one (or more safer ;-P). Bye
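One piece of the safety question can at least be illustrated: sqlite groups writes into transactions, so a multi-step update that fails halfway can be rolled back as a unit. The sketch below is Python with the stdlib sqlite3 module; the packages/files tables and the install_package function are hypothetical. It only shows rollback after an application-level error. It does not demonstrate behaviour under power loss or disk failure, which is what the corruption worry is really about.

```python
import sqlite3


def make_db():
    """Create an in-memory db with hypothetical 'packages' and 'files' tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT)")
    conn.execute("CREATE TABLE files (package TEXT, path TEXT)")
    conn.commit()
    return conn


def install_package(conn, name, version, paths, fail_midway=False):
    """Record a package and its files as one all-or-nothing transaction."""
    # 'with conn' commits if the block finishes and rolls back if an
    # exception escapes it, so a half-recorded package is never left behind.
    with conn:
        conn.execute(
            "INSERT INTO packages (name, version) VALUES (?, ?)",
            (name, version),
        )
        for index, path in enumerate(paths):
            if fail_midway and index == 1:
                raise RuntimeError("simulated failure during install")
            conn.execute(
                "INSERT INTO files (package, path) VALUES (?, ?)",
                (name, path),
            )


def count(conn, table):
    """Return the number of rows in one of the two demo tables."""
    assert table in ("packages", "files")
    return conn.execute("SELECT COUNT(*) FROM " + table).fetchone()[0]


def demo():
    conn = make_db()
    try:
        install_package(conn, "zlib", "1.2.3-3",
                        ["usr/lib/libz.so", "usr/include/zlib.h"],
                        fail_midway=True)
    except RuntimeError:
        pass
    # Both counts are 0 here: the package row was rolled back with the files.
    print(count(conn, "packages"), count(conn, "files"))
    # sqlite's built-in consistency check of the database structure.
    print(conn.execute("PRAGMA integrity_check").fetchall())
```

Whether this is "as safe as the old one" would still need testing on real hardware, for example by killing the process or cutting power mid-transaction and then running the integrity check.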