On 10/26/06, Roman Kyrylych <roman.kyrylych@gmail.com> wrote:
> You don't need to do complex indexing. Indexing by package name is enough. By "small, fast, embeddable" I mean that it's a better fit than MySQL or something similar. And it's better than dbm-style databases when it comes to random writes and transactions.
What I'm saying is that you take on the overhead of a full-scale database for little gain. Indexing by package name is great and all, but it doesn't help with the slowest-of-the-slow operation, -Ss. -Ss searches package names AND descriptions, and allows regex matching. SQLite (like a number of other DBs) does not support regex matching out of the box. An index on package name means little here, because a -Ss "foo" still SHOULD match a package named "barfoo" as well as a package with "foo" in its description. That means a sequential search.

Not only that, but because the DB itself will not do the regex, one must iterate over each and every entry, fetch the values at the C level, apply the regex pattern, and note whether it matches. The only speed you gain is in the initial opening of the files.

To me, this does not mean a DB backend is the solution. It may be better than the files backend, yes, but it is not the best option, and outperforming the files backend is not hard... I was able to improve performance roughly sixfold simply by reading the db.tar.gz files in place of the disparate text files.
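
To make the -Ss point concrete, here is a rough sketch, in Python rather than pacman's C purely for brevity. The `packages` table, its `name` and `description` columns, the sample rows, and the `search` helper are all made up for illustration; this is not pacman's actual schema or code. It only shows that, even with `name` indexed, the search still has to walk every row and run the regex in application code.

```python
import re
import sqlite3


def search(conn, pattern):
    """Return names of packages whose name or description matches pattern."""
    rx = re.compile(pattern)
    matches = []
    # Full table scan: the index on name cannot serve an unanchored regex,
    # and the descriptions are not indexed at all.
    rows = conn.execute("SELECT name, description FROM packages ORDER BY name")
    for name, description in rows:
        # The regex is applied here, in the application, not by the database.
        if rx.search(name) or rx.search(description):
            matches.append(name)
    return matches


# Hypothetical sample data; PRIMARY KEY gives us the "index by package name".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [
        ("barfoo", "a bar utility"),
        ("baz", "helper library for foo"),
        ("qux", "unrelated tool"),
    ],
)

print(search(conn, "foo"))
```

Both "barfoo" (matched on name) and "baz" (matched on description) come back, and neither lookup could have used the name index.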