aaronmgriffin at gmail.com
Thu Oct 26 16:41:58 EDT 2006
On 10/26/06, Roman Kyrylych <roman.kyrylych at gmail.com> wrote:
> You don't need to do complex indexing. Indexing by package name is enough.
> By terms "small, fast, embeddable" I mean that it's better than MySQL
> or something else. And it's better than dbm-style databases when it
> comes to random writing and transactions.
What I'm saying is that you have the overhead of a full-scale database
with little gain. Indexing by package names, yes that's great and
all, but that doesn't help with the slowest-of-the-slow -Ss operation.
-Ss searches package names AND descriptions, allowing for regex
matching. SQLite (and most DBs) do not support regex matching out of the box.
Indexing the search by package name means little, because a -Ss "foo"
still SHOULD match a package named "barfoo" and a package with "foo"
in the description. This means a sequential search. Not only that,
but because the DB will not support regex, that means that one must
iterate over each and every entry, get the values at the C level,
apply a regex pattern, and note if it is a match or not. The only
speed you gain would be in the initial opening of the files. To me,
this does not mean a DB backend is the solution. It may be better
than the files backend, yes, but not the best, and outperforming the
files backend is not hard... I was able to improve performance
approximately 6 times by simply using the db.tar.gz files in place of
disparate text files.
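
To make the sequential-search point concrete, here is a minimal sketch in Python (not pacman's actual C code). The package entries, the table layout, and the helper names search and search_sqlite are all made up for illustration. SQLite parses a REGEXP operator but ships no implementation of it, so the application has to register its own callback, and that callback ends up being invoked row by row, which is the same linear scan as the plain loop.

```python
import re
import sqlite3

# Hypothetical sample data; these are not real pacman database entries.
PACKAGES = [
    ("barfoo", "An example tool"),
    ("baz", "Helper library for foo users"),
    ("qux", "Unrelated package"),
]


def search(pattern, packages):
    """Sequential -Ss style scan: try the regex on every name and description."""
    rx = re.compile(pattern)
    return [name for name, desc in packages if rx.search(name) or rx.search(desc)]


def search_sqlite(pattern, packages):
    """Same search through SQLite, using an application-supplied REGEXP callback.

    Returns the matching names and the number of times the callback ran.
    """
    calls = 0

    def regexp(pat, value):
        # SQLite calls this as regexp(pattern, value) for "value REGEXP pattern".
        nonlocal calls
        calls += 1
        return re.search(pat, value) is not None

    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, regexp)
    # Assumed schema for illustration: indexed by package name only.
    conn.execute("CREATE TABLE pkg (name TEXT PRIMARY KEY, description TEXT)")
    conn.executemany("INSERT INTO pkg VALUES (?, ?)", packages)
    rows = conn.execute(
        "SELECT name FROM pkg "
        "WHERE name REGEXP ? OR description REGEXP ? ORDER BY name",
        (pattern, pattern),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows], calls


if __name__ == "__main__":
    print(search("foo", PACKAGES))
    print(search_sqlite("foo", PACKAGES))
```

In this sketch the index on name does nothing for the query: the callback still runs at least once for every row, because a match can sit anywhere in the name or the description.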