[aur-dev] JSON - search options
I was wondering about searching with more than one argument. For example, if I want to find PKGBUILDs whose names contain both `pac` and `man`, I have to search for all `pac` PKGBUILDs and then grep the results to show only those with `man`. It would be easier if the AUR could run a more complex SQL query itself, so that less data is transferred. What do you think about it?
-- Regards, Piotr Husiatyński
On 4/5/08, Piotr Husiatyński <phusiatynski@gmail.com> wrote:
> I was wondering about searching with more than one argument. For example, if I want to find PKGBUILDs whose names contain both `pac` and `man`, I have to search for all `pac` PKGBUILDs and then grep the results to show only those with `man`. It would be easier if the AUR could run a more complex SQL query itself, so that less data is transferred.
> What do you think about it?
My original implementation used mysql full text searching (type boolean if I recall).
http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html

I don't remember why it was changed to `like` style matching... maybe because it required adding a full text search index, or because the load difference was an unknown quantity with full text searches. I don't recall offhand.

It would have allowed queries like..
+pac +man -aur

I think some testing would need to be done to determine the 'expense' of full text indexing with regard to the aur. Note also, that I think the default min word length for full text queries is 4 characters. I believe this is a tunable, but I really wouldn't recommend going lower than minlength of 3 probably.

More info here on fulltext searching with mysql:
http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
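The `+pac +man -aur` query style mentioned above can be illustrated with a minimal sketch. It is not the AUR's code: it swaps in `LIKE` substring matching instead of MySQL's full-text index (which matches whole words, so the two would not behave identically), and the `Packages` table, the `Name` column and all function names are assumptions made up for the example. SQLite is used only so the sketch runs on its own.

```python
import sqlite3


def parse_terms(query):
    """Split a query such as '+pac +man -aur' into (required, excluded) terms."""
    required, excluded = [], []
    for token in query.split():
        if token.startswith("-"):
            term, bucket = token[1:], excluded
        else:
            # A leading '+' is optional; bare words are treated as required.
            term, bucket = token.lstrip("+"), required
        if term:
            bucket.append(term)
    return required, excluded


def escape_like(term):
    """Escape LIKE wildcards so user input matches literally ('!' is the escape char)."""
    return term.replace("!", "!!").replace("%", "!%").replace("_", "!_")


def build_where(required, excluded, column="Name"):
    """Return a parameterized WHERE clause and its parameter list."""
    clauses, params = [], []
    for term in required:
        clauses.append(f"{column} LIKE ? ESCAPE '!'")
        params.append(f"%{escape_like(term)}%")
    for term in excluded:
        clauses.append(f"{column} NOT LIKE ? ESCAPE '!'")
        params.append(f"%{escape_like(term)}%")
    return " AND ".join(clauses) or "1 = 1", params


def search(conn, query):
    """Run the multi-term search against a hypothetical Packages(Name) table."""
    where, params = build_where(*parse_terms(query))
    sql = f"SELECT Name FROM Packages WHERE {where} ORDER BY Name"
    return [row[0] for row in conn.execute(sql, params)]
```

Because every term becomes one more `LIKE` clause on the same scan, the filtering happens on the server and only the matching names are sent back, which is the saving the original request was after.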
On 02:14 Sat 05 Apr , eliott wrote:
> My original implementation used mysql full text searching (type boolean if I recall). http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html
> I don't remember why it was changed to `like` style matching... maybe because it required adding a full text search index, or because the load difference was an unknown quantity with full text searches. I don't recall offhand.
> It would have allowed queries like.. +pac +man -aur

This would be great and easy to use, but is it safe for the server?

> I think some testing would need to be done to determine the 'expense' of full text indexing with regard to the aur. Note also, that I think the default min word length for full text queries is 4 characters. I believe this is a tunable, but I really wouldn't recommend going lower than minlength of 3 probably.

3 is OK, because of lua, abs, git and many other names. But right now I can run `aur search a` and get 8130 names back. Alternatively, searching with short words could be allowed, but the result would have to be cut down to a smaller amount of data: you could search with a single character, but you would get only the first 50 names.

{ "type": "search", "results": ReturnData, "cut": ("yes"|"no") }

-- Regards, Piotr Husiatyński
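A minimal sketch of the truncated response proposed above, assuming `ReturnData` is simply a list of package names (the thread does not define its shape). The function name and the `limit` parameter are invented for illustration; the value 50 is the cap suggested in the message.

```python
import json

MAX_RESULTS = 50  # the cap suggested in the message above


def build_search_response(names, limit=MAX_RESULTS):
    """Wrap search results, truncating to `limit` and flagging the truncation."""
    truncated = len(names) > limit
    return {
        "type": "search",
        "results": names[:limit],  # assumption: ReturnData is a plain list of names
        "cut": "yes" if truncated else "no",
    }
```

A client could then call `json.dumps(build_search_response(names))` and check the `cut` field to learn whether it should narrow the search.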
2008/4/5 Piotr Husiatyński <phusiatynski@gmail.com>:
> On 02:14 Sat 05 Apr , eliott wrote:
>> It would have allowed queries like.. +pac +man -aur
> This would be great and easy to use, but is it safe for the server?

That would indeed be very cool! There are so many packages in the AUR that it is sometimes hard to find what you are looking for. A search with more than one term makes a big difference, I think, and I like the minus search too. I hope the change to 'like' style didn't have that good of a reason :)
On Sat, 5 Apr 2008 12:33:15 +0200 Piotr Husiatyński <phusiatynski@gmail.com> wrote:
> 3 is OK, because of lua, abs, git and many other names. But right now I can run `aur search a` and get 8130 names back. Alternatively, searching with short words could be allowed, but the result would have to be cut down to a smaller amount of data: you could search with a single character, but you would get only the first 50 names.
> { "type": "search", "results": ReturnData, "cut": ("yes"|"no") }

This is a good idea, but instead of making cut a boolean I would make it an integer. Zero (or less) would mean no cut, and anything larger would mean cut at that number.

I was also thinking that search should return more information than just pkgname and pkgid. For example, you could specify how much information you want returned from the search. Something like:

?type=search&arg=foo&fields=url,desc,tarball

The fields would specify what information you want in addition to pkgname and pkgid. That would also help reduce hits on the server, because right now a client that wants information on all the search results has to make a separate request for each package.
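The request format suggested above could be handled roughly as in the sketch below. Several things are assumptions rather than anything stated in the thread: that `cut` arrives as a request parameter, that the server keeps a whitelist of optional fields, and every function and constant name.

```python
from urllib.parse import parse_qs

ALWAYS_RETURNED = ("pkgname", "pkgid")
OPTIONAL_FIELDS = ("url", "desc", "tarball")  # hypothetical whitelist


def parse_search_request(query_string):
    """Extract the search term, the extra fields and the integer cut."""
    params = parse_qs(query_string.lstrip("?"))
    arg = params.get("arg", [""])[0]
    requested = params.get("fields", [""])[0].split(",")
    # Unknown field names are silently dropped.
    fields = [f for f in requested if f in OPTIONAL_FIELDS]
    try:
        cut = int(params.get("cut", ["0"])[0])
    except ValueError:
        cut = 0  # unparsable values are treated as "no cut"
    return arg, fields, cut


def shape_results(packages, fields, cut):
    """Keep pkgname/pkgid plus the requested fields; cut <= 0 means no cut."""
    keep = ALWAYS_RETURNED + tuple(fields)
    rows = [{key: pkg.get(key) for key in keep} for pkg in packages]
    return rows[:cut] if cut > 0 else rows
```

With this shape a single request such as `?type=search&arg=foo&fields=url,desc` returns the extra columns for every hit, instead of one follow-up request per package.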
On 4/5/08, Loui <louipc.ist@gmail.com> wrote:
> This is a good idea, but instead of cut being boolean I would make it an integer. Zero (or less) would mean no cut and anything larger would mean cut at that number.

Seems like a bad idea to me.
participants (4)
- Ben Dibbens
- eliott
- Loui
- Piotr Husiatyński