On Wed, Mar 30, 2011 at 7:19 PM, Dan McGee <dpmcgee@gmail.com> wrote:
On Wed, Mar 30, 2011 at 9:13 PM, elij <elij.mx@gmail.com> wrote:
It might be worth while to add an index on ModifiedTS as well. That would make sorting by ModifiedTS really fast for generating the rss and recent packages lists (as well as sorting aur results by date).
Granted, adding another index would make insertion or updates a tad slower, but I am not sure *how much* slower.
Dan, do you have any thoughts on adding an index for ModifiedTS? It might be a nice trade off with the removal of the DummyPkg stuff (and related dummypkg index).
That conversation we had on IRC the other day was in direct relation to this- for the package list, we don't use the indexes at all because our query isn't exactly how MySQL likes it. The only way I could convince it to use a index was remove most of the joins and drop the ORDER BY condition down to one column.
The recent packages list is cached too, so although it would be used there we don't run that query more than once every 5 minutes anyway.
I was writing a part of the spew backend that was setting the last modified http header to handle if-modified-since behavior, for generating the rss/atom feed in realtime instead of periodic. I used MAX on the ModifiedTS field to get the latest updated package's ModifiedTS, and used that as the http last modified value. The query planner was showing full table scans to find said max value (most recent update for any package). Adding an index on ModifiedTS of course made the query planner then show that it was using an index. However, the table isn't _that_ large, and a full table scan may even fit inside the query cache...so it may or may not be worthwhile. I had forgotten that the AUR caches the package feed for 5 minutes too. So yeah, adding that index probably wouldn't make much sense. :)