[aur-dev] [PATCH 1/5] Submission process code refactor
We had a ton of duplicate code shared between the insert and update
cases. Do a refactor so we can pull this stuff out below the if/else
block and only need it there once, saving some headaches.
Signed-off-by: Dan McGee
Set it equal to the SubmittedTS field, which will be our indication the
package is new when we show the logo on the front page of the AUR.
This results in the ability to remove the use of the unindexable
GREATEST() function from the AUR code everywhere we had to use it before
to handle the 0 timestamp case.
Note that there is no race condition here in calling UNIX_TIMESTAMP()
twice- it always returns the time at the beginning of statment
execution:
mysql> select unix_timestamp(), sleep(2), unix_timestamp();
+------------------+----------+------------------+
| unix_timestamp() | sleep(2) | unix_timestamp() |
+------------------+----------+------------------+
| 1300851746 | 0 | 1300851746 |
+------------------+----------+------------------+
1 row in set (2.00 sec)
Signed-off-by: Dan McGee
It might be worth while to add an index on ModifiedTS as well.
That would make sorting by ModifiedTS really fast for generating the
rss and recent packages lists (as well as sorting aur results by
date).
Granted, adding another index would make insertion or updates a tad
slower, but I am not sure *how much* slower.
Dan, do you have any thoughts on adding an index for ModifiedTS?
It might be a nice trade off with the removal of the DummyPkg stuff
(and related dummypkg index).
On Wed, Mar 30, 2011 at 6:48 PM, Dan McGee
Set it equal to the SubmittedTS field, which will be our indication the package is new when we show the logo on the front page of the AUR.
This results in the ability to remove the use of the unindexable GREATEST() function from the AUR code everywhere we had to use it before to handle the 0 timestamp case.
Note that there is no race condition here in calling UNIX_TIMESTAMP() twice- it always returns the time at the beginning of statment execution:
mysql> select unix_timestamp(), sleep(2), unix_timestamp(); +------------------+----------+------------------+ | unix_timestamp() | sleep(2) | unix_timestamp() | +------------------+----------+------------------+ | 1300851746 | 0 | 1300851746 | +------------------+----------+------------------+ 1 row in set (2.00 sec)
Signed-off-by: Dan McGee
--- UPGRADING | 7 +++++++ web/html/pkgsubmit.php | 2 +- web/lib/pkgfuncs.inc | 2 +- web/lib/stats.inc | 4 ++-- web/template/stats/updates_table.php | 7 ++----- 5 files changed, 13 insertions(+), 9 deletions(-) diff --git a/UPGRADING b/UPGRADING index 5335d89..4743b8e 100644 --- a/UPGRADING +++ b/UPGRADING @@ -1,6 +1,13 @@ Upgrading =========
+From 1.8.1 to X.X.X +------------------- + +1. Update the modified package timestamp for new packages. + +UPDATE Packages SET ModifiedTS = SubmittedTS WHERE ModifiedTS = 0; + From 1.8.0 to 1.8.1 -------------------
diff --git a/web/html/pkgsubmit.php b/web/html/pkgsubmit.php index 131bf37..3eb60c5 100644 --- a/web/html/pkgsubmit.php +++ b/web/html/pkgsubmit.php @@ -347,7 +347,7 @@ if ($_COOKIE["AURSID"]): $uid = uid_from_sid($_COOKIE["AURSID"]);
# This is a brand new package - $q = sprintf("INSERT INTO Packages (Name, License, Version, CategoryID, Description, URL, SubmittedTS, SubmitterUID, MaintainerUID) VALUES ('%s', '%s', '%s-%s', %d, '%s', '%s', UNIX_TIMESTAMP(), %d, %d)", + $q = sprintf("INSERT INTO Packages (Name, License, Version, CategoryID, Description, URL, SubmittedTS, ModifiedTS, SubmitterUID, MaintainerUID) VALUES ('%s', '%s', '%s-%s', %d, '%s', '%s', UNIX_TIMESTAMP(), UNIX_TIMESTAMP(), %d, %d)", mysql_real_escape_string($new_pkgbuild['pkgname']), mysql_real_escape_string($new_pkgbuild['license']), mysql_real_escape_string($new_pkgbuild['pkgver']), diff --git a/web/lib/pkgfuncs.inc b/web/lib/pkgfuncs.inc index b38c7da..699a3a9 100644 --- a/web/lib/pkgfuncs.inc +++ b/web/lib/pkgfuncs.inc @@ -525,7 +525,7 @@ function pkg_search_page($SID="") { $q_sort = "ORDER BY Maintainer ".$order.", Name ASC "; break; case 'a': - $q_sort = "ORDER BY GREATEST(SubmittedTS,ModifiedTS) ".$order.", Name ASC "; + $q_sort = "ORDER BY ModifiedTS ".$order.", Name ASC "; break; default: break; diff --git a/web/lib/stats.inc b/web/lib/stats.inc index f42e417..f924fb5 100644 --- a/web/lib/stats.inc +++ b/web/lib/stats.inc @@ -36,7 +36,7 @@ function updates_table($dbh) global $apc_prefix, $apc_ttl; $key = $apc_prefix . 'recent_updates'; if(!(EXTENSION_LOADED_APC && ($newest_packages = apc_fetch($key)))) { - $q = 'SELECT * FROM Packages WHERE DummyPkg != 1 ORDER BY GREATEST(SubmittedTS,ModifiedTS) DESC LIMIT 0 , 10'; + $q = 'SELECT * FROM Packages WHERE DummyPkg != 1 ORDER BY ModifiedTS DESC LIMIT 0 , 10'; $result = db_query($q, $dbh);
$newest_packages = new ArrayObject(); @@ -84,7 +84,7 @@ function general_stats_table($dbh) $tu_count = db_cache_value($q, $dbh, $apc_prefix . 'tu_count');
$targstamp = intval(strtotime("-7 days")); - $q = "SELECT count(*) from Packages WHERE (Packages.SubmittedTS >= $targstamp OR Packages.ModifiedTS >= $targstamp)"; + $q = "SELECT count(*) from Packages WHERE Packages.ModifiedTS >= $targstamp"; $update_count = db_cache_value($q, $dbh, $apc_prefix . 'update_count');
include('stats/general_stats_table.php'); diff --git a/web/template/stats/updates_table.php b/web/template/stats/updates_table.php index b9f6d6d..a8cdf5a 100644 --- a/web/template/stats/updates_table.php +++ b/web/template/stats/updates_table.php @@ -20,12 +20,10 @@ $mod_int = intval($row["ModifiedTS"]); $sub_int = intval($row["SubmittedTS"]);
-if ($mod_int != 0): - $modstring = gmdate("r", $mod_int); -elseif ($sub_int != 0): +if ($mod_int == $sub_int): $modstring = '<img src="images/new.gif" alt="New!" /> ' . gmdate("r", $sub_int); else: - $modstring = '(unknown)'; + $modstring = gmdate("r", $mod_int); endif; ?>
@@ -36,4 +34,3 @@ endif; <?php endforeach; ?>
</table> - -- 1.7.4.2
On Wed, Mar 30, 2011 at 9:13 PM, elij
It might be worth while to add an index on ModifiedTS as well. That would make sorting by ModifiedTS really fast for generating the rss and recent packages lists (as well as sorting aur results by date).
Granted, adding another index would make insertion or updates a tad slower, but I am not sure *how much* slower.
Dan, do you have any thoughts on adding an index for ModifiedTS? It might be a nice trade off with the removal of the DummyPkg stuff (and related dummypkg index).
That conversation we had on IRC the other day was in direct relation to this- for the package list, we don't use the indexes at all because our query isn't exactly how MySQL likes it. The only way I could convince it to use a index was remove most of the joins and drop the ORDER BY condition down to one column. The recent packages list is cached too, so although it would be used there we don't run that query more than once every 5 minutes anyway. -Dan
On Wed, Mar 30, 2011 at 7:19 PM, Dan McGee
On Wed, Mar 30, 2011 at 9:13 PM, elij
wrote: It might be worth while to add an index on ModifiedTS as well. That would make sorting by ModifiedTS really fast for generating the rss and recent packages lists (as well as sorting aur results by date).
Granted, adding another index would make insertion or updates a tad slower, but I am not sure *how much* slower.
Dan, do you have any thoughts on adding an index for ModifiedTS? It might be a nice trade off with the removal of the DummyPkg stuff (and related dummypkg index).
That conversation we had on IRC the other day was in direct relation to this- for the package list, we don't use the indexes at all because our query isn't exactly how MySQL likes it. The only way I could convince it to use a index was remove most of the joins and drop the ORDER BY condition down to one column.
The recent packages list is cached too, so although it would be used there we don't run that query more than once every 5 minutes anyway.
I was writing a part of the spew backend that was setting the last modified http header to handle if-modified-since behavior, for generating the rss/atom feed in realtime instead of periodic. I used MAX on the ModifiedTS field to get the latest updated package's ModifiedTS, and used that as the http last modified value. The query planner was showing full table scans to find said max value (most recent update for any package). Adding an index on ModifiedTS of course made the query planner then show that it was using an index. However, the table isn't _that_ large, and a full table scan may even fit inside the query cache...so it may or may not be worthwhile. I had forgotten that the AUR caches the package feed for 5 minutes too. So yeah, adding that index probably wouldn't make much sense. :)
Instead, we just store dependencies directly in the PackageDepends
table. Since we don't use this info anywhere besides the package details
page, there is little value in precalculating what is in the AUR vs.
what is not.
An upgrade path is provided via several SQL statements in the UPGRADING
document. There should be no user-visible change from this, but the DB
schema gets a bit more sane and we no longer have loads of junk packages
in our tables that are never shown to the end user. This should also
help the MySQL query planner in several cases as we no longer have to be
careful to exclude dummy packages on every query.
Signed-off-by: Dan McGee
On Wed, Mar 30, 2011 at 08:48:09PM -0500, Dan McGee wrote:
Instead, we just store dependencies directly in the PackageDepends table. Since we don't use this info anywhere besides the package details page, there is little value in precalculating what is in the AUR vs. what is not.
An upgrade path is provided via several SQL statements in the UPGRADING document. There should be no user-visible change from this, but the DB schema gets a bit more sane and we no longer have loads of junk packages in our tables that are never shown to the end user. This should also help the MySQL query planner in several cases as we no longer have to be careful to exclude dummy packages on every query.
Signed-off-by: Dan McGee
--- UPGRADING | 12 ++++++++ support/schema/aur-schema.sql | 5 +-- support/scripts/newpackage-notify | 2 +- web/html/pkgsubmit.php | 17 ++-------- web/html/rss.php | 1 - web/lib/aur.inc | 3 +- web/lib/aurjson.class.php | 4 +- web/lib/pkgfuncs.inc | 56 +++++++----------------------------- web/lib/stats.inc | 4 +- web/template/pkg_details.php | 21 ++++--------- 10 files changed, 41 insertions(+), 84 deletions(-)
Just a short feedback: Those patches have already been in my working tree for a few days now. I will double-check if there are no side effects - especially with the dummy package removal - before merging and pushing tho (just to be sure there's no unexpected breakage, e.g. this one will remove all dummy package pages, like [1] - tho they're broken anyways and have probably never been used by anyone). [1] https://aur.archlinux.org/packages.php?ID=42
Signed-off-by: Dan McGee
Signed-off-by: Dan McGee
participants (4)
-
Dan McGee
-
Dan McGee
-
elij
-
Lukas Fleischer