On Wed, Aug 10, 2011 at 5:48 PM, Lukas Fleischer <archlinux@cryptocrack.de> wrote:
On Wed, Aug 10, 2011 at 05:35:46PM -0500, Dan McGee wrote:
On Wed, Aug 10, 2011 at 4:51 PM, Lukas Fleischer <archlinux@cryptocrack.de> wrote:
On Tue, Aug 02, 2011 at 04:17:12PM +0200, Lukas Fleischer wrote:
On Tue, Aug 02, 2011 at 07:45:40AM -0500, Dan McGee wrote:
On Tue, Aug 2, 2011 at 7:41 AM, Lukas Fleischer <archlinux@cryptocrack.de> wrote:
On Tue, Aug 02, 2011 at 07:13:05AM -0500, Dan McGee wrote: > On Sun, Jul 31, 2011 at 12:05 PM, Lukas Fleischer > <archlinux@cryptocrack.de> wrote: > > This allows for merging comments and votes of deleted packages into > > another one which is useful if a package needs to be renamed. > > > > Signed-off-by: Lukas Fleischer <archlinux@cryptocrack.de> > > --- > > web/lib/pkgfuncs.inc.php | 24 +++++++++++++++++++++++- > > 1 files changed, 23 insertions(+), 1 deletions(-) > > > > diff --git a/web/lib/pkgfuncs.inc.php b/web/lib/pkgfuncs.inc.php > > index bb5a592..a81ee01 100644 > > --- a/web/lib/pkgfuncs.inc.php > > +++ b/web/lib/pkgfuncs.inc.php > > @@ -659,10 +659,11 @@ function pkg_flag ($atype, $ids, $action = True) { > > * > > * @param string $atype Account type, output of account_from_sid > > * @param array $ids Array of package IDs to delete > > + * @param int $mergepkgid Package to merge the deleted ones into > > * > > * @return string Translated error or success message > > */ > > -function pkg_delete ($atype, $ids) { > > +function pkg_delete ($atype, $ids, $mergepkgid) { > > if (!$atype) { > > return __("You must be logged in before you can delete packages."); > > } > > @@ -678,6 +679,27 @@ function pkg_delete ($atype, $ids) { > > } > > > > $dbh = db_connect(); > > + > > + if ($mergepkgid) { > > + /* Merge comments */ > > + $q = "UPDATE IGNORE PackageComments "; > Not a fan of this. This is not in any SQL standard so this would be > another thing needing adjusted later if someone tried to run using > non-MySQL. It is also extremely hacky IMO as you can end up > suppressing real errors, and read this gem from the manual: > "Rows for which columns are updated to values that would cause > data conversion errors are updated to the closest valid values > instead." > Which just plain scares me. > > I'm not even sure why you are doing IGNORE on the comments merge > (these can never conflict?), but for the votes merge, simply do a > * UPDATE where it doesn't already exist. > * DELETE all remaining.
Well, I actually can't remember what I was thinking when I wrote the comments query. I probably just copy-pasted it. That's the best excuse I can come up with, yeah :)
I'll fix this and think of a better filter criteria for the votes merge query. I don't think we need the delete all remaining (duplicate) votes though, as we do remove the associated packages anyway and "ON DELETE" should do the rest.
Ahh, true on the auto-cascade DELETE thoughts. If you just add a little more logic to the WHERE clause you should be able to make the UPDATES selective enough to avoid the duplicates issue.
Yeah. Well, the issue here is that there might also be duplicates within the deleted packages (e.g. if a user voted on two of the packages that we want to merge but didn't vote on the target package itself), so for each of the affected votes we would have to:
1) Check if the user already voted on the target package. If so, discard the vote. An alternative approach is to exclude all votes that originate from a user who has voted on the target package as well right from the start.
2) Check if there's another vote from the same user on any of the other packages in the remove queue. If so, only move one of them to the new package. Maybe we could also group by user and remove such duplicates beforehand.
We also have the limitation that MySQL doesn't allow to UPDATE a table and SELECT from the same table in a subquery. So we will either have do use deep nested subqueries, a self join or a temporary table here.
Another idea that just came into my mind is that we could also extract all potentially affected votes, group them by user, delete all records but one from each group and finally update the remaining record to match the target package's ID. Not sure if that can be implemented as a single SQL query.
I'll think about how to implement that in detail when ever I get around to doing it...
I tried implementing that in a single query and ended up with a bunch of nested subqueries which were rather confusing. Maybe we should really consider using a temporary table here - particularly since this will be a function that will be rarely used and performance shouldn't be that significant here (as opposed to maintainability).
If anyone can come up with a simple query that covers all aspects, suggestions welcome!
-- original query UPDATE PackageVotes SET PackageID = ? WHERE PackageID IN (?, ?, ?);
-- new queries -- only update votes that wouldn't exist after modification UPDATE PackageVotes SET PackageID = ?merge WHERE PackageID IN (?others) AND UsersID NOT IN ( -- this silly double SELECT works around shitty MySQL not letting you reference the updated table in the update, as it forces materialization of the subquery SELECT * FROM ( SELECT UsersID FROM PackageVotes WHERE PackageID = ?merge ) temp );
That's what I had when I wrote the query for the very first time but that won't work if we merge more than one package at once. Read my next-to-last email in this thread for some more details.
-- all remaining un-updated votes are dupes, delete them DELETE FROM PackageVotes WHERE PackageID IN (?, ?, ?);
No need to do this. ON DELETE CASCADE, remember? :) I know, but explicit >> implicit, especially once you wrap it in a
Oh damn it, I thought I set up my test data right- I'll see what I can tweak. transaction, I don't know. -Dan