[aur-dev] [PATCH 2/3] Add "mergepkgid" argument to pkg_delete()

Lukas Fleischer archlinux at cryptocrack.de
Wed Aug 10 18:48:49 EDT 2011


On Wed, Aug 10, 2011 at 05:35:46PM -0500, Dan McGee wrote:
> On Wed, Aug 10, 2011 at 4:51 PM, Lukas Fleischer
> <archlinux at cryptocrack.de> wrote:
> > On Tue, Aug 02, 2011 at 04:17:12PM +0200, Lukas Fleischer wrote:
> >> On Tue, Aug 02, 2011 at 07:45:40AM -0500, Dan McGee wrote:
> >> > On Tue, Aug 2, 2011 at 7:41 AM, Lukas Fleischer
> >> > <archlinux at cryptocrack.de> wrote:
> >> > > On Tue, Aug 02, 2011 at 07:13:05AM -0500, Dan McGee wrote:
> >> > >> On Sun, Jul 31, 2011 at 12:05 PM, Lukas Fleischer
> >> > >> <archlinux at cryptocrack.de> wrote:
> >> > >> > This allows for merging comments and votes of deleted packages into
> >> > >> > another one which is useful if a package needs to be renamed.
> >> > >> >
> >> > >> > Signed-off-by: Lukas Fleischer <archlinux at cryptocrack.de>
> >> > >> > ---
> >> > >> >  web/lib/pkgfuncs.inc.php |   24 +++++++++++++++++++++++-
> >> > >> >  1 files changed, 23 insertions(+), 1 deletions(-)
> >> > >> >
> >> > >> > diff --git a/web/lib/pkgfuncs.inc.php b/web/lib/pkgfuncs.inc.php
> >> > >> > index bb5a592..a81ee01 100644
> >> > >> > --- a/web/lib/pkgfuncs.inc.php
> >> > >> > +++ b/web/lib/pkgfuncs.inc.php
> >> > >> > @@ -659,10 +659,11 @@ function pkg_flag ($atype, $ids, $action = True) {
> >> > >> >  *
> >> > >> >  * @param string $atype Account type, output of account_from_sid
> >> > >> >  * @param array $ids Array of package IDs to delete
> >> > >> > + * @param int $mergepkgid Package to merge the deleted ones into
> >> > >> >  *
> >> > >> >  * @return string Translated error or success message
> >> > >> >  */
> >> > >> > -function pkg_delete ($atype, $ids) {
> >> > >> > +function pkg_delete ($atype, $ids, $mergepkgid) {
> >> > >> >        if (!$atype) {
> >> > >> >                return __("You must be logged in before you can delete packages.");
> >> > >> >        }
> >> > >> > @@ -678,6 +679,27 @@ function pkg_delete ($atype, $ids) {
> >> > >> >        }
> >> > >> >
> >> > >> >        $dbh = db_connect();
> >> > >> > +
> >> > >> > +       if ($mergepkgid) {
> >> > >> > +               /* Merge comments */
> >> > >> > +               $q = "UPDATE IGNORE PackageComments ";
> >> > >> Not a fan of this. This is not in any SQL standard so this would be
> >> > >> another thing needing adjusted later if someone tried to run using
> >> > >> non-MySQL. It is also extremely hacky IMO as you can end up
> >> > >> suppressing real errors, and read this gem from the manual:
> >> > >>     "Rows for which columns are updated to values that would cause
> >> > >> data conversion errors are updated to the closest valid values
> >> > >> instead."
> >> > >> Which just plain scares me.
> >> > >>
> >> > >> I'm not even sure why you are doing IGNORE on the comments merge
> >> > >> (these can never conflict?), but for the votes merge, simply do a
> >> > >> * UPDATE where it doesn't already exist.
> >> > >> * DELETE all remaining.
> >> > >
> >> > > Well, I actually can't remember what I was thinking when I wrote the
> >> > > comments query. I probably just copy-pasted it. That's the best excuse I
> >> > > can come up with, yeah :)
> >> > >
> >> > > I'll fix this and think of a better filter criteria for the votes merge
> >> > > query. I don't think we need the delete all remaining (duplicate) votes
> >> > > though, as we do remove the associated packages anyway and "ON DELETE"
> >> > > should do the rest.
> >> >
> >> > Ahh, true on the auto-cascade DELETE thoughts. If you just add a
> >> > little more logic to the WHERE clause you should be able to make the
> >> > UPDATES selective enough to avoid the duplicates issue.
> >>
> >> Yeah. Well, the issue here is that there might also be duplicates within
> >> the deleted packages (e.g. if a user voted on two of the packages that
> >> we want to merge but didn't vote on the target package itself), so for
> >> each of the affected votes we would have to:
> >>
> >> 1) Check if the user already voted on the target package. If so, discard
> >>    the vote. An alternative approach is to exclude all votes that
> >>    originate from a user who has voted on the target package as well
> >>    right from the start.
> >>
> >> 2) Check if there's another vote from the same user on any of the other
> >>    packages in the remove queue. If so, only move one of them to the new
> >>    package. Maybe we could also group by user and remove such duplicates
> >>    beforehand.
> >>
> >> We also have the limitation that MySQL doesn't allow to UPDATE a table
> >> and SELECT from the same table in a subquery. So we will either have do
> >> use deep nested subqueries, a self join or a temporary table here.
> >>
> >> Another idea that just came into my mind is that we could also extract
> >> all potentially affected votes, group them by user, delete all records
> >> but one from each group and finally update the remaining record to match
> >> the target package's ID. Not sure if that can be implemented as a single
> >> SQL query.
> >>
> >> I'll think about how to implement that in detail when ever I get around
> >> to doing it...
> >
> > I tried implementing that in a single query and ended up with a bunch of
> > nested subqueries which were rather confusing. Maybe we should really
> > consider using a temporary table here - particularly since this will be
> > a function that will be rarely used and performance shouldn't be that
> > significant here (as opposed to maintainability).
> >
> > If anyone can come up with a simple query that covers all aspects,
> > suggestions welcome!
> 
> -- original query
> UPDATE PackageVotes
> SET PackageID = ?
> WHERE PackageID IN (?, ?, ?);
> 
> -- new queries
> -- only update votes that wouldn't exist after modification
> UPDATE PackageVotes
> SET PackageID = ?merge
> WHERE PackageID IN (?others)
> AND UsersID NOT IN (
>         -- this silly double SELECT works around shitty MySQL not
> letting you reference the updated table in the update, as it forces
> materialization of the subquery
> 	SELECT * FROM (
> 		SELECT UsersID
> 		FROM PackageVotes
> 		WHERE PackageID = ?merge
> 	) temp
> );

That's what I had when I wrote the query for the very first time but
that won't work if we merge more than one package at once. Read my
next-to-last email in this thread for some more details.

> 
> -- all remaining un-updated votes are dupes, delete them
> DELETE FROM PackageVotes
> WHERE PackageID IN (?, ?, ?);

No need to do this. ON DELETE CASCADE, remember? :)

> 
> 
> Note that this whole deletion stuff should be done in one transaction
> so the view is consistent; I'd add that if you can. On that note, so
> should package creation/update if we're not doing it already...


More information about the aur-dev mailing list