[aur-dev] [PATCH 2/3] Add "mergepkgid" argument to pkg_delete()

Dan McGee dpmcgee at gmail.com
Wed Aug 10 18:35:46 EDT 2011


On Wed, Aug 10, 2011 at 4:51 PM, Lukas Fleischer
<archlinux at cryptocrack.de> wrote:
> On Tue, Aug 02, 2011 at 04:17:12PM +0200, Lukas Fleischer wrote:
>> On Tue, Aug 02, 2011 at 07:45:40AM -0500, Dan McGee wrote:
>> > On Tue, Aug 2, 2011 at 7:41 AM, Lukas Fleischer
>> > <archlinux at cryptocrack.de> wrote:
>> > > On Tue, Aug 02, 2011 at 07:13:05AM -0500, Dan McGee wrote:
>> > >> On Sun, Jul 31, 2011 at 12:05 PM, Lukas Fleischer
>> > >> <archlinux at cryptocrack.de> wrote:
>> > >> > This allows for merging comments and votes of deleted packages into
>> > >> > another one which is useful if a package needs to be renamed.
>> > >> >
>> > >> > Signed-off-by: Lukas Fleischer <archlinux at cryptocrack.de>
>> > >> > ---
>> > >> >  web/lib/pkgfuncs.inc.php |   24 +++++++++++++++++++++++-
>> > >> >  1 files changed, 23 insertions(+), 1 deletions(-)
>> > >> >
>> > >> > diff --git a/web/lib/pkgfuncs.inc.php b/web/lib/pkgfuncs.inc.php
>> > >> > index bb5a592..a81ee01 100644
>> > >> > --- a/web/lib/pkgfuncs.inc.php
>> > >> > +++ b/web/lib/pkgfuncs.inc.php
>> > >> > @@ -659,10 +659,11 @@ function pkg_flag ($atype, $ids, $action = True) {
>> > >> >  *
>> > >> >  * @param string $atype Account type, output of account_from_sid
>> > >> >  * @param array $ids Array of package IDs to delete
>> > >> > + * @param int $mergepkgid Package to merge the deleted ones into
>> > >> >  *
>> > >> >  * @return string Translated error or success message
>> > >> >  */
>> > >> > -function pkg_delete ($atype, $ids) {
>> > >> > +function pkg_delete ($atype, $ids, $mergepkgid) {
>> > >> >        if (!$atype) {
>> > >> >                return __("You must be logged in before you can delete packages.");
>> > >> >        }
>> > >> > @@ -678,6 +679,27 @@ function pkg_delete ($atype, $ids) {
>> > >> >        }
>> > >> >
>> > >> >        $dbh = db_connect();
>> > >> > +
>> > >> > +       if ($mergepkgid) {
>> > >> > +               /* Merge comments */
>> > >> > +               $q = "UPDATE IGNORE PackageComments ";
>> > >> Not a fan of this. This is not in any SQL standard so this would be
>> > >> another thing needing adjusted later if someone tried to run using
>> > >> non-MySQL. It is also extremely hacky IMO as you can end up
>> > >> suppressing real errors, and read this gem from the manual:
>> > >>     "Rows for which columns are updated to values that would cause
>> > >> data conversion errors are updated to the closest valid values
>> > >> instead."
>> > >> Which just plain scares me.
>> > >>
>> > >> I'm not even sure why you are doing IGNORE on the comments merge
>> > >> (these can never conflict?), but for the votes merge, simply do a
>> > >> * UPDATE where it doesn't already exist.
>> > >> * DELETE all remaining.
>> > >
>> > > Well, I actually can't remember what I was thinking when I wrote the
>> > > comments query. I probably just copy-pasted it. That's the best excuse I
>> > > can come up with, yeah :)
>> > >
>> > > I'll fix this and think of a better filter criteria for the votes merge
>> > > query. I don't think we need the delete all remaining (duplicate) votes
>> > > though, as we do remove the associated packages anyway and "ON DELETE"
>> > > should do the rest.
>> >
>> > Ahh, true on the auto-cascade DELETE thoughts. If you just add a
>> > little more logic to the WHERE clause you should be able to make the
>> > UPDATES selective enough to avoid the duplicates issue.
>>
>> Yeah. Well, the issue here is that there might also be duplicates within
>> the deleted packages (e.g. if a user voted on two of the packages that
>> we want to merge but didn't vote on the target package itself), so for
>> each of the affected votes we would have to:
>>
>> 1) Check if the user already voted on the target package. If so, discard
>>    the vote. An alternative approach is to exclude all votes that
>>    originate from a user who has voted on the target package as well
>>    right from the start.
>>
>> 2) Check if there's another vote from the same user on any of the other
>>    packages in the remove queue. If so, only move one of them to the new
>>    package. Maybe we could also group by user and remove such duplicates
>>    beforehand.
>>
>> We also have the limitation that MySQL doesn't allow to UPDATE a table
>> and SELECT from the same table in a subquery. So we will either have do
>> use deep nested subqueries, a self join or a temporary table here.
>>
>> Another idea that just came into my mind is that we could also extract
>> all potentially affected votes, group them by user, delete all records
>> but one from each group and finally update the remaining record to match
>> the target package's ID. Not sure if that can be implemented as a single
>> SQL query.
>>
>> I'll think about how to implement that in detail when ever I get around
>> to doing it...
>
> I tried implementing that in a single query and ended up with a bunch of
> nested subqueries which were rather confusing. Maybe we should really
> consider using a temporary table here - particularly since this will be
> a function that will be rarely used and performance shouldn't be that
> significant here (as opposed to maintainability).
>
> If anyone can come up with a simple query that covers all aspects,
> suggestions welcome!

-- original query
UPDATE PackageVotes
SET PackageID = ?
WHERE PackageID IN (?, ?, ?);

-- new queries
-- only update votes that wouldn't exist after modification
UPDATE PackageVotes
SET PackageID = ?merge
WHERE PackageID IN (?others)
AND UsersID NOT IN (
        -- this silly double SELECT works around shitty MySQL not
letting you reference the updated table in the update, as it forces
materialization of the subquery
	SELECT * FROM (
		SELECT UsersID
		FROM PackageVotes
		WHERE PackageID = ?merge
	) temp
);

-- all remaining un-updated votes are dupes, delete them
DELETE FROM PackageVotes
WHERE PackageID IN (?, ?, ?);


Note that this whole deletion stuff should be done in one transaction
so the view is consistent; I'd add that if you can. On that note, so
should package creation/update if we're not doing it already...

-Dan


More information about the aur-dev mailing list