[pacman-dev] [PATCH] Reduce the negligence of _alpm_resolvedeps

Nagy Gabor ngaba at bibl.u-szeged.hu
Sun Dec 2 17:03:21 EST 2007


Idézés Xavier <shiningxc at gmail.com>:

> On Sat, Dec 01, 2007 at 12:06:59PM +0100, Xavier wrote:
> > On Sat, Dec 01, 2007 at 12:24:16AM +0100, Nagy Gabor wrote:
> > > > On Thu, Nov 29, 2007 at 10:13:42PM +0100, Nagy Gabor wrote:
> > > > > OK, but I leave the benchmark for you;-)
> > > > > Note: Long "dependency paths" like in smoke001.py may cause notable
> > > > > slowdown. I made a quick "hack", so probably I made some mistakes...
> > > > > 
> > > > 
> > > > I couldn't measure any difference. pacman always need ~3.5 seconds for
> > > > running smoke001.
> > > > 
> > > Because it is an add transaction;-) I just said that it contains a long
> > > dependency path, which can cause notable slowdown in case of -S
> > > first_element_of_path. (Unfortunately I cannot write such tricky
> > > pactests in python).
> > > 
> > 
> > Oops, my bad. In my defense, it was a bit late :)
> > But now, I'm slightly confused. How hard is it to edit smoke001 to do what
> > you want? It looks like a totally trivial change to me. So I'm not sure I
> > understood what you wanted.
> > 
> 
> So, I made a simple change to smoke001 :
> 
> ---------------------------------------------------------------------------
> self.description = "Install a thousand packages in a single transaction"
> 
> p = pmpkg("pkg1000")
> 
> self.addpkg2db("local", p)
> 
> for i in range(1000):
> 	p = pmpkg("pkg%03d" % i)
> 	p.depends = ["pkg%03d" % (i+1)]
> 	p.files = ["usr/share/pkg%03d" % i]
> 	self.addpkg2db("sync", p)
> 
> self.args = "-S pkg001"
> 
> self.addrule("PACMAN_RETCODE=0")
> self.addrule("PKG_EXIST=pkg050")
> self.addrule("PKG_EXIST=pkg674")
> self.addrule("PKG_EXIST=pkg999")
> ---------------------------------------------------------------------------
> 
> Your resolvedeps patch indeed made its complexity explode :) I couldn't even
> wait long enough for it to finish.
> So I reduced the number to 100 packages, so that it could handle it in a few
> seconds:
> 
> ---------------------------------------------------------------------------
> self.description = "Install a thousand packages in a single transaction"
> 
> p = pmpkg("pkg100")
> 
> self.addpkg2db("local", p)
> 
> for i in range(100):
> 	p = pmpkg("pkg%02d" % i)
> 	p.depends = ["pkg%02d" % (i+1)]
> 	p.files = ["usr/share/pkg%02d" % i]
> 	self.addpkg2db("sync", p)
> 
> self.args = "-S pkg01"
> 
> self.addrule("PACMAN_RETCODE=0")
> self.addrule("PKG_EXIST=pkg50")
> self.addrule("PKG_EXIST=pkg74")
> self.addrule("PKG_EXIST=pkg99")
> ---------------------------------------------------------------------------

Well. Other bad news:
I'm pretty sure now, that -Rc will also explode with such long dependency 
paths (all pkg* is installed, and you do a -Rc pkg1000), what's more, probably 
that would be much slower, because:
0. this is the "inverse" of my patched resolvedeps, just we use requiredbys 
here <- we should move -Rc handling to deps.c and rewrite to resolvedeps-style.
1. compute_requiredby (explicit or implicit alpm_checkdeps) !!
2. the ugly list_diff(pkgcache, dblist) <- this is about O(n^2) ~ 1.000.000 
for each (about 1000) checkdeps call

And we've just found a bad thing with compute_requiredby usage: you shouldn't 
compute it too many times for the same package, because you will get an 
extreme slowdown! [Well, originally I would have kept pkg_getrequiredby and 
compute-and-store it on demand, like in pkg_getdepends].



----------------------------------------------------
SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu
This mail sent through IMP: http://horde.org/imp/





More information about the pacman-dev mailing list