On 25 February 2011 11:13, Dan McGee <dpmcgee@gmail.com> wrote:
> On Sat, Feb 19, 2011 at 6:11 PM, Tavian Barnes <tavianator@tavianator.com> wrote:
>> On a related note, I just tried running the test suite after entirely patching out integrity checks, and there weren't any regressions. Maybe the test suite should test the handling of corrupt packages? I can add a test case myself if you want, once I've figured out how the test suite works.
> Tests for this would definitely be nice. You will probably have to add to pactest the ability to create a broken package and/or database entry.
I'll have a look at that.
> Steps I know of and notes about them:
>
> * "Checking integrity" is really two things: md5sum iterations on the file, and then an alpm_pkg_load() call to build the package object and create the filelist. Not sure how you incorporated this, but it is at least something to think about.
The current patchset runs both those steps in parallel; everything else in that loop is protected by a mutex. alpm_pkg_load() takes significantly longer than the md5sum check, by the way.
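To make the locking scheme concrete, here is a minimal sketch in Python rather than the C of the actual patchset. It is not the patch itself: load_package() is a hypothetical stand-in for alpm_pkg_load(), and check_packages() and its arguments are names invented for this illustration. The checksum and the load run in worker threads, and only the shared bookkeeping is done under the mutex.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


def md5_of_file(path, chunk_size=1 << 16):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_package(path):
    """Hypothetical stand-in for alpm_pkg_load(); it only records the file size."""
    with open(path, "rb") as fh:
        return {"path": path, "size": len(fh.read())}


def check_packages(expected, workers=4):
    """Check packages in parallel.

    `expected` maps a package path to its expected MD5 hex digest.
    Returns (loaded, corrupt): records for packages that passed, and the
    paths of packages whose checksum did not match.
    """
    loaded, corrupt = [], []
    lock = threading.Lock()

    def check_one(path):
        # The expensive, independent work runs outside the lock.
        ok = md5_of_file(path) == expected[path]
        pkg = load_package(path) if ok else None
        # Updates to shared state are serialized by the mutex.
        with lock:
            if ok:
                loaded.append(pkg)
            else:
                corrupt.append(path)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so exceptions raised in workers propagate.
        list(pool.map(check_one, expected))
    return loaded, corrupt
```

Passing a deliberately wrong digest for one path is also roughly what a corrupt-package test would need to exercise: that path should end up in the corrupt list and never be loaded.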
> * If diskspace checking is enabled, we do yet another iteration over all of the package contents and read through the archive again. This could be eliminated if we grabbed the necessary data in pkg_load, which I believe is simply some parts of the stat buffer and the type of the entry. This would also be hugely helpful in conflict checking, where we don't have this info available, and you will see some comments alluding to the "12 checks we do in add.c" or something.
I'll look into this.
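As a rough illustration of the idea (again in Python, not libalpm's C, and not how pkg_load is actually written), the sketch below makes a single pass over a tar archive, keeps only each entry's name, size, mode and type, and then answers the diskspace and conflict questions from that cached list. EntryInfo, load_entries(), required_bytes() and find_conflicts() are all hypothetical names, and the conflict rule shown is deliberately simplistic.

```python
import io
import tarfile
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryInfo:
    """The few per-entry fields worth keeping from the load pass."""
    name: str
    size: int
    mode: int
    is_dir: bool


def load_entries(fileobj):
    """Read the archive once and keep only the metadata of each entry."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        return [EntryInfo(m.name, m.size, m.mode, m.isdir()) for m in tar]


def required_bytes(entries):
    """Disk space estimate computed from cached metadata, with no archive re-read."""
    return sum(e.size for e in entries if not e.is_dir)


def find_conflicts(entries, existing_paths):
    """Non-directory entries whose path already exists (simplified rule)."""
    return sorted(e.name for e in entries
                  if not e.is_dir and e.name in existing_paths)
```

Once the entries are cached, later checks become cheap in-memory passes, which is the saving being described above.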
> * Downloads. I see a call to do this in parallel a lot and I will continue to think this is stupid, but maybe that is just me. If you can't find a mirror that saturates your connection, look around; we have a lot.
I agree that parallel downloads seem like a bad idea. But while we're downloading we could probably be doing a bunch of work in the background, including integrity checks. The next version of the patchset I post will do this.
> * File conflicts: we've made this one pretty damn fast already, so probably not worth parallelizing.
Agreed, I've never seen this take up a significant portion of the time it takes to -Syu.

--
Tavian Barnes