On Sat, Feb 19, 2011 at 6:11 PM, Tavian Barnes <tavianator@tavianator.com> wrote:
Oh I get it. Well, the code in both is pretty similar, but git seems to support HPUX while x264 doesn't. Also, git just uses the number of online processors, so "taskset 0x1 git gc" runs N threads at once on 1 core, while x264 uses the process's CPU affinity on Linux and Windows to behave better in that case.

Yeah, that seems nice but also overkill- git has a config var to limit thread count; we could add that to the config file and just fall back to online_cpus() if not provided.
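As a rough illustration of the "config value, else online CPUs" idea, here is a minimal sketch in C. The name resolve_thread_count() and the convention that a value <= 0 means "not set in the config file" are assumptions made for this example, not anything from pacman or git. It only asks for the online processor count and does not look at CPU affinity, so it would share the "taskset" behaviour described for git.

```c
#include <assert.h>
#include <unistd.h>

/* Number of processors currently online, never less than 1.
 * _SC_NPROCESSORS_ONLN is a common extension (glibc, BSDs), not strict POSIX. */
int online_cpu_count(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);
	return n > 0 ? (int)n : 1;
}

/* Hypothetical helper: `configured` is the value read from the config file,
 * with <= 0 standing for "option not provided" (an assumption of this sketch). */
int resolve_thread_count(int configured)
{
	int n = configured > 0 ? configured : online_cpu_count();
	assert(n >= 1);
	return n;
}
```

With this shape, a user who pins the process to one core could set the option to 1 by hand, which is the cheap alternative to affinity detection.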
Anyway, that's not the main point. Are you guys interested in this change? I'm almost done with a better version of the patch that adds an _alpm_for_each_cpu() function (to util.h) which takes a callback function to call in N threads.
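For readers who want a picture of what a "call this callback in N threads" helper might look like, here is a sketch using pthreads. It is not the patch itself: the name for_each_cpu(), the callback signature, and the sum_parallel() demo are all hypothetical and chosen only for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical callback type: index is 0..total-1, ctx is shared user data. */
typedef void (*cpu_callback)(int index, int total, void *ctx);

struct worker_arg {
	cpu_callback cb;
	int index;
	int total;
	void *ctx;
};

static void *worker_main(void *p)
{
	struct worker_arg *arg = p;
	arg->cb(arg->index, arg->total, arg->ctx);
	return NULL;
}

/* Run cb once in each of nthreads threads and wait for all of them.
 * Returns 0 on success, -1 if allocation or thread creation failed. */
int for_each_cpu(int nthreads, cpu_callback cb, void *ctx)
{
	if(nthreads < 1 || !cb) {
		return -1;
	}
	pthread_t *threads = calloc(nthreads, sizeof(*threads));
	struct worker_arg *args = calloc(nthreads, sizeof(*args));
	int started = 0, ret = 0;
	if(!threads || !args) {
		free(threads);
		free(args);
		return -1;
	}
	for(int i = 0; i < nthreads; i++) {
		args[i] = (struct worker_arg){ cb, i, nthreads, ctx };
		if(pthread_create(&threads[i], NULL, worker_main, &args[i]) != 0) {
			ret = -1;
			break;
		}
		started++;
	}
	/* Join only the threads that were actually started. */
	for(int i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
	}
	free(threads);
	free(args);
	return ret;
}

/* Demo: each thread sums a strided share of 1..n into its own slot. */
struct sum_ctx {
	long n;
	long *partial;
};

static void sum_share(int index, int total, void *ctx)
{
	struct sum_ctx *s = ctx;
	for(long v = index + 1; v <= s->n; v += total) {
		s->partial[index] += v;
	}
}

/* Returns the sum of 1..n computed in nthreads threads, or -1 on failure. */
long sum_parallel(long n, int nthreads)
{
	if(nthreads < 1) {
		return -1;
	}
	struct sum_ctx s = { n, calloc(nthreads, sizeof(long)) };
	long sum = 0;
	if(!s.partial) {
		return -1;
	}
	if(for_each_cpu(nthreads, sum_share, &s) != 0) {
		sum = -1;
	} else {
		for(int i = 0; i < nthreads; i++) {
			sum += s.partial[i];
		}
	}
	free(s.partial);
	return sum;
}
```

Each thread writes only to its own slot, so the demo needs no locking; a real integrity check would presumably hand out packages rather than numbers.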
On a related note, I just tried running the test suite after entirely patching out integrity checks, and there weren't any regressions. Maybe the test suite should test the handling of corrupt packages? I can add a test case myself if you want, once I've figured out how the test suite works.
Tests for this would definitely be nice. You will probably have to add the ability to pactest to create a broken package and/or database entry.

I don't want to review these just yet, as I want to focus my time on 3.5.0 releasing. I will add this and maybe you can take it into account in the patchset- we do a lot of things we could parallelize and/or combine. Steps I know of and notes about them:

* "Checking integrity" is really two things- md5sum iterations on the file, and then an alpm_pkg_load() call to build the package object and create the filelist. Not sure how you incorporated this but at least something to think about.
* We do yet another iteration of all of the package contents if diskspace checking is enabled and read through the archive. This could be eliminated if we grabbed the necessary data in pkg_load, which I believe is simply some parts of the stat buffer and the type of the entry. This would also be hugely helpful in conflict checking, where we don't have this info available, and you will see some comments alluding to the "12 checks we do in add.c" or something.
* Downloads. I see a call to do this in parallel a lot and I will continue to think this is stupid, but maybe that is just me. If you can't find a mirror that saturates your connection, look around- we have a lot.
* File conflicts- we've made this one pretty damn fast already, so probably not worth parallelizing.

-Dan