[pacman-dev] Another idea

Wed Jan 14 21:02:13 EST 2009

Hey all.  I've been thinking about my patches and the concerns that list 
members have had about them and I've come up with a different way of 
accomplishing the same thing that may be more palatable:

- Modify the signature of deps.c:_alpm_resolvedeps so that it takes, 
instead of a list of packages to resolve, a single package.  It would 
return 0 if the resolve succeeded, 1 if the resolve failed, and would 
additionally return a list of dependencies that this package needs for 
this sync.  In the parlance of _alpm_resolvedeps, it would return the 
'pulled' list.  _alpm_resolvedeps would, just as it does now, fail 
immediately when it encountered an unresolvable dependency, and would 
not require the graph structure to keep track of dependents.  It would 
be a much simpler change to _alpm_resolvedeps.

- Modify sync.c:_alpm_sync_prepare so that it calls _alpm_resolvdeps 
once per package in the transaction, rather than calling 
_alpm_resolvedeps once with a list of the packages in the transaction as 
it does now.  As _alpm_resolvedeps is called for each package in the 
transaction, three lists will be built up in _alpm_sync_prepare:

1. The list of transaction packages (i.e. the "top-level" packages) that 
were successfully resolved.

2. The list of transaction packages (i.e. the "top-level" packages) that 
were NOT successfully resolved.

3. The list of "pulled" packages that came from _alpm_resolvedeps for 
each transaction package that was successfully resolved.

After all of the resolution has been done, if the list in (2) is not 
empty, then these packages are used in the new prompt that asks the user 
"do you want to ignore these packages for this transaction", and if they 
say "no", then _alpm_sync_prepare will return a failure status code.  If 
they say "yes", then list (2) will be freed and list (1) and (3) will be 
used to finish _alpm_sync_prepare as normal (after removing duplicates 
from (3) as necessary).

The advantage of this approach is that it doesn't require any new data 
structures or helper functions, and changes _alpm_resolvedeps very little.

The disadvantage of this approach is that it is less efficient at 
runtime.  Each top-level package will be resolved individually, which 
means that some work will be duplicated (i.e. if top-level A and B 
depend on C, and C depends on D and E, then in this new scheme, the 
C-D-E dependencies will be calculated once for A and once for B, whereas 
before, it was only calculated once for A, and then the results re-used 
for B).  I would only expect this to have a significant perfomance 
penalty in large transactions (say a pacman -Su where hundreds of 
packages were being updated), and I'm not sure exactly how "large" the 
transaction would have to be before the penalty became unreasonable.

I *think* these are the only two approaches:

1. Use a data structure to track dependents, thus gaining runtime 
efficiency during the resolve of a list of packages, at the expense of 
extra complexity in the implementation
2. Use no new data structures, instead iterating over each package in 
the transaction individually, thus gaining simplicity of implementation, 
at the expense of runtime efficiency

I have already implemented (1).  If the concern about the code 
complexity outweighs the concern about performance, then I can implement 
(2) instead.

Please let me know how to proceed.  Thanks.

Bryan

T