[pacman-dev] Another idea

Thu Jan 15 16:08:05 EST 2009

> Hey all.  I've been thinking about my patches and the concerns that
> list members have had about them and I've come up with a different
> way of accomplishing the same thing that may be more palatable:
> 
> - Modify the signature of deps.c:_alpm_resolvedeps so that it takes, 
> instead of a list of packages to resolve, a single package.  It would 
> return 0 if the resolve succeeded, 1 if the resolve failed, and would 
> additionally return a list of dependencies that this package needs
> for this sync.  In the parlance of _alpm_resolvedeps, it would return
> the 'pulled' list.  _alpm_resolvedeps would, just as it does now,
> fail immediately when it encountered an unresolvable dependency, and
> would not require the graph structure to keep track of dependents.
> It would be a much simpler change to _alpm_resolvedeps.

This is a good idea imho.

>
> - Modify sync.c:_alpm_sync_prepare so that it calls _alpm_resolvdeps 
> once per package in the transaction, rather than calling 
> _alpm_resolvedeps once with a list of the packages in the transaction
> as it does now.  As _alpm_resolvedeps is called for each package in
> the transaction, three lists will be built up in _alpm_sync_prepare:
> 
> 1. The list of transaction packages (i.e. the "top-level" packages)
> that were successfully resolved.
> 
> 2. The list of transaction packages (i.e. the "top-level" packages)
> that were NOT successfully resolved.
> 
> 3. The list of "pulled" packages that came from _alpm_resolvedeps for 
> each transaction package that was successfully resolved.
> 
> After all of the resolution has been done, if the list in (2) is not 
> empty, then these packages are used in the new prompt that asks the
> user "do you want to ignore these packages for this transaction", and
> if they say "no", then _alpm_sync_prepare will return a failure
> status code.  If they say "yes", then list (2) will be freed and list
> (1) and (3) will be used to finish _alpm_sync_prepare as normal
> (after removing duplicates from (3) as necessary).
> 
> The advantage of this approach is that it doesn't require any new
> data structures or helper functions, and changes _alpm_resolvedeps
> very little.
> 
> The disadvantage of this approach is that it is less efficient at 
> runtime.  Each top-level package will be resolved individually, which 
> means that some work will be duplicated (i.e. if top-level A and B 
> depend on C, and C depends on D and E, then in this new scheme, the 
> C-D-E dependencies will be calculated once for A and once for B,
> whereas before, it was only calculated once for A, and then the
> results re-used for B).  I would only expect this to have a
> significant perfomance penalty in large transactions (say a pacman
> -Su where hundreds of packages were being updated), and I'm not sure
> exactly how "large" the transaction would have to be before the
> penalty became unreasonable.
> 

First of all, if you investigate a package 5 times, pacman may ask the
user 5 times, whether he wants to install a package from IgnorePkg (if
we switch to non-interactive implementation, user will see 5 warnings).
This is odd.

However, if we are tricky, we can almost eliminate this problem imho.
(Although we must complicate the code a bit.) If you can *successfully*
resolve a target package, you _will_ keep all the pulled targets, so you
can push them to the ~resolved_targets list (which should be checked
first, when you resolve dependencies!*). In real life, we can resolve
all targets ~always. Your described problem would appear around
unresolvable targets, which is not a common situation.

*Note: We prefer target list over pkgcache during dependency
resolving. I mean, if you do "pacman -S mplayer libgl-dri" we will
select libgl-dri as a libgl dependency satisfier of mplayer. So if
possible you should select a satisfier package from the
(explicit and resolved_) target list.

> I *think* these are the only two approaches:
> 
> 1. Use a data structure to track dependents, thus gaining runtime 
> efficiency during the resolve of a list of packages, at the expense
> of extra complexity in the implementation
> 2. Use no new data structures, instead iterating over each package in 
> the transaction individually, thus gaining simplicity of
> implementation, at the expense of runtime efficiency
> 
> I have already implemented (1).  If the concern about the code 
> complexity outweighs the concern about performance, then I can
> implement (2) instead.
> 
> Please let me know how to proceed.  Thanks.
> 
> Bryan
>