Hey all. I've been thinking about my patches and the concerns that list members have had about them and I've come up with a different way of accomplishing the same thing that may be more palatable: - Modify the signature of deps.c:_alpm_resolvedeps so that it takes, instead of a list of packages to resolve, a single package. It would return 0 if the resolve succeeded, 1 if the resolve failed, and would additionally return a list of dependencies that this package needs for this sync. In the parlance of _alpm_resolvedeps, it would return the 'pulled' list. _alpm_resolvedeps would, just as it does now, fail immediately when it encountered an unresolvable dependency, and would not require the graph structure to keep track of dependents. It would be a much simpler change to _alpm_resolvedeps. - Modify sync.c:_alpm_sync_prepare so that it calls _alpm_resolvdeps once per package in the transaction, rather than calling _alpm_resolvedeps once with a list of the packages in the transaction as it does now. As _alpm_resolvedeps is called for each package in the transaction, three lists will be built up in _alpm_sync_prepare: 1. The list of transaction packages (i.e. the "top-level" packages) that were successfully resolved. 2. The list of transaction packages (i.e. the "top-level" packages) that were NOT successfully resolved. 3. The list of "pulled" packages that came from _alpm_resolvedeps for each transaction package that was successfully resolved. After all of the resolution has been done, if the list in (2) is not empty, then these packages are used in the new prompt that asks the user "do you want to ignore these packages for this transaction", and if they say "no", then _alpm_sync_prepare will return a failure status code. If they say "yes", then list (2) will be freed and list (1) and (3) will be used to finish _alpm_sync_prepare as normal (after removing duplicates from (3) as necessary). The advantage of this approach is that it doesn't require any new data structures or helper functions, and changes _alpm_resolvedeps very little. The disadvantage of this approach is that it is less efficient at runtime. Each top-level package will be resolved individually, which means that some work will be duplicated (i.e. if top-level A and B depend on C, and C depends on D and E, then in this new scheme, the C-D-E dependencies will be calculated once for A and once for B, whereas before, it was only calculated once for A, and then the results re-used for B). I would only expect this to have a significant perfomance penalty in large transactions (say a pacman -Su where hundreds of packages were being updated), and I'm not sure exactly how "large" the transaction would have to be before the penalty became unreasonable. I *think* these are the only two approaches: 1. Use a data structure to track dependents, thus gaining runtime efficiency during the resolve of a list of packages, at the expense of extra complexity in the implementation 2. Use no new data structures, instead iterating over each package in the transaction individually, thus gaining simplicity of implementation, at the expense of runtime efficiency I have already implemented (1). If the concern about the code complexity outweighs the concern about performance, then I can implement (2) instead. Please let me know how to proceed. Thanks. Bryan T
Hey all. I've been thinking about my patches and the concerns that list members have had about them and I've come up with a different way of accomplishing the same thing that may be more palatable:
- Modify the signature of deps.c:_alpm_resolvedeps so that it takes, instead of a list of packages to resolve, a single package. It would return 0 if the resolve succeeded, 1 if the resolve failed, and would additionally return a list of dependencies that this package needs for this sync. In the parlance of _alpm_resolvedeps, it would return the 'pulled' list. _alpm_resolvedeps would, just as it does now, fail immediately when it encountered an unresolvable dependency, and would not require the graph structure to keep track of dependents. It would be a much simpler change to _alpm_resolvedeps.
This is a good idea imho.
- Modify sync.c:_alpm_sync_prepare so that it calls _alpm_resolvdeps once per package in the transaction, rather than calling _alpm_resolvedeps once with a list of the packages in the transaction as it does now. As _alpm_resolvedeps is called for each package in the transaction, three lists will be built up in _alpm_sync_prepare:
1. The list of transaction packages (i.e. the "top-level" packages) that were successfully resolved.
2. The list of transaction packages (i.e. the "top-level" packages) that were NOT successfully resolved.
3. The list of "pulled" packages that came from _alpm_resolvedeps for each transaction package that was successfully resolved.
After all of the resolution has been done, if the list in (2) is not empty, then these packages are used in the new prompt that asks the user "do you want to ignore these packages for this transaction", and if they say "no", then _alpm_sync_prepare will return a failure status code. If they say "yes", then list (2) will be freed and list (1) and (3) will be used to finish _alpm_sync_prepare as normal (after removing duplicates from (3) as necessary).
The advantage of this approach is that it doesn't require any new data structures or helper functions, and changes _alpm_resolvedeps very little.
The disadvantage of this approach is that it is less efficient at runtime. Each top-level package will be resolved individually, which means that some work will be duplicated (i.e. if top-level A and B depend on C, and C depends on D and E, then in this new scheme, the C-D-E dependencies will be calculated once for A and once for B, whereas before, it was only calculated once for A, and then the results re-used for B). I would only expect this to have a significant perfomance penalty in large transactions (say a pacman -Su where hundreds of packages were being updated), and I'm not sure exactly how "large" the transaction would have to be before the penalty became unreasonable.
First of all, if you investigate a package 5 times, pacman may ask the user 5 times, whether he wants to install a package from IgnorePkg (if we switch to non-interactive implementation, user will see 5 warnings). This is odd. However, if we are tricky, we can almost eliminate this problem imho. (Although we must complicate the code a bit.) If you can *successfully* resolve a target package, you _will_ keep all the pulled targets, so you can push them to the ~resolved_targets list (which should be checked first, when you resolve dependencies!*). In real life, we can resolve all targets ~always. Your described problem would appear around unresolvable targets, which is not a common situation. *Note: We prefer target list over pkgcache during dependency resolving. I mean, if you do "pacman -S mplayer libgl-dri" we will select libgl-dri as a libgl dependency satisfier of mplayer. So if possible you should select a satisfier package from the (explicit and resolved_) target list.
I *think* these are the only two approaches:
1. Use a data structure to track dependents, thus gaining runtime efficiency during the resolve of a list of packages, at the expense of extra complexity in the implementation 2. Use no new data structures, instead iterating over each package in the transaction individually, thus gaining simplicity of implementation, at the expense of runtime efficiency
I have already implemented (1). If the concern about the code complexity outweighs the concern about performance, then I can implement (2) instead.
Please let me know how to proceed. Thanks.
Bryan
On Thu, Jan 15, 2009 at 10:08 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Hey all. I've been thinking about my patches and the concerns that list members have had about them and I've come up with a different way of accomplishing the same thing that may be more palatable:
- Modify the signature of deps.c:_alpm_resolvedeps so that it takes, instead of a list of packages to resolve, a single package. It would return 0 if the resolve succeeded, 1 if the resolve failed, and would additionally return a list of dependencies that this package needs for this sync. In the parlance of _alpm_resolvedeps, it would return the 'pulled' list. _alpm_resolvedeps would, just as it does now, fail immediately when it encountered an unresolvable dependency, and would not require the graph structure to keep track of dependents. It would be a much simpler change to _alpm_resolvedeps.
This is a good idea imho.
It sounds good, but isn't it a bit a step backward after this : http://projects.archlinux.org/?p=pacman.git;a=commit;h=72c0ab5c51d5119b6f81c... However, we were not considering this new behavior in case of unresolvable dependency back then.
Xavier wrote:
On Thu, Jan 15, 2009 at 10:08 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Hey all. I've been thinking about my patches and the concerns that list members have had about them and I've come up with a different way of accomplishing the same thing that may be more palatable:
- Modify the signature of deps.c:_alpm_resolvedeps so that it takes, instead of a list of packages to resolve, a single package. It would return 0 if the resolve succeeded, 1 if the resolve failed, and would additionally return a list of dependencies that this package needs for this sync. In the parlance of _alpm_resolvedeps, it would return the 'pulled' list. _alpm_resolvedeps would, just as it does now, fail immediately when it encountered an unresolvable dependency, and would not require the graph structure to keep track of dependents. It would be a much simpler change to _alpm_resolvedeps.
This is a good idea imho.
It sounds good, but isn't it a bit a step backward after this : http://projects.archlinux.org/?p=pacman.git;a=commit;h=72c0ab5c51d5119b6f81c... However, we were not considering this new behavior in case of unresolvable dependency back then.
I will have to let Nagy comment on this, as I have read but do not really understand the change you are referring to. It looks to me like a big part of his change was factoring out some logic from _alpm_resolvedeps into a separate function (_alpm_resolvedep). What I am proposing does not conflict with this idea, the structure of the functions as they are is still appropriate after my change. He also seems to have fixed some bugs that were due to resolved dependencies not being re-used to satisfy further resolves (I think). But Nagy should give his own thoughts as he understands his change much much better than I do. Thanks, Bryan
It sounds good, but isn't it a bit a step backward after this : http://projects.archlinux.org/?p=pacman.git;a=commit;h=72c0ab5c51d5119b6f81c... However, we were not considering this new behavior in case of unresolvable dependency back then.
We should decide, whether we want this feature or not (before Bryan spend hours with programming something, that we may not want)... On the other hand here is the "should we move to graph structure in the future?" question. If yes, we may not want to make a step back. I don't know, I didn't plan any huge change in deps.c nowadays... Bye
On Thu, Jan 15, 2009 at 6:46 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
It sounds good, but isn't it a bit a step backward after this : http://projects.archlinux.org/?p=pacman.git;a=commit;h=72c0ab5c51d5119b6f81c... However, we were not considering this new behavior in case of unresolvable dependency back then.
We should decide, whether we want this feature or not (before Bryan spend hours with programming something, that we may not want)...
On the other hand here is the "should we move to graph structure in the future?" question. If yes, we may not want to make a step back. I don't know, I didn't plan any huge change in deps.c nowadays...
I'm sorry I've been so silent on all of this- part of it is I don't really know where we do want to head, and adding more prompts and complication just makes me not really want to think about it. I'll leave it to those who have done a lot more work on this section of the code than I to decide whether this is a good idea or not. As long as pacman and libalpm continue to resolve deps in a sane fashion and we don't lose functionality, I am fine. -Dan
Nagy Gabor wrote:
First of all, if you investigate a package 5 times, pacman may ask the user 5 times, whether he wants to install a package from IgnorePkg (if we switch to non-interactive implementation, user will see 5 warnings). This is odd.
However, if we are tricky, we can almost eliminate this problem imho. (Although we must complicate the code a bit.) If you can *successfully* resolve a target package, you _will_ keep all the pulled targets, so you can push them to the ~resolved_targets list (which should be checked first, when you resolve dependencies!*). In real life, we can resolve all targets ~always. Your described problem would appear around unresolvable targets, which is not a common situation.
I think that what you are suggesting is to pass the accumulated "pulled package" list into _alpm_resolvedeps for each subsequent call to that function, so that each subsequent call can first look into the already-resolved list of packages from previous resolves. This is a great addition to my proposal and it eliminates the performance concerns. I will re-implement this way and re-submit my patches. Thanks for your help and your advice. Best wishes, Bryan
participants (4)
-
Bryan Ischo
-
Dan McGee
-
Nagy Gabor
-
Xavier