[aur-general] Please settle 'base' in 'depends' for all
Okay everyone, every time I ask I get a different answer. According to Dziedzic and Allan 'glibc' does *not* belong in 'depends'. Also Dziedzic votes that *no* package in 'base' should be in 'depends'. Can we settle once and for all what the correct policy is? And then can we update the wiki page and all of these packages http://www.archlinux.org/packages/core/i686/glibc/ so that they reflect the policy? --Kaiting. -- Kiwis and Limes: http://kaitocracy.blogspot.com/
On Wed, 19 Jan 2011 00:19:50 -0500, Kaiting Chen wrote:
Okay everyone, every time I ask I get a different answer. According to Dziedzic and Allan 'glibc' does *not* belong in 'depends'. Also Dziedzic votes that *no* package in 'base' should be in 'depends'. Can we settle once and for all what the correct policy is? And then can we update the wiki page and all of these packages http://www.archlinux.org/packages/core/i686/glibc/ so that they reflect the policy? --Kaiting.
Every direct dependency needs to be listed in depends, even dependencies that are in base. This is important because packages might appear in and disappear from the base group and, last but not least, pacman needs to know in which order packages need to be installed. The last point is especially important for the installer. We also cannot assume that people have installed every package from the base group. This is different for packages from base-devel though. They don't need to be listed as makedepends. Greetings, Pierre -- Pierre Schmitz, https://users.archlinux.de/~pierre
On 19/01/11 15:19, Kaiting Chen wrote:
Okay everyone, every time I ask I get a different answer. According to Dziedzic and Allan 'glibc' does *not* belong in 'depends'. Also Dziedzic votes that *no* package in 'base' should be in 'depends'. Can we settle once and for all what the correct policy is? And then can we update the wiki page and all of these packages http://www.archlinux.org/packages/core/i686/glibc/ so that they reflect the policy? --Kaiting.
In general, I think packages in 'base' need to be listed. Mainly because I do not install a fair number of the base packages and would have even fewer of them installed if they were not listed as dependencies. However, I think listing 'glibc' in depends is a waste of time. If a system does not have glibc installed, there are worse issues than a missing dependency for one package... If we want to be really pedantic about dependencies, we should list _ALL_ dependencies and not remove the ones that are dependencies of dependencies. We never know what dependencies will be removed on an update of a package in the dep chain. But we don't do this. Why? Because it means pacman has to make fewer dependency checks and thus the whole update process is a little faster, and it is more convenient to not have to explicitly list everything. For those same reasons, I see no need to list glibc as a dependency, especially for packages in [extra] and [community]. Allan
On Wed, 19 Jan 2011 17:08:27 +1000 Allan McRae <allan@archlinux.org> wrote:
In general, I think packages in 'base' need to be listed. Mainly because I do not install a fair number of the base packages and would have even fewer of them installed if they were not listed as dependencies.
If we allow users to not (explicitly) install base packages and support such schemes by adding more detailed dependencies, then we could just as well scratch the base group, because it becomes pointless. Actually I would prefer this approach: throw the concept of the base group away; all the *needed* packages will get installed anyway, because they are dependencies of packages the user explicitly wants. Dieter
Am 19.01.2011 08:08, schrieb Allan McRae:
If we want to be really pedantic about dependencies, we should list _ALL_ dependencies and not remove the ones that are dependencies of dependencies.
Why don't we just do the correct thing: If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on C indirectly because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't). The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
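The rule proposed above can be illustrated with a small sketch. It is only a toy model: the package names follow the A/B/C example, and the `declared` and `uses` tables and the helper `missing_after_install` are hypothetical, not anything pacman or makepkg provides.

```python
def missing_after_install(target, declared, uses):
    """Return packages `target` accesses directly that installing it would not pull in."""
    installed, stack = set(), [target]
    while stack:
        pkg = stack.pop()
        if pkg not in installed:
            installed.add(pkg)
            # Follow declared dependencies only; that is all a package manager can see.
            stack.extend(declared.get(pkg, ()))
    return set(uses.get(target, ())) - installed


uses = {"A": {"B", "C"}}  # assumption: A accesses both B and C directly

# A relies on B to pull in C; this works while B still depends on C.
print(missing_after_install("A", {"A": ["B"], "B": ["C"]}, uses))   # set()

# B drops C: A silently loses a package it needs.
print(missing_after_install("A", {"A": ["B"], "B": []}, uses))      # {'C'}

# A lists C itself: changes in B's dependencies no longer matter.
print(missing_after_install("A", {"A": ["B", "C"], "B": []}, uses))  # set()
```

Listing exactly what A itself accesses keeps A correct regardless of how B's dependency list changes later.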
On Wed, Jan 19, 2011 at 1:20 PM, Thomas Bächler <thomas@archlinux.org> wrote:
Am 19.01.2011 08:08, schrieb Allan McRae:
If we want to be really pedantic about dependencies, we should list _ALL_ dependencies and not remove the ones that are dependencies of dependencies.
Why don't we just do the correct thing:
+1
If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on indirectly C because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't).
The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
And this check is done by software, not by a "scientist" predicate that varies depending on the experience of the maintainer. -- Sébastien Luttringer www.seblu.net
Am 19.01.2011 13:32, schrieb Seblu:
If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on indirectly C because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't).
The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
And this check is done by a software not by a "scientist" predicate that varies depending on the experience of maintainer.
For library-dependencies on binaries, yes. On scripts it is much harder to check this. I don't think it is possible to cover all cases with a piece of software here, but one should try.
On Wed, Jan 19, 2011 at 1:39 PM, Thomas Bächler <thomas@archlinux.org> wrote:
Am 19.01.2011 13:32, schrieb Seblu:
If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on indirectly C because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't).
The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
And this check is done by a software not by a "scientist" predicate that varies depending on the experience of maintainer.
For library-dependencies on binaries, yes. On scripts it is much harder to check this. I don't think it is possible to cover all cases with a piece of software here, but one should try.
I was not clear. I just wanted to support your example and suggest to Allan that it would be better for pacman to do this job, even if the cost is significant. IMHO, it's better that pacman takes a few seconds more to check complex dependencies than that maintainers do it manually, based on their knowledge of the dependencies at the time. Pacman is also less subject to human error. With our modern computers, I do not see why calculating the dependency graph would take more than a few seconds. -- Sébastien Luttringer www.seblu.net
On Wed, Jan 19, 2011 at 10:57 AM, Seblu <seblu@seblu.net> wrote:
I just wanted to support your example and suggest to Allan that it would be better for pacman to do this job, even if the cost is significant. IMHO, it's better that pacman takes a few seconds more to check complex dependencies than that maintainers do it manually, based on their knowledge of the dependencies at the time. Pacman is also less subject to human error.
I think you misunderstood the problem. Pacman already does that (calculates the graph of dependencies) when installing packages. The problem discussed here is how pacman is informed about those dependencies. The original question was whether it is necessary to specify packages of the base group in the depends array of a PKGBUILD. And that is a harder problem to solve, because the packager must know the program being packaged very well in order to decide what is a direct dependency and what is not. There are tools to help with binaries and libraries, but not for non-linkable dependencies (scripts or tools for processing, for example). -- A: Because it obfuscates the reading. Q: Why is top posting so bad? ------------------------------------------- Denis A. Altoe Falqueto Linux user #524555 -------------------------------------------
On Wed, 19 Jan 2011 13:20:58 +0100 Thomas Bächler <thomas@archlinux.org> wrote:
Am 19.01.2011 08:08, schrieb Allan McRae:
If we want to be really pedantic about dependencies, we should list _ALL_ dependencies and not remove the ones that are dependencies of dependencies.
Why don't we just do the correct thing:
If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on indirectly C because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't).
The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
I'm also a fan of this. The added correctness (both informational and system robustness) justifies the little overhead, imho. Dieter
On 19/01/11 22:20, Thomas Bächler wrote:
Am 19.01.2011 08:08, schrieb Allan McRae:
If we want to be really pedantic about dependencies, we should list _ALL_ dependencies and not remove the ones that are dependencies of dependencies.
Why don't we just do the correct thing:
If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on indirectly C because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't).
The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
I agree that would be the correct thing to do. In fact, I looked at doing this to the extent of including every package that a program linked to in its dependencies. This increases the number of dependencies needed for the average package in the repos greatly (from memory it averaged a several-fold increase). The side effect of that is there is obviously a correspondingly big increase in the number of dependency checks that pacman needs to do for each update and the associated speed hit. I always assumed that we did not list all dependencies for speed reasons. Allan
On Wed, Jan 19, 2011 at 12:50, Allan McRae <allan@archlinux.org> wrote:
I agree that would be the correct thing to do. In fact, I looked at doing this to the extent of including ever package that a program linked to in its dependencies. This increases the number of dependencies needed for the average package in the repos greatly (from memory it averaged a several fold increase).
I don't quite understand what you mean, did you add the transitive closure of all dependencies to the package, or did you only add all direct dependencies?
The side effect of that is there is obviously a correspondingly big increase in the number of dependency checks that pacman needs to do for each update and the associated speed hit. I always assumed that we did not list all dependencies for speed reasons.
Well, if the transitive closure of dependencies is created at package build time, then that computation can be removed from pacman, which should give a bit of a speed-up, I suspect. /M -- Magnus Therning OpenPGP: 0xAB4DFBA4 email: magnus@therning.org jabber: magnus@therning.org twitter: magthe http://therning.org/magnus
On 19/01/11 22:49, Magnus Therning wrote:
On Wed, Jan 19, 2011 at 12:50, Allan McRae<allan@archlinux.org> wrote:
I agree that would be the correct thing to do. In fact, I looked at doing this to the extent of including ever package that a program linked to in its dependencies. This increases the number of dependencies needed for the average package in the repos greatly (from memory it averaged a several fold increase).
I don't quite understand what you mean, did you add the transitive closure of all dependencies to the package, or did you only add all direct dependencies?
Essentially "readelf -d" on the files and add all needed packages to the dependencies. I.e. list all packages that are directly linked. It has been many years since I did graph theory... but isn't a "transitive closure" essentially what we have been doing with only listing the top level of dependencies and having them cover the rest? Allan
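As a rough illustration of the "readelf -d" approach described above, the sketch below pulls the directly linked libraries out of readelf's dynamic-section listing. The helper name `needed_libraries` and the sample text are made up for this example; the sample is an abbreviated imitation of real output, not a capture from any actual package.

```python
import re

# `readelf -d` prints one "(NEEDED)" line per directly linked shared library.
NEEDED = re.compile(r"\(NEEDED\)\s+Shared library: \[(.+?)\]")


def needed_libraries(readelf_output):
    """Extract the sonames from the NEEDED entries of `readelf -d` output."""
    return NEEDED.findall(readelf_output)


# Abbreviated, hand-written sample in the shape of `readelf -d <binary>` output.
sample = """
Dynamic section at offset 0x1e28 contains 24 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x401000
"""

print(needed_libraries(sample))  # ['libz.so.1', 'libc.so.6']
```

Each soname would still have to be mapped to the package that owns the file (for instance with `pacman -Qo` on the library path). As noted elsewhere in the thread, this only covers linked libraries, not scripts or external tools a program calls.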
Am 19.01.2011 14:07, schrieb Allan McRae:
Its has been many years since I did graph theory... but isn't a "transitive closure" essentially what we have been doing with only listing the top level of dependencies and having them cover the rest?
It's the exact opposite. You list all dependencies, and dependencies of dependencies, and ...
On 19/01/11 23:07, Thomas Bächler wrote:
Am 19.01.2011 14:07, schrieb Allan McRae:
Its has been many years since I did graph theory... but isn't a "transitive closure" essentially what we have been doing with only listing the top level of dependencies and having them cover the rest?
It's the exact opposite. You list all dependencies, and dependencies of dependencies, and ...
Ah... OK. then I don't understand this: On 19/01/11 22:49, Magnus Therning wrote:
Well, if the creation of the transitive closure of dependencies is created at package build time, then it can be removed from pacman, that should give a bit of a speed-up I suspect.
When pacman does dependency checks, it checks if the package listed in the dependencies is installed. It does not check if all its dependencies are installed too (as it is assumed that was done at the time the dependency was installed). If we list the transitive closure of dependencies, then pacman has to perform extra checks and so will not give a speed-up. Allan
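The cost argument above can be made concrete with a deliberately simplified model. This is not pacman's code: `unmet_dependencies` is a hypothetical helper that, as described, looks only at the names a package lists and never at their own dependencies.

```python
def unmet_dependencies(pkg, depends, installed):
    """Check only the names pkg lists; return (missing names, lookups made)."""
    listed = depends[pkg]
    missing = [dep for dep in listed if dep not in installed]
    return missing, len(listed)


installed = {"B", "C", "D", "E"}

direct = {"A": ["B", "D"]}              # only what A uses itself
closure = {"A": ["B", "C", "D", "E"]}   # dependencies of dependencies as well

print(unmet_dependencies("A", direct, installed))   # ([], 2)
print(unmet_dependencies("A", closure, installed))  # ([], 4)
```

Both lists give the same answer on a healthy system; the longer list simply costs more lookups per package.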
Am 19.01.2011 14:19, schrieb Allan McRae:
On 19/01/11 23:07, Thomas Bächler wrote:
It's the exact opposite. You list all dependencies, and dependencies of dependencies, and ...
Ah... OK. then I don't understand this:
Don't worry, me neither.
On Wed, 19 Jan 2011 23:19:33 +1000, Allan McRae <allan@archlinux.org> wrote:
Ah... OK. then I don't understand this:
On 19/01/11 22:49, Magnus Therning wrote:
Well, if the creation of the transitive closure of dependencies is created at package build time, then it can be removed from pacman, that should give a bit of a speed-up I suspect.
When pacman does dependency checks, it checks if the package listed in the dependencies is installed. It does not check if all its dependencies are installed too (as it is assumed that was done at the time the dependency was installed). If we list the transitive closure of dependencies, then pacman has to perform extra checks and so will not give a speed-up.
Well, except if you assume that all packages do this perfectly. Then when installing a package with '-S' Pacman can install its dependencies with the equivalent of '-Sd', which will be faster. I find that approach dangerous though... -- Pierre 'catwell' Chapuis
On 19/01/11 23:49, Pierre Chapuis wrote:
On Wed, 19 Jan 2011 23:19:33 +1000, Allan McRae <allan@archlinux.org> wrote:
Ah... OK. then I don't understand this:
On 19/01/11 22:49, Magnus Therning wrote:
Well, if the creation of the transitive closure of dependencies is created at package build time, then it can be removed from pacman, that should give a bit of a speed-up I suspect.
When pacman does dependency checks, it checks if the package listed in the dependencies is installed. It does not check if all its dependencies are installed too (as it is assumed that was done at the time the dependency was installed). If we list the transitive closure of dependencies, then pacman has to perform extra checks and so will not give a speed-up.
Well, except if you assume that all packages do this perfectly. Then when installing a package with '-S' Pacman can install its dependencies with the equivalent of '-Sd', which will be faster.
Huh? How is no dependency checks (-Sd) equivalent to complete dependency checking (-S with a transitive closure of dependencies)? They are polar opposites. Allan
On Wed, 19 Jan 2011 23:59:55 +1000, Allan McRae <allan@archlinux.org> wrote:
Huh? How is no dependency checks (-Sd) equivalent to complete dependency checking (-S with a transitive closure of dependencies)? They are polar opposites.
What I mean is that if a transitive closure of dependencies is performed at packaging time, then there is no need to check for dependencies when installing the original package.

Here is an example:

A depends on B and D
B depends on C
C depends on D and E

Currently the deps will be:

A -> B,D
B -> C
C -> D,E

When installing A, Pacman will:

1) check deps for A, start installing B and D
2) check deps for B and D, start installing C
3) check deps for C, start installing E

With a transitive closure scheme at packaging time, the deps would be:

A -> B,C,D,E
B -> C,D,E
C -> D,E

When installing A, Pacman could simply install B, C, D and E *without* checking their deps (-Sd) because these deps are necessarily already included in those for A.

-- Pierre 'catwell' Chapuis
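For readers who want to reproduce the closure in this example, here is a minimal sketch. The function name `transitive_closure` is chosen for illustration, and the code assumes the dependency graph has no cycles.

```python
def transitive_closure(deps):
    """Map each package to every package reachable through its dependencies.

    Assumes the graph is acyclic, as in the A..E example.
    """
    closure = {}

    def reach(pkg):
        if pkg not in closure:
            closure[pkg] = set()
            for dep in deps.get(pkg, ()):
                closure[pkg].add(dep)
                closure[pkg] |= reach(dep)
        return closure[pkg]

    for pkg in deps:
        reach(pkg)
    return closure


direct = {"A": ["B", "D"], "B": ["C"], "C": ["D", "E"]}
closed = transitive_closure(direct)

for pkg in "ABC":
    print(pkg, "->", ",".join(sorted(closed[pkg])))
# A -> B,C,D,E
# B -> C,D,E
# C -> D,E
```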
On Wed, Jan 19, 2011 at 2:07 PM, Pierre Chapuis <catwell@archlinux.us> wrote:
Here is an example:
A depends on B and D
B depends on C
C depends on D and E

Currently the deps will be:

A -> B,D
B -> C
C -> D,E

When installing A, Pacman will:

1) check deps for A, start installing B and D
2) check deps for B and D, start installing C
3) check deps for C, start installing E

With a transitive closure scheme at packaging time, the deps would be:

A -> B,C,D,E
B -> C,D,E
C -> D,E
When installing A, Pacman could simply install B, C, D and E *without* checking their deps (-Sd) because these deps are necessarily already included in those for A.
-- Pierre 'catwell' Chapuis
If B and D are already installed, the current implementation does 2 dependency checks and the transitive closure makes 4. Also, doesn't pacman need to know that C, D and E must be installed before B? Your approach would probably install B before its dependencies.
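The ordering concern can be sketched as a depth-first walk that emits each package only after its dependencies. This is an illustrative model with a hypothetical helper `install_order`, assuming an acyclic graph; it does not describe how pacman actually sorts a transaction.

```python
def install_order(target, deps):
    """Return packages in an order where every dependency precedes its dependents."""
    order, seen = [], set()

    def visit(pkg):
        if pkg in seen:
            return
        seen.add(pkg)
        for dep in deps.get(pkg, ()):
            visit(dep)
        order.append(pkg)  # appended only after all of its dependencies

    visit(target)
    return order


direct = {"A": ["B", "D"], "B": ["C"], "C": ["D", "E"]}
print(install_order("A", direct))  # ['D', 'E', 'C', 'B', 'A']
```

The order comes from reading each package's own dependency list. Installing A's flat closure list as written (B, C, D, E) without consulting those lists would put B ahead of C, D and E.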
Le 19 janvier 2011 09:07:33, Pierre Chapuis a écrit :
On Wed, 19 Jan 2011 23:59:55 +1000, Allan McRae <allan@archlinux.org> wrote:
Huh? How is no dependency checks (-Sd) equivalent to complete dependency checking (-S with a transitive closure of dependencies)? They are polar opposites.
What I mean is that if a transitive closure of dependencies is performed at packaging time, then there is no need to check for dependencies when installing the original package.
Here is an example:
A depends on B and D
B depends on C
C depends on D and E

Currently the deps will be:

A -> B,D
B -> C
C -> D,E

When installing A, Pacman will:

1) check deps for A, start installing B and D
2) check deps for B and D, start installing C
3) check deps for C, start installing E

With a transitive closure scheme at packaging time, the deps would be:

A -> B,C,D,E
B -> C,D,E
C -> D,E
When installing A, Pacman could simply install B, C, D and E *without* checking their deps (-Sd) because these deps are necessarily already included in those for A.
As the maintainer of A, it is not your job to track dependencies of B and D.

Again, look at the problem from a different point of view. If tomorrow the dependencies of B change to

B -> C F (direct dependencies)

does it mean that A (and **all** other pkgs that depend on B) should be updated to include a dependency on F? What if the dependency on E is removed from C's PKGBUILD? Maintaining a package with such rules would be a nightmare.

Stéphane
On 19 January 2011 22:23, Stéphane Gaudreault <stephane@archlinux.org> wrote:
As the maintainer of A, it is not your job to track dependencies of B and D.
Again, look at the problem from a different point of view. If tomorrow dependencies of B change to
B -> C F (direct dependecies)
does it mean that A (and **all** other pkgs that depends on B) should be updated to include a dependecy on F ? What if dependency on E is removed from C PKGBUILD ? Maintaining a package with such rules will be a nightmare.
If A depends on B AND C while B depends on C, then it is correct to list both B and C as dependencies of A. That is as proper as it gets, and this most of us do not practise currently. Aside from any technical disadvantage, it is just clutter to my eyes. It is also correct to list only B as a dependency of A, since B would pull in C, but this correctness is only assumed correctness, provided link-level dependency has not been checked. This, most of us do currently. PKGBUILDs and pacman dependency lists are easy to look at. I don't see a need to 'settle' this one. You may not list glibc because it simply makes no sense to not have it at the time of installation. It can be as far deep down as F, but ultimately it is the packagers' (and community's) responsibility to incorporate dependency changes. As a community, changes like this are hard to miss. C getting out of B's dependency chain does not happen a lot. When it does, someone can report it. So we (should) stick to the simple(r) way - the current way.
On Wed, Jan 19, 2011 at 10:19 AM, Ray Rashif <schiv@archlinux.org> wrote:
I don't see a need to 'settle' this one. You may not list glibc because it simply makes no sense to not have it at the time of installation. It can be as far deep down as F, but ultimately it is the packagers' (and community's) responsibility to incorporate dependency changes. As a community, changes like this are hard to miss. C getting out of B's dependency chain does not happen a lot. When it does, someone can report it. So we (should) stick to the simple(r) way - the current way.
Um the current way is that everyone disagrees about what to do and then just does whatever they feel like doing. --Kaiting. -- Kiwis and Limes: http://kaitocracy.blogspot.com/
On Wed, Jan 19, 2011 at 1:15 PM, Kaiting Chen <kaitocracy@gmail.com> wrote:
On Wed, Jan 19, 2011 at 10:19 AM, Ray Rashif <schiv@archlinux.org> wrote:
I don't see a need to 'settle' this one. You may not list glibc because it simply makes no sense to not have it at the time of installation. It can be as far deep down as F, but ultimately it is the packagers' (and community's) responsibility to incorporate dependency changes. As a community, changes like this are hard to miss. C getting out of B's dependency chain does not happen a lot. When it does, someone can report it. So we (should) stick to the simple(r) way - the current way.
Um the current way is that everyone disagrees about what to do and then just does whatever they feel like doing. --Kaiting.
-- Kiwis and Limes: http://kaitocracy.blogspot.com/
In my opinion, we need to create another group called 'base-core' which will be guaranteed to be installed like base-devel is guaranteed to be installed when making a package. I think this should be the way to go because a lot of packages in base aren't really needed in a minimal system.
On 20/01/11 00:07, Pierre Chapuis wrote:
On Wed, 19 Jan 2011 23:59:55 +1000, Allan McRae <allan@archlinux.org> wrote:
Huh? How is no dependency checks (-Sd) equivalent to complete dependency checking (-S with a transitive closure of dependencies)? They are polar opposites.
What I mean is that if a transitive closure of dependencies is performed at packaging time, then there is no need to check for dependencies when installing the original package.
Here is an example:
A depends on B and D
B depends on C
C depends on D and E

Currently the deps will be:

A -> B,D
B -> C
C -> D,E

When installing A, Pacman will:

1) check deps for A, start installing B and D
2) check deps for B and D, start installing C
3) check deps for C, start installing E

With a transitive closure scheme at packaging time, the deps would be:

A -> B,C,D,E
B -> C,D,E
C -> D,E
When installing A, Pacman could simply install B, C, D and E *without* checking their deps (-Sd) because these deps are necessarily already included in those for A.
The problem is that the transitive closure can not be assumed to be correct. e.g. At the time A is built:

A -> B,C,D,E
B -> C,D,E
C -> D,E

Then B is updated and B -> C,D,E,F. Now assuming a transitive closure for the dependency list of A is incorrect. Installing the listed dependencies of A with the equivalent of -Sd would result in F not being installed, which would break A through a broken B.

So either:

1) we require a largely unnecessary rebuild of A
2) we always check the dependencies of uninstalled dependencies.

Note #2 is less burden on packagers and is more efficient in the examples given above if both B and D are installed (two checks vs four), and that will be the case for most system updates. When none of A - E are installed, they are probably equally efficient.

Allan
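The staleness problem described here can be reproduced in a few lines. The sketch is a toy model: `closure_of` is a hypothetical helper, and "recorded at build time" simply means a set computed before B changes.

```python
def closure_of(pkg, deps):
    """Every package reachable from pkg through the given dependency lists."""
    seen, stack = set(), list(deps.get(pkg, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps.get(dep, ()))
    return seen


deps = {"A": ["B", "D"], "B": ["C"], "C": ["D", "E"]}

recorded_for_a = closure_of("A", deps)  # frozen into A when A is built
deps["B"] = ["C", "F"]                  # B is updated later and gains F

stale = closure_of("A", deps) - recorded_for_a
print(sorted(stale))  # ['F']
```

Anything in `stale` is a package that an install trusting A's recorded list alone would skip, which is why either A needs a rebuild or the dependencies of uninstalled dependencies still have to be checked.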
On Wed, Jan 19, 2011 at 13:07, Allan McRae <allan@archlinux.org> wrote:
Essentially "readelf -d" on the files and add all needed packages to the dependencies. I.e. list all packages that are directly linked.
Its has been many years since I did graph theory... but isn't a "transitive closure" essentially what we have been doing with only listing the top level of dependencies and having them cover the rest?
Nope, it's the "opposite":

• A depends on B
• B depends on C

If the PKGBUILD for A lists the transitive closure, then it would have

depends=(B C)

As we do now the transitive closure is calculated by pacman in order to make sure all dependencies are installed.

/M -- Magnus Therning OpenPGP: 0xAB4DFBA4 email: magnus@therning.org jabber: magnus@therning.org twitter: magthe http://therning.org/magnus
On 19/01/11 23:09, Magnus Therning wrote:
Nope, it's the "opposite":
• A depends on B
• B depends on C
If the PKGBUILD for A lists the transitive closure, then it would have
depends=(B C)
As we do now the transitive closure is calculated by pacman in order to make sure all dependencies are installed.
Nope. We currently list depends=(B) and pacman just checks B is installed. Allan
On Wed, Jan 19, 2011 at 13:21, Allan McRae <allan@archlinux.org> wrote:
On 19/01/11 23:09, Magnus Therning wrote:
On Wed, Jan 19, 2011 at 13:07, Allan McRae<allan@archlinux.org> wrote:
On 19/01/11 22:49, Magnus Therning wrote:
On Wed, Jan 19, 2011 at 12:50, Allan McRae<allan@archlinux.org> wrote:
On 19/01/11 22:20, Thomas Bächler wrote:
Am 19.01.2011 08:08, schrieb Allan McRae: > > If we want to be really pedantic about dependencies, we should list > _ALL_ dependencies and not remove the ones that are dependencies of > dependencies.
Why don't we just do the correct thing:
If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on indirectly C because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't).
The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
I agree that would be the correct thing to do. In fact, I looked at doing this to the extent of including ever package that a program linked to in its dependencies. This increases the number of dependencies needed for the average package in the repos greatly (from memory it averaged a several fold increase).
I don't quite understand what you mean, did you add the transitive closure of all dependencies to the package, or did you only add all direct dependencies?
Essentially "readelf -d" on the files and add all needed packages to the dependencies. I.e. list all packages that are directly linked.
It has been many years since I did graph theory... but isn't a "transitive closure" essentially what we have been doing with only listing the top level of dependencies and having them cover the rest?
Nope, it's the "opposite":
• A depends on B • B depends on C
If the PKGBUILD for A lists the transitive closure, then it would have
depends=(B C)
As we do now the transitive closure is calculated by pacman in order to make sure all dependencies are installed.
Nope. We currently list depends=(B) and pacman just checks B is installed.
All right, I need to clarify. If B *isn't* installed, then pacman will install both B *and* C; and there's the transitive closure. /M -- Magnus Therning OpenPGP: 0xAB4DFBA4 email: magnus@therning.org jabber: magnus@therning.org twitter: magthe http://therning.org/magnus
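To make the distinction above concrete, here is a minimal sketch in POSIX sh. It uses the thread's toy packages A, B and C; the `direct_deps` table and the `closure` helper are hypothetical illustrations and say nothing about how pacman is actually implemented. It only shows that a PKGBUILD listing depends=(B) still yields the set "B C" once the dependencies are followed recursively.

```shell
#!/bin/sh
# Toy dependency table (hypothetical packages from the thread):
# A depends on B, B depends on C, C depends on nothing.
direct_deps() {
    case "$1" in
        A) echo "B" ;;
        B) echo "C" ;;
        *) echo "" ;;
    esac
}

# Print the transitive closure of a package's dependencies, one per line,
# in depth-first order. Each package is printed at most once.
closure() {
    seen=""
    _walk "$1"
}

_walk() {
    for dep in $(direct_deps "$1"); do
        case " $seen " in
            *" $dep "*) continue ;;   # already visited
        esac
        seen="$seen $dep"
        echo "$dep"
        _walk "$dep"
    done
}

# Direct dependencies of A, i.e. what the PKGBUILD would list:
direct_deps A
# Everything that must end up installed for A to work:
closure A
```

The direct list is what a packager writes down; the closure is what a resolver has to compute when B is not yet installed.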
On 19 January 2011 08:07:00, Allan McRae wrote:
On 19/01/11 22:49, Magnus Therning wrote:
On Wed, Jan 19, 2011 at 12:50, Allan McRae<allan@archlinux.org> wrote:
On 19/01/11 22:20, Thomas Bächler wrote:
On 19.01.2011 08:08, Allan McRae wrote:
If we want to be really pedantic about dependencies, we should list _ALL_ dependencies and not remove the ones that are dependencies of dependencies.
Why don't we just do the correct thing:
If package A depends on package B, and B depends on C, then A might depend on C explicitly because it accesses C directly. Or it might only depend on C indirectly because B accesses C. We should reflect that in dependencies (in the first case, A depends on C, in the second case it doesn't).
The result is this: Whenever the dependencies of B change (e.g., C is removed), A will still work correctly.
I agree that would be the correct thing to do. In fact, I looked at doing this to the extent of including every package that a program linked to in its dependencies. This increases the number of dependencies needed for the average package in the repos greatly (from memory it averaged a several-fold increase).
I don't quite understand what you mean, did you add the transitive closure of all dependencies to the package, or did you only add all direct dependencies?
Essentially "readelf -d" on the files and add all needed packages to the dependencies. I.e. list all packages that are directly linked.
It has been many years since I did graph theory... but isn't a "transitive closure" essentially what we have been doing with only listing the top level of dependencies and having them cover the rest?
Allan
I think Allan is right here. If we look at the problem from another angle, things are very simple:
1) There is a group of packages that are required. These packages are necessary for the proper functioning of the system (e.g. a kernel, a boot loader, initscript, glibc, etc.). The system will not run well or be usable without them. It is not necessary for other packages to depend on them, because they **should** be there (although it does not hurt if a package depends on them).
2) Starting from this base, package A depends on package B if B absolutely must be installed in order to run A. In some cases, A depends not only on B, but on a version of B. In this case, the version dependency is usually a lower limit, in the sense that A depends on any version of B more recent than some specified version.
This gives a simple recipe: when you want to list the dependencies of a package, simply look at what is directly used (for binaries it is essentially "readelf -d" on the files) and you get the dependency list for your package. You can then assume that everything will be correct, as the maintainers of the listed packages did the same, up to the required group. If there is something missing in the dependencies of your dependencies, send a bug report.
Stéphane
On Wed, Jan 19, 2011 at 2:30 PM, Stéphane Gaudreault <stephane@archlinux.org> wrote:
This gives a simple recipe: when you want to list the dependencies of a package, simply look at what is directly used (for binaries it is essentially "readelf -d" on the files) and you get the dependency list for your package. You can then assume that everything will be correct, as the maintainers of the listed packages did the same, up to the required group.
It means then that if we have this (the dependencies listed are direct dependencies):
- Package A: depends=(B C)
- Package B: depends=(C)
C should *not* be removed from the dependency array of A.
-- Cédric Girard
On 19 January 2011 08:36:04, Cédric Girard wrote:
On Wed, Jan 19, 2011 at 2:30 PM, Stéphane Gaudreault <stephane@archlinux.org> wrote:
This gives a simple recipe: when you want to list the dependencies of a package, simply look at what is directly used (for binaries it is essentially "readelf -d" on the files) and you get the dependency list for your package. You can then assume that everything will be correct, as the maintainers of the listed packages did the same, up to the required group.
It means then that if we have this (the dependencies listed are direct dependencies):
- Package A: depends=(B C)
- Package B: depends=(C)
C should *not* be removed from the dependency array of A.
Exactly, you list what you directly use. It also means that if we have this (the dependencies listed are direct dependencies):
- Package A: depends=(B C)
- Package B: depends=(D)
the maintainer of package A should not worry about the dependencies of B unless something is broken. In that case, simply file a bug report.
Stéphane
Cédric Girard wrote:
It means then that if we have this (the dependencies listed are direct dependencies):
- Package A: depends=(B C)
- Package B: depends=(C)
C should *not* be removed from the dependency array of A.
I agree with this. A package should list as its dependencies any package on which it directly depends, and only those packages. There is no need to create a "core" group for dependency resolution, e.g. one that would include the kernel and other necessary packages for a minimal system.* Those aren't direct dependencies, as the chroot example shows. Of course, any package that e.g. requires kernel headers should list linux-api-headers as a dep, even if it is a dep of glibc and generally assumed to be on the system.
The argument that some packages are guaranteed to be on the system and thus checking for them is just a waste of time is a bad one. Checking assures correctness of the dependency graph, and the overhead is negligible because if the package is installed, as it almost always will be, then all pacman has to do is check that the package is listed in the local db, which probably amounts to testing for membership in an internal hash or list of less than 1000 members.
If everyone were to use implicit dependencies then pacman would fail because no package would specify the required dependency. A rule that would break the system if it were followed by everyone is a bad rule. Expecting some to follow it and others not to and just hoping that everything will keep working is simply bad practice. It's not minimalist... it's just lazy.
* It might be useful to have a group that installs a minimalist system and pacman, i.e. the smallest package set that can boot to a prompt and let the user install packages.
On 21/01/11 22:38, Xyne wrote:
If everyone were to use implicit dependencies then pacman would fail because no package would specify the required dependency. A rule that would break the system if it were followed by everyone is a bad rule. Expecting some to follow it and others not to and just hoping that everyone will keep working is simply bad practice. It's not minimalist... it's just lazy.
I pointed out that hard rules are not good. E.g. coreutils should (and does) depend on glibc, as it is not guaranteed that glibc is installed at the time when you first install coreutils (which is likely the initial install). But there is no point putting glibc in the depends list for (e.g.) openoffice-base, as it will be installed by that stage.
Two points to consider:
1) How much more complicated would it be to list all dependencies?
readelf -d $(pacman -Qql openoffice-base) 2>/dev/null | grep NEEDED | sort | uniq | wc -l
150
That is a lot of libraries... although some will be in the same package, so that is an upper estimate. But that is only libraries, and the complete dep list will be longer than that.
2) Is it worth the effort? We have very few bug reports about missing dependencies and most (all?) of those fall into the category of missed soname bumps or due to people not building in chroots. I.e. these are because of poor packaging and not because we make assumptions about what packages are installed or the dependencies of dependencies.
So I see making a change to the current approach as making things (1) more complicated for (2) no real benefit.
Allan
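Since several of those libraries may belong to the same package, one way to move from a library count to a package count is to map each soname to its owning package and de-duplicate. The following is only a sketch: `owning_packages`, `OWNER_LOOKUP` and `LIBDIR` are hypothetical names, and it assumes that the libraries live in /usr/lib. `pacman -Qqo FILE` prints the name of the installed package that owns FILE.

```shell
#!/bin/sh
# Command used to find the package owning a file. It is kept overridable so
# the pipeline can be tried on a machine without pacman.
OWNER_LOOKUP=${OWNER_LOOKUP:-"pacman -Qqo"}

# Assumption: shared libraries are installed in /usr/lib.
LIBDIR=${LIBDIR:-/usr/lib}

# Read sonames (one per line) on stdin and print the unique owning packages.
# Sonames whose owner cannot be determined are silently skipped.
owning_packages() {
    while IFS= read -r soname; do
        $OWNER_LOOKUP "$LIBDIR/$soname" 2>/dev/null </dev/null
    done | sort -u
}
```

Feeding it the sonames extracted from the NEEDED entries and piping the result through `wc -l` would give a package count rather than a library count. Libraries installed outside the assumed directory would be missed, so the result is a lower bound on the directly linked packages.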
Allan McRae wrote:
I pointed out that hard rules are not good. e.g. coreutils should (and does) depend on glibc as it is not guaranteed that glibc is installed at the time when you first install coreutils (which is likely the initial install). But there is no point putting glibc in the depends list for (e.g.) openoffice-base as it will be installed by that stage.
That's irrelevant to this discussion because it's a bootstrapping issue. I don't know how cyclical dependencies and other initialization problems should be handled, but they constitute a special case that should be detected and dealt with separately.
Two points to consider: 1) How much more complicated would it be to list all dependencies?
readelf -d $(pacman -Qql openoffice-base) 2>/dev/null | grep NEEDED | sort | uniq | wc -l
150
That is a lot of libraries... although some will be in the same package so that is an upper estimate. But that is only libraries and the complete dep list will be longer than that.
I agree that is a lot. Of course we can't reasonably expect a packager to manually enter 150 libraries into a PKGBUILD, but are all of those direct dependencies? Maybe this is a silly question due to my ignorance of linking, but are any of those libraries linked via other packages? For example, if bar provides code that links to baz, and foo builds against bar, would baz turn up in the readelf output for foo? If the answer is yes, then baz would not be a dep of foo, even if foo links to it, because the linking was established "indirectly", i.e. bar could have used something else. Of course, in that case, baz would be a strict runtime dependency (unless sodeps could resolve this, but again, my understanding here is limited), but from a graph-theory pov, foo would only depend on bar. Such a situation would only require a rebuild, just as it would now (i.e. if baz were replaced by something else).
2) Is it worth the effort? We have very few bug reports about missing dependencies and most (all?) of those fall into the category of missed soname bumps or due to people not building in chroots. I.e. these are because of poor packaging and not because we make assumptions about what packages are installed or the dependencies of dependencies.
So I see making a change to the current approach as making things (1) more complicated for (2) no real benefit.
The answer depends on the answer to my previous question. The current system does indeed work, but it provides no strict guarantees. I think good practice in general is to make something that is critical as reliable and future-proof as possible, and I see that as a true benefit. It's like wanting to agree upon an open specification instead of just letting everyone do it their way and hope for a triumph of common sense. Admittedly, I doubt it would be a problem in the future and I'm discussing this idealistically. As for complication, even if there were a large number of deps to consider, there would likely be ways to generate at least a tentative list using simple tools. It could then be refined through feedback.*
On 22/01/11 00:43, Xyne wrote:
Allan McRae wrote:
I pointed out that hard rules are not good. e.g. coreutils should (and does) depend on glibc as it is not guaranteed that glibc is installed at the time when you first install coreutils (which is likely the initial install). But there is no point putting glibc in the depends list for (e.g.) openoffice-base as it will be installed by that stage.
That's irrelevant to this discussion because it's a bootstrapping issue. I don't know how cyclical dependencies and other initialization problems should be handled, but they constitute a special case that should be detected and dealt with separately.
Isn't this exactly the issue here? The original question was whether we should include glibc in the dependency list for a package. I pointed out a case where including glibc in the depends is critical and a case where it is a waste of time, indicating there is no one answer to such a question.
Two points to consider: 1) How much more complicated would it be to list all dependencies?
readelf -d $(pacman -Qql openoffice-base) 2>/dev/null | grep NEEDED | sort | uniq | wc -l
150
That is a lot of libraries... although some will be in the same package so that is an upper estimate. But that is only libraries and the complete dep list will be longer than that.
I agree that is a lot. Of course we can't reasonably expect a packager to manually enter 150 libraries into a PKGBUILD, but are all of those direct dependencies? Maybe this is a silly question due to my ignorance of linking, but are any of those libraries linked via other packages? For example, if bar provides code that links to baz, and foo builds against bar, would baz turn up in the readelf output for foo? If the answer is yes, then baz would not be a dep of foo, even if foo links to it, because the linking was established "indirectly", i.e. bar could have used something else.
Of course, in that case, baz would be a strict runtime dependency (unless sodeps could resolve this, but again, my understanding here is limited), but from a graph-theory pov, foo would only depend on bar. Such a situation would only require a rebuild, just as it would now (i.e. if baz were replaced by something else).
The answer is no. "readelf -d" only lists directly linked libraries. "ldd" gives the entire link chain.
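The difference between the two tools can be seen by running both on the same binary. This is a rough sketch: the helper names `parse_needed`, `direct_libs` and `all_libs` are made up for illustration, and /usr/bin/pacman in the usage note is just an example path.

```shell
#!/bin/sh
# Extract sonames from the NEEDED entries of `readelf -d` output on stdin.
# A typical entry looks like:
#   0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
parse_needed() {
    sed -n 's/.*(NEEDED).*\[\(.*\)]/\1/p' | sort -u
}

# Libraries the file itself was linked against (direct dependencies only).
direct_libs() {
    readelf -d "$1" 2>/dev/null | parse_needed
}

# Everything the dynamic loader would pull in, including libraries that are
# only needed by other libraries (the entire link chain).
all_libs() {
    ldd "$1" 2>/dev/null | awk '{print $1}' | sort -u
}
```

Running `direct_libs /usr/bin/pacman` and `all_libs /usr/bin/pacman` side by side should show the second list containing the first plus the indirect libraries. Note that ldd reports some entries by path (such as the dynamic loader itself), so the two lists are not directly comparable line for line.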
2) Is it worth the effort? We have very few bug reports about missing dependencies and most (all?) of those fall into the category of missed soname bumps or due to people not building in chroots. I.e. these are because of poor packaging and not because we make assumptions about what packages are installed or the dependencies of dependencies.
So I see making a change to the current approach as making things (1) more complicated for (2) no real benefit.
The answer depends on the answer to my previous question. The current system does indeed work, but it provides no strict guarantees. I think good practice in general is to make something that is critical as reliable and future-proof as possible, and I see that as a true benefit. It's like wanting to agree upon an open specification instead of just letting everyone do it their way and hope for a triumph of common sense.
Admittedly, I doubt it would be a problem in the future and I'm discussing this idealistically.
Idealistically, I might even agree with you. I just think it is not a practical thing to do. And if we move away from just using binary libraries as examples, we can find situations where we can make guarantees that a dep of a dep will never be removed. E.g. I have some Perl software that depends on the perl-foo module. If we list all dependencies, I would need to list perl and perl-foo. But I can guarantee that perl-foo will always depend on perl, so do I really need to list perl as a dep?
As for complication, even if there were a large number of deps to consider, there would likely be ways to generate at least a tentative list using simple tools. It could then be refined through feedback.*
Again, I think this is too idealistic. Our current dependency checking tool (namcap) has a lot of issues determining the dependencies of a package and it has not been refined much at all...
Allan
On 2011-01-22 01:29 +1000 (03:6) Allan McRae wrote:
On 22/01/11 00:43, Xyne wrote:
Allan McRae wrote:
I pointed out that hard rules are not good. e.g. coreutils should (and does) depend on glibc as it is not guaranteed that glibc is installed at the time when you first install coreutils (which is likely the initial install). But there is no point putting glibc in the depends list for (e.g.) openoffice-base as it will be installed by that stage.
That's irrelevant to this discussion because it's a bootstrapping issue. I don't know how cyclical dependencies and other initialization problems should be handled, but they constitute a special case that should be detected and dealt with separately.
Isn't this exactly the issue here? The original question was whether we should include glibc in the dependency list for a package. I pointed out a case where including glibc in the depends is critical and a case where it is a waste of time, indicating there is no one answer to such a question.
Sorry, I misread your reply. Ignore what I wrote.
Two points to consider: 1) How much more complicated would it be to list all dependencies?
readelf -d $(pacman -Qql openoffice-base) 2>/dev/null | grep NEEDED | sort | uniq | wc -l
150
That is a lot of libraries... although some will be in the same package so that is an upper estimate. But that is only libraries and the complete dep list will be longer than that.
I agree that is a lot. Of course we can't reasonably expect a packager to manually enter 150 libraries into a PKGBUILD, but are all of those direct dependencies? Maybe this is a silly question due to my ignorance of linking, but are any of those libraries linked via other packages? For example, if bar provides code that links to baz, and foo builds against bar, would baz turn up in the readelf output for foo? If the answer is yes, then baz would not be a dep of foo, even if foo links to it, because the linking was established "indirectly", i.e. bar could have used something else.
Of course, in that case, baz would be a strict runtime dependency (unless sodeps could resolve this, but again, my understanding here is limited), but from a graph-theory pov, foo would only depend on bar. Such a situation would only require a rebuild, just as it would now (i.e. if baz were replaced by something else).
The answer is no. "readelf -d" only lists directly linked libraries. "ldd" gives the entire link chain.
So if I wrote bindings to libalpm in Haskell (haskell-libalpm) and then created a package with a binary that used those bindings (foo), then readelf's output would not indicate libalpm?
2) Is it worth the effort? We have very few bug reports about missing dependencies and most (all?) of those fall into the category of missed soname bumps or due to people not building in chroots. I.e. these are because of poor packaging and not because we make assumptions about what packages are installed or the dependencies of dependencies.
So I see making a change to the current approach as making things (1) more complicated for (2) no real benefit.
The answer depends on the answer to my previous question. The current system does indeed work, but it provides no strict guarantees. I think good practice in general is to make something that is critical as reliable and future-proof as possible, and I see that as a true benefit. It's like wanting to agree upon an open specification instead of just letting everyone do it their way and hope for a triumph of common sense.
Admittedly, I doubt it would be a problem in the future and I'm discussing this idealistically.
Idealistically, I might even agree with you. I just think it is not a practical thing to do. And if we move away from just using binary libraries as examples, we can find situations where we can make guarantees that a dep of a dep will never be removed.
E.g. I have some Perl software that depends on the perl-foo module. If we list all dependencies, I would need to list perl and perl-foo. But I can guarantee that perl-foo will always depend on perl, so do I really need to list perl as a dep?
Well, I list perl as a dep even for packages that depend on perl-* packages, but foo-* packages are a special case in that they necessarily depend on foo further down the chain. The same cannot be said for other implicit dependencies.
As for complication, even if there were a large number of deps to consider, there would likely be ways to generate at least a tentative list using simple tools. It could then be refined through feedback.*
Again, I think this is too idealistic. Our current dependency checking tool (namcap) has a lot of issues determining the dependencies of a package and it has not been refined much at all...
Maybe not "simple" tools then, but it should be possible to create something that can generate a list, e.g. a sandbox tool that can determine which libraries etc. are accessed at build time and run time, which could then be mapped back to packages.
Forget that for a moment though and answer this instead: Can you think of any way other than direct specification that would guarantee that all dependencies are installed with a package (presuming that we know exactly what a package depends on)? E.g. if a package depends on foo and foo-bar, then foo-bar clearly suffices, but how would you formally guarantee something such as glibc?
On 22/01/11 01:57, Xyne wrote:
So if I wrote bindings to libalpm in Haskell (haskell-libalpm) and then created a package with a binary that used those bindings (foo), then readelf's output would not indicate libalpm?
Short answer is probably not... especially if you use -Wl,--as-needed. Looking at the "readelf -d" output for pacman and libalpm.so might be informative to understanding this. <snip>
Forget that for a moment though and answer this instead: Can you think of any way other than direct specification that would guarantee that all dependencies are installed with a package (presuming that we know exactly what a package depends on)? E.g. if a package depends on foo and foo-bar, then foo-bar clearly suffices, but how would you formally guarantee something such as glibc?
If an Arch system can natively install packages with pacman, then I can guarantee that glibc is on that system.
Allan
On 2011-01-19 at 08:30:14 -0500, Stéphane Gaudreault wrote:
1) There is a group of packages that are required. These packages are necessary for the proper functioning of the system (e.g. a kernel, a boot loader, initscript, glibc, etc.). The system will not run well or be usable without them. It is not necessary for other packages to depend on them, because they **should** be there (although it does not hurt if a package depends on them).
kernel and bootloader aren't really required, e.g. in a chroot; I don't think that such a list would include much besides glibc and pacman with its dependencies (otherwise it wouldn't be Arch)
-- Elena ``of Valhalla'' homepage: http://www.trueelena.org
On Thu, 2011-01-20 at 11:29 +0100, Elena ``of Valhalla'' wrote:
On 2011-01-19 at 08:30:14 -0500, Stéphane Gaudreault wrote:
1) There is a group of packages that are required. These packages are necessary for the proper functioning of the system (e.g. a kernel, a boot loader, initscript, glibc, etc.). The system will not run well or be usable without them. It is not necessary for other packages to depend on them, because they **should** be there (although it does not hurt if a package depends on them).
kernel and bootloader aren't really required, e.g. in a chroot; I don't think that such a list would include much besides glibc and pacman with its dependencies (otherwise it wouldn't be Arch)
That depends, if you're building (for example) external kernel modules the kernel would of course be required.
As a reference, Red Hat/Fedora have this same problem: the packages which need not be included as deps are the packages used when creating the chroot on the Fedora build server, koji. This list is very short; give me a minute and I will dig it up, but it is only about 10 packages long. Personally though, my vote is for base-devel and a subset of base.
On 19/01/2011 07:19, Kaiting Chen wrote:
Okay everyone, every time I ask I get a different answer. According to Dziedzic and Allan 'glibc' does *not* belong in 'depends'. Also Dziedzic votes that *no* package in 'base' should be in 'depends'. Can we settle once and for all what the correct policy is? And then can we update the wiki page and all of these packages http://www.archlinux.org/packages/core/i686/glibc/ so that they reflect the policy? --Kaiting.
Verbosity ftw! :) But... I don't get a vote, do I :/ -- cantabile "Jayne is a girl's name." -- River
participants (19)
- Allan McRae
- cantabile
- Cédric Girard
- Denis A. Altoé Falqueto
- Dieter Plaetinck
- Elena ``of Valhalla''
- Joao Cordeiro
- Kaiting Chen
- Magnus Therning
- Ng Oon-Ee
- Pierre Chapuis
- Pierre Schmitz
- Ray Rashif
- Seblu
- Stéphane Gaudreault
- Thomas Bächler
- Thomas Dziedzic
- Thomas S Hatch
- Xyne