[pacman-dev] pacman-4.2 plans?
The 4.2 roadmap is empty! https://wiki.archlinux.org/index.php?title=DeveloperWiki:Pacman_Roadmap

I thought it would be good to discuss what people are planning. So far, things I know are being or will be worked on:

- Removal of support for PKGBUILDs with only a build() function
- Removal of directory symlink support

Other things that have starts of patchsets available:

- Parallel operations (I think I will take this...)
- Optdepends handling

Things the world would love to see but need substantial planning:

- Hooks

Any other ideas that people are working on or plan to work on?

Allan
Allan McRae <allan@archlinux.org> on Fri, 2013/04/12 16:11:
[...] Things the world would love to see but need substantial planning:
- Hooks
Any other ideas that people are working on or plan to?
I would like to see something like etckeeper integrated in pacman. Or is that something hooks are planned for?
Hi

On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
> The 4.2 roadmap is empty!
> https://wiki.archlinux.org/index.php?title=DeveloperWiki:Pacman_Roadmap
>
> I thought it would be good to discuss what people are planning. So far
> things I know are being or will be worked on:
>
> - Removal of support for PKGBUILDs with only a build() function
> - Removal of directory symlink support
>
> Other things that have starts of patchsets available:
>
> - Parallel operations (I think I will take this...)

Are you talking about parallelizing CPU-intensive operations like integrity checking, or do you mean parallel in a broad sense? Things I keep in mind are:
- download the package index (list) with multiple threads
- download packages in multiple threads
- install packages in parallel
- parallelize package download and installation; pacman does not need to wait for all package downloads, does it? Downloading is a network-bound task and installation is disk-bound, so doing them at the same time should reduce total installation time.
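For illustration only, here is a minimal sketch of the parallel-download idea using libcurl's multi interface. This is not pacman's downloader, and the mirror URLs are placeholders; it only shows several transfers being driven concurrently from one loop.

    /* Sketch: fetch a few package URLs concurrently with libcurl's multi
     * interface.  Not pacman's downloader -- only an illustration of
     * overlapping network transfers.  The URLs are placeholders. */
    #include <curl/curl.h>
    #include <stdio.h>

    int main(void)
    {
        const char *urls[] = {
            "https://example.org/core/os/x86_64/pkg-a.pkg.tar.xz",  /* placeholder */
            "https://example.org/extra/os/x86_64/pkg-b.pkg.tar.xz", /* placeholder */
        };
        enum { N = sizeof(urls) / sizeof(urls[0]) };
        CURL *easy[N];

        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURLM *multi = curl_multi_init();

        for (int i = 0; i < N; i++) {
            easy[i] = curl_easy_init();
            curl_easy_setopt(easy[i], CURLOPT_URL, urls[i]);
            /* real code would set CURLOPT_WRITEFUNCTION/WRITEDATA to save to disk */
            curl_multi_add_handle(multi, easy[i]);
        }

        /* drive all transfers until every one has finished */
        int running = 0;
        do {
            curl_multi_perform(multi, &running);
            if (running)
                curl_multi_wait(multi, NULL, 0, 1000, NULL);
        } while (running);

        for (int i = 0; i < N; i++) {
            curl_multi_remove_handle(multi, easy[i]);
            curl_easy_cleanup(easy[i]);
        }
        curl_multi_cleanup(multi);
        curl_global_cleanup();
        return 0;
    }

The same loop structure would also let finished downloads be handed to an installer early, which is the overlap Anatol mentions, though as discussed below that runs into conflict checking.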
On 13/04/13 11:40, Anatol Pomozov wrote:
Hi
On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
The 4.2 roadmap is empty! https://wiki.archlinux.org/index.php?title=DeveloperWiki:Pacman_Roadmap
I thought it would be good to discuss what people are planning. So far things I know are being or will be worked on:
- Removal of support for PKGBUILDs with only a build() function
- Removal of directory symlink support
Other things that have starts of patchsets available:
- Parallel operations (I think I will take this...)

Are you talking about parallelizing CPU-intensive operations like integrity checking, or do you mean parallel in a broad sense?
I am only considering integrity checking and conflict checking.
Things I keep in mind are:
- download the package index (list) with multiple threads
- download packages in multiple threads
Downloading could be done parallel, but that is a separate issue.
- install packages in parallel
- parallelize package download and installation; pacman does not need to wait for all package downloads, does it? Downloading is a network-bound task and installation is disk-bound, so doing them at the same time should reduce total installation time.
We need all packages downloaded to perform conflict checking. Installing in parallel would be difficult as we need to maintain dependency ordering.

Allan
Allan McRae wrote:
We need all packages downloaded to perform conflict checking. Installing in parallel would be difficult as we need to maintain dependency ordering.
Aside from that, wouldn't parallel disk IO be slower due to the write head having to jump back and forth more often? What do you plan to parallelize? Compression and decompression of DBs and packages? Dependency, provider and other metadata resolution?
On 14/04/13 07:49, Xyne wrote:
Allan McRae wrote:
We need all packages downloaded to perform conflict checking. Installing in parallel would be difficult as we need to maintain dependency ordering.
Aside from that, wouldn't parallel disk IO be slower due to the write head having to jump back and forth more often?
What do you plan to parallelize? Compression and decompression of DBs and packages? Dependency, provider and other metadata resolution?
On 13/04/13 12:16, Allan McRae wrote:
I am only considering integrity checking and conflict checking.
Hi

On Sat, Apr 13, 2013 at 2:49 PM, Xyne <xyne@archlinux.ca> wrote:
Allan McRae wrote:
We need all packages downloaded to perform conflict checking. Installing in parallel would be difficult as we need to maintain dependency ordering.
Aside from that, wouldn't parallel disk IO be slower due to the write head having to jump back and forth more often?
No, it should not be. The kernel IO scheduler takes care of optimizing the order of block requests, e.g. it merges requests that access adjacent disk sectors and optimizes the head path. Modern disks also have NCQ, which optimizes head movement when the request queue is long. An SSD allows one IO per bank (if I recall correctly), so having multiple outstanding requests per drive (but to different banks) hides IO latency. The rule of thumb is: the longer the request queue, the better for throughput but the worse for latency. Plus, package installation includes other activity (e.g. unpacking the archive), and intermixing CPU-bound and IO-bound workloads is better for overall throughput. But only real-world testing on rotational disks and SSDs can tell us the real gain.
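As a rough illustration of that last point (and not pacman's actual integrity-check code), the sketch below verifies several package files in worker threads so that one thread's disk reads can overlap another thread's hashing. The additive checksum is only a stand-in for real MD5/SHA-256 verification, and the file names are placeholders.

    /* Sketch: check several package files in parallel so IO and CPU work
     * from different threads overlap.  The additive checksum is a
     * stand-in for real MD5/SHA-256 verification; file names are fake. */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>

    static void *verify(void *arg)
    {
        const char *path = arg;
        FILE *f = fopen(path, "rb");
        if (!f) {
            fprintf(stderr, "cannot open %s\n", path);
            return NULL;
        }
        uint64_t sum = 0;
        unsigned char buf[64 * 1024];
        size_t n;
        while ((n = fread(buf, 1, sizeof(buf), f)) > 0)   /* IO-bound part */
            for (size_t i = 0; i < n; i++)                /* CPU-bound part */
                sum += buf[i];
        fclose(f);
        printf("%s: checksum %llu\n", path, (unsigned long long)sum);
        return NULL;
    }

    int main(void)
    {
        const char *pkgs[] = { "pkg-a.pkg.tar.xz", "pkg-b.pkg.tar.xz", "pkg-c.pkg.tar.xz" };
        enum { N = sizeof(pkgs) / sizeof(pkgs[0]) };
        pthread_t tid[N];

        for (int i = 0; i < N; i++)
            pthread_create(&tid[i], NULL, verify, (void *)pkgs[i]);
        for (int i = 0; i < N; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }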
How would pacman's output handle simultaneous downloading and installing without making things really cluttered and difficult to read?
With multiple processes, the only way to make the output readable is to buffer each job's output and write it to the console only after the job (package installation) is finished. Writes to the console should be serialized by a mutex. For example, the "tup" build tool [1] does this; try compiling a project with "tup upd -j40" and you'll see that its output is *much* better than that of "make -j40".
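A rough sketch of that scheme (not pacman code): each worker accumulates its messages in a private buffer and a single mutex serializes the final write, so lines from different jobs never interleave. The job output and package names below are placeholders.

    /* Sketch of per-job buffered output: print nothing until the job is
     * done, then flush the whole buffer under a mutex.  Not pacman code;
     * the messages and package names are placeholders. */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *install_job(void *arg)
    {
        const char *pkgname = arg;
        char buf[1024];
        int len = 0;

        /* accumulate output instead of printing immediately */
        len += snprintf(buf + len, sizeof(buf) - len, "(%s) extracting files...\n", pkgname);
        len += snprintf(buf + len, sizeof(buf) - len, "(%s) done\n", pkgname);

        /* flush the whole buffer atomically once the job has finished */
        pthread_mutex_lock(&console_lock);
        fputs(buf, stdout);
        pthread_mutex_unlock(&console_lock);
        return NULL;
    }

    int main(void)
    {
        const char *pkgs[] = { "foo", "bar", "baz" };  /* placeholder names */
        pthread_t tid[3];

        for (int i = 0; i < 3; i++)
            pthread_create(&tid[i], NULL, install_job, (void *)pkgs[i]);
        for (int i = 0; i < 3; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }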
Installing in parallel would be difficult as we need to maintain dependency ordering.
But packages have this information, and it is possible to build a dependency graph, right?
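To make that concrete, here is a minimal sketch (not libalpm code) of deriving an install order from per-package dependency lists with a topological sort; the pkg struct and the example packages are invented for illustration, and the loop assumes the graph is acyclic.

    /* Sketch: derive an install order from per-package dependency lists
     * (Kahn-style topological sort).  The pkg struct and the data are
     * invented for illustration; this is not a libalpm type.  Assumes
     * the dependency graph has no cycles. */
    #include <stdio.h>

    #define NPKG 4

    struct pkg {
        const char *name;
        int deps[NPKG];   /* indices into the pkgs array */
        int ndeps;
    };

    int main(void)
    {
        /* example: bash needs ncurses and glibc, ncurses and sed need glibc */
        struct pkg pkgs[NPKG] = {
            { "glibc",   {0},    0 },
            { "ncurses", {0},    1 },
            { "bash",    {1, 0}, 2 },
            { "sed",     {0},    1 },
        };

        int unmet[NPKG], installed[NPKG] = {0};
        for (int i = 0; i < NPKG; i++)
            unmet[i] = pkgs[i].ndeps;

        for (int done = 0; done < NPKG; ) {
            for (int i = 0; i < NPKG; i++) {
                if (installed[i] || unmet[i] != 0)
                    continue;
                printf("install %s\n", pkgs[i].name);   /* all deps satisfied */
                installed[i] = 1;
                done++;
                /* every package depending on i now has one fewer unmet dep */
                for (int j = 0; j < NPKG; j++)
                    for (int d = 0; d < pkgs[j].ndeps; d++)
                        if (pkgs[j].deps[d] == i)
                            unmet[j]--;
            }
        }
        return 0;
    }

Packages whose unmet count reaches zero at the same time have no ordering constraint between them, which is exactly where a parallel installer could work on several transactions side by side.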
What happens if the network connection is dropped while packages are still downloading after package installation has started?
In this case, only those packages that are already downloaded will be installed. And because installation respects dependency order, a partial installation will not brick the system, no?
We need all packages downloaded to perform conflict checking. :(
On 14/04/13 09:52, Anatol Pomozov wrote:
What happens if the network connection is dropped while packages are still downloading after package installation has started?

In this case, only those packages that are already downloaded will be installed. And because installation respects dependency order, a partial installation will not brick the system, no?
Sure it can... e.g. an soname bump in ncurses. ncurses gets updated but nothing else, bash is dead.

Allan
Allan McRae wrote:
On 14/04/13 09:52, Anatol Pomozov wrote:
What happens if the network connection is dropped while packages are still downloading after package installation has started?

In this case, only those packages that are already downloaded will be installed. And because installation respects dependency order, a partial installation will not brick the system, no?
Sure it can... e.g. an soname bump in ncurses. ncurses gets updated but nothing else, bash is dead.
Allan
You could check for isolated subgraphs of the dependency graph in the downloaded package set and install those together while other packages are still being downloaded. If you don't want to parallelize downloads yet, then you can order the download queue by such subsets.

Regards,
Xyne
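A small sketch of that grouping idea (not pacman code): a union-find over the dependency edges within the downloaded set yields the isolated subgraphs, and any group whose members are all on disk could be installed while the rest is still downloading. The package names and edges below are made up.

    /* Sketch: group downloaded packages into independent dependency
     * subgraphs with a union-find.  Names and edges are invented; this
     * is not pacman/libalpm code. */
    #include <stdio.h>

    #define NPKG 5

    static int parent[NPKG];

    static int find(int x)
    {
        while (parent[x] != x)
            x = parent[x] = parent[parent[x]];   /* path halving */
        return x;
    }

    static void join(int a, int b)
    {
        parent[find(a)] = find(b);
    }

    int main(void)
    {
        const char *names[NPKG] = { "ncurses", "bash", "readline", "tzdata", "less" };

        for (int i = 0; i < NPKG; i++)
            parent[i] = i;

        /* dependency edges among the downloaded set (invented) */
        join(1, 0);   /* bash -> ncurses */
        join(1, 2);   /* bash -> readline */
        join(4, 0);   /* less -> ncurses */
        /* tzdata has no edges, so it forms its own isolated subgraph */

        for (int i = 0; i < NPKG; i++)
            printf("%s belongs to the group of %s\n", names[i], names[find(i)]);
        return 0;
    }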
On 15/04/13 02:08, Xyne wrote:
Allan McRae wrote:
On 14/04/13 09:52, Anatol Pomozov wrote:
What happens if the network connection is dropped while packages are still downloading after package installation has started?

In this case, only those packages that are already downloaded will be installed. And because installation respects dependency order, a partial installation will not brick the system, no?
Sure it can... e.g. an soname bump in ncurses. ncurses gets updated but nothing else, bash is dead.
Allan
You could check for isolated subgraphs of the dependency graph in the downloaded package set and install those together while other packages are still being downloaded. If you don't want to parallelize downloads yet, then you can order the download queue by such subsets.
Sure, we could. But I have no intention of ever doing that because I do not think the complexity is worth it.

Allan
On Sun, Apr 14, 2013 at 09:58:24AM +1000, Allan McRae wrote:
On 14/04/13 09:52, Anatol Pomozov wrote:
What happens if the network connection is dropped while packages are still downloading after package installation has started?

In this case, only those packages that are already downloaded will be installed. And because installation respects dependency order, a partial installation will not brick the system, no?
Sure it can... e.g. an soname bump in ncurses. ncurses gets updated but nothing else, bash is dead.
Allan
On that note: directing more effort toward getting already existing features more widely used (namely, sodepends and debug packages) might be more useful than implementing new features/optimizations, IMHO. But that's an Arch discussion, not a pacman one! Having said that, local queries are the only operations that feel slow to me, so finally making a decision about how to fix this would be great.
On Fri, Apr 12, 2013 at 9:40 PM, Anatol Pomozov <anatol.pomozov@gmail.com> wrote:
> Hi
>
> On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
>> The 4.2 roadmap is empty!
>> https://wiki.archlinux.org/index.php?title=DeveloperWiki:Pacman_Roadmap
>>
>> I thought it would be good to discuss what people are planning. So far
>> things I know are being or will be worked on:
>>
>> - Removal of support for PKGBUILDs with only a build() function
>> - Removal of directory symlink support
>>
>> Other things that have starts of patchsets available:
>>
>> - Parallel operations (I think I will take this...)
>
> Are you talking about parallelizing CPU-intensive operations like
> integrity checking, or do you mean parallel in a broad sense? Things I
> keep in mind are:
> - download the package index (list) with multiple threads
> - download packages in multiple threads
> - install packages in parallel
> - parallelize package download and installation; pacman does not need
>   to wait for all package downloads, does it? Downloading is a
>   network-bound task and installation is disk-bound, so doing them at
>   the same time should reduce total installation time.

I think the download and installation steps should be separate from each other. What happens if the network connection is dropped while packages are still downloading after package installation has started? How would pacman's output handle simultaneous downloading and installing without making things really cluttered and difficult to read?

Jason
On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
- Removal of support for PKGBUILDs with only a build() function
Any chance of delaying that change further? There's a staggeringly large number of packages in the AUR that rely on build() having access to $pkgdir, including some of the most popular packages. From a quick look, over 10 thousand packages would break.
On Wed, Apr 17, 2013 at 07:58:48PM -0700, Aaron DeVore wrote:
On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
- Removal of support for PKGBUILDs with only a build() function
Any chance of delaying that change further? There's a staggeringly large number of packages in the AUR that rely on build() having access to $pkgdir, including some of the most popular packages. From a quick look, over 10 thousand packages would break.
I don't see why they should be exempt at all. Every package build should have a package() function. They're making packages, I think, not just building things.

Thanks,
--
William Giokas | KaiSforza
GnuPG Key: 0x73CD09CF
Fingerprint: F73F 50EF BBE2 9846 8306 E6B8 6902 06D8 73CD 09CF
That was a vague estimate, so I may be off. The problem remains, though.

-Aaron DeVore

On Wed, Apr 17, 2013 at 7:58 PM, Aaron DeVore <aaron.devore@gmail.com> wrote:
Any chance of delaying that change further? There's a staggeringly large number of packages in the AUR that rely on build() having access to $pkgdir, including some of the most popular packages. From a quick look, over 10 thousand packages would break.
On 18/04/13 12:58, Aaron DeVore wrote:
On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
- Removal of support for PKGBUILDs with only a build() function
Any chance of delaying that change further? There's a staggeringly large number of packages in the AUR that rely on build() having access to $pkgdir, including some of the most popular packages. From a quick look, over 10 thousand packages would break.
Nope. PKGBUILDs should be updated...

Allan
Perhaps do a warning for 4.2 and a full removal in 4.3? That's a lot of packages to break/update, especially on a minor version.

-Aaron DeVore

On Wed, Apr 17, 2013 at 8:13 PM, Allan McRae <allan@archlinux.org> wrote:
On 18/04/13 12:58, Aaron DeVore wrote:
On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
- Removal of support for PKGBUILDs with only a build() function
Any chance of delaying that change further? There's a staggeringly large number of packages in the AUR that rely on build() having access to $pkgdir, including some of the most popular packages. From a quick look, over 10 thousand packages would break.
Nope. PKGBUILDs should be updated...
Allan
On 18/04/13 14:45, Aaron DeVore wrote:
Perhaps do a warning for 4.2 and a full removal in 4.3? That's a lot of packages to break/update, especially on a minor version.
You mean like the warning that is printed in makepkg from 4.1?
That's embarrassing. I always use namcap, but I neglected to read the warning at the beginning of makepkg's output. Since pacman 4.2 is apparently set to include the removal of build()-only packages, perhaps make some more visible announcements ASAP? Looking at the AUR feed, there are updated packages going out that still reference $pkgdir in build().

-Aaron DeVore

On Wed, Apr 17, 2013 at 9:46 PM, Allan McRae <allan@archlinux.org> wrote:
On 18/04/13 14:45, Aaron DeVore wrote:
Perhaps do a warning for 4.2 and a full removal in 4.3? That's a lot of packages to break/update, especially on a minor version.
You mean like the warning that is printed in makepkg from 4.1?
On Wed, Apr 17, 2013 at 09:45:12PM -0700, Aaron DeVore wrote:
Perhaps do a warning for 4.2 and a full removal in 4.3? That's a lot of packages to break/update, especially on a minor version.
I believe there is already a warning in 4.1.

--
William Giokas | KaiSforza
GnuPG Key: 0x73CD09CF
Fingerprint: F73F 50EF BBE2 9846 8306 E6B8 6902 06D8 73CD 09CF
On 17/04/13 07:58 PM, Aaron DeVore wrote:
On Thu, Apr 11, 2013 at 11:11 PM, Allan McRae <allan@archlinux.org> wrote:
- Removal of support for PKGBUILDs with only a build() function
Any chance of delaying that change further? There's a staggeringly large number of packages in the AUR that rely on build() having access to $pkgdir, including some of the most popular packages. From a quick look, over 10 thousand packages would break.
Don't you mean over 9 thousand? :P I will get around to fixing my PKGBUILDs this week but I think this is daft. If someone wants to make an improper PKGBUILD now that does everything in one function, isn't it just as easy? All that changes is that this improper function now needs to be called package() instead of build().
participants (9):
- Aaron DeVore
- Allan McRae
- Anatol Pomozov
- Christian Hesse
- Connor Behan
- Jason St. John
- Mohammad_Alsaleh
- William Giokas
- Xyne