On 21/10/20 11:54 am, lesto fante wrote:
Hi, the general idea is to make it possible to have multiple XferCommand instances running in parallel. Rather than trying to keep track of multiple XferCommand processes, I thought it would be much easier to let XferCommand fork (or send a request to a daemon) and exit, and then have pacman call a final script, `XferLockCommand`, that blocks until all downloads are completed and returns the classic 0 on success and non-zero on error.
After the introduction of parallel downloads, an informal green light was given to submit a patch adding this to XferCommand, so here I am.
As I chose simplicity, there is currently no way for pacman to know how many downloads are happening in the background, their status, or which ones failed; only the final success/error result.
So you are just passing the full list of files to download to a download script, and downloads are not managed by pacman at all? Just add three more lines to your script:

pacman -Sy
pacman -Sup --noconfirm
<downloads here>
pacman -Su

I don't see the point of implementing a parallel XferCommand like that within pacman at all.
I see two major slowdowns in my downloads:
- small-file overhead
- mirror bandwidth
Currently I have a script that picks all uncommented servers in the pacman mirror list, divides them into groups of 3, and for each group downloads one package. This ensures there is only one connection per server (I assume servers do not artificially limit bandwidth, and if they do I don't want to bypass their limit) while having multiple files downloading at the same time (good for small-file overhead) and at full speed (multiple mirrors for each file).
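The grouping step described above could be sketched like this (a hypothetical fragment based only on the description; the `group_mirrors` name and the group size of 3 are assumptions, not anything pacman itself does):

```shell
# Hypothetical sketch of the grouping described above: read a pacman-style
# mirrorlist on stdin, keep only uncommented Server lines, and join every
# three mirrors into one space-separated group per output line.
group_mirrors() {
    grep '^Server' \
        | sed 's/^Server *= *//' \
        | paste -d ' ' - - -    # three mirrors per group
}

# Example: six servers become two groups of three
printf 'Server = http://m%s.example/repo\n' 1 2 3 4 5 6 | group_mirrors
```

Each output line could then be handed to one background download job, giving one connection per server while several files transfer at once.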
Also, from your presentation it seems that ParallelDownloads will hit only one server; it mentions a sync issue, but I'm not really sure what you meant there, since afaik each package is downloaded with the full version in its name.
It currently does. In the future that may change. At the moment our download error output is not great, and out-of-sync servers would result in a lot of download errors. We need to add logic to catch bad servers and exclude them from future downloads, but fixing the output needs to happen first.
So if you have an update with 150 packages, every single one starts downloading at the same time? [...] Any implementation of this needs to respect the ParallelDownloads configuration option.
In this patch it is left to the XferCommand/XferLockCommand implementation. Also, the idea is that XferLockCommand may print status information about the downloads on stdout, which is relayed back to the user (I may be wrong, but this is the current behaviour); this way the user is not left wondering what is going on.
-- Alternative 1 -- Add the maximum number of parallel downloads as an argument to XferLockCommand; if that number is reached, the command blocks. The pseudocode becomes:
for each file:
    XferCommand         # start one download
    XferLockCommand 10  # block if all 10 download slots are used
XferLockCommand 0       # special case: block until all downloads are completed
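As a rough illustration of that flow, here is a sketch using bash background jobs in place of a real XferCommand daemon. Everything here is invented for the example (`fake_download`, the slot count, the log file); it needs bash >= 4.3 for `wait -n`:

```shell
# Sketch of the Alternative 1 flow: start transfers in the background and
# block whenever all download slots are busy. fake_download stands in for
# a real XferCommand; max_slots plays the role of the XferLockCommand argument.
log=$(mktemp)
max_slots=3
fake_download() { sleep 0.1; echo "$1" >>"$log"; }

for url in pkg1 pkg2 pkg3 pkg4 pkg5 pkg6; do
    fake_download "$url" &                    # XferCommand: start one download
    while [ "$(jobs -pr | wc -l)" -ge "$max_slots" ]; do
        wait -n                               # XferLockCommand 3: block while all slots are busy
    done
done
wait                                          # XferLockCommand 0: block until every download finishes
```

A real XferLockCommand would have to implement the same blocking across separate processes, e.g. by asking the daemon how many transfers are still running.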
-- Alternative 2 -- Build an array of PIDs, each referring to one XferCommand. I am not sure how portable this would be, or whether there may be issues with PID reuse, but it would give pacman a bit more control over the running processes.
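A minimal sketch of the PID-array idea in shell terms (`fake_download` is a stand-in for XferCommand; the names are invented): each PID is collected and waited on individually, so every download's exit status can be checked. Note that wait(2)/waitpid(2) only ever reap the caller's own children, which limits the PID-reuse concern to children that have already been reaped.

```shell
# Sketch of Alternative 2: keep a list of child PIDs, one per XferCommand,
# and collect each child's exit status. fake_download is a stand-in.
fake_download() { sleep 0.1; }   # one pretend transfer per file

pids=""
for url in pkg1 pkg2 pkg3; do
    fake_download "$url" &
    pids="$pids $!"              # remember each child's PID
done

fail=0
for pid in $pids; do
    wait "$pid" || fail=1        # per-download exit status
done
echo "fail=$fail"
```

Doing the same in pacman's C code would mean an array of PIDs and a waitpid() loop instead of the single child it monitors today.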
Pacman currently monitors a single download in a portable way. I see no reason it could not monitor more than one. Then it could use ParallelDownloads and provide some consistency across download methods.
Why is this even needed? A user either has ParallelDownloads set to a value greater than 1, or does not.
As far as I understand from the code in dload.c, ParallelDownloads does not affect XferCommand: only one XferCommand instance runs at a time, and it is expected to complete before the next is started.
It does not... I'd expect it would after an addition to XferCommand to support parallel downloads.