When downloading small files the download time seems dominated by connection latency rather than bandwidth. Downloading several small files in parallel is an effective way to reduce the impact of latency. However, downloading many small files in parallel does not always saturate bandwidth, which is inefficient. This commit attempts to fully utilize parallelism to download small files while maintaining high bandwidth utilization. This is accomplished by downloading the smallest files first using all but 1 parallel connection while downloading a large file using the remaining parallel connection. The result seems to maintain a more stable download speed throughout entire transactions. This is in contrast to the usual behavior I observed when downloading many packages, where the download speed progressively declines as smaller packages are downloaded. When my entire cache is deleted and all packages are redownloaded using ` pacman -Qqn | sudo pacman -Sw -`, the mean download speed is 47.8 MiB/s. After this patch, the mean download speed is 54.0 MiB/s. In terms of time savings, this patch causes a 14.9 GiB download to go from 5 minutes 20 seconds to 4 minutes 43 seconds on my system and network. Your mileage may vary on different systems, networks, and selections of packages. I expect there to be virtually no effect on selections of packages that are all fairly large. Signed-off-by: Thomas Reichel <tom.p.reichel@gmail.com> --- lib/libalpm/dload.c | 39 +++++++++++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c index 7c344760..4d8777f9 100644 --- a/lib/libalpm/dload.c +++ b/lib/libalpm/dload.c @@ -884,7 +884,7 @@ cleanup: } /* - * Use to sort payloads by max size in decending order (largest -> smallest) + * Use to sort payloads by max size in ascending order (smallest -> largest) */ static int compare_dload_payload_sizes(const void *left_ptr, const void *right_ptr) { @@ -893,7 +893,7 @@ static int compare_dload_payload_sizes(const void *left_ptr, const void *right_p left = (struct dload_payload *) left_ptr; right = (struct dload_payload *) right_ptr; - return right->max_size - left->max_size; + return left->max_size - right->max_size; } /* Returns -1 if an error happened for a required file @@ -910,18 +910,39 @@ static int curl_download_internal(alpm_handle_t *handle, int updated = 0; /* was a file actually updated */ CURLM *curlm = handle->curlm; size_t payloads_size = alpm_list_count(payloads); + /* download a single large file along with smaller files */ + struct dload_payload *large_file = NULL; + /* a reverse iterator for payloads */ + alpm_list_t* payloads_reverse; /* Sort payloads by package size */ payloads = alpm_list_msort(payloads, payloads_size, &compare_dload_payload_sizes); + payloads_reverse = alpm_list_last(payloads); + while(active_downloads_num > 0 || payloads) { CURLMcode mc; for(; active_downloads_num < max_streams && payloads; active_downloads_num++) { - struct dload_payload *payload = payloads->data; + struct dload_payload *payload; + if (large_file == NULL) { + payload = payloads_reverse->data; + } else { + payload = payloads->data; + } if(curl_add_payload(handle, curlm, payload, localpath) == 0) { - payloads = payloads->next; + if (payloads_reverse == payloads) { + /* pointers met in the middle, we finished by advancing either */ + payloads = NULL; + break; + } + if (large_file == NULL) { + large_file = payloads_reverse->data; + payloads_reverse = payloads_reverse->prev; + } else { + payloads = payloads->next; + } } else { /* The payload failed to start. Do not start any new downloads. * Wait until all active downloads complete. @@ -959,6 +980,16 @@ static int curl_download_internal(alpm_handle_t *handle, payloads = NULL; err = -1; } else if(ret == 0) { + struct dload_payload *finished_payload; + CURLcode curlerr; + /* which file downloaded? */ + curlerr = curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE, &finished_payload); + ASSERT(curlerr == CURLE_OK, RET_ERR(handle, ALPM_ERR_LIBCURL, -1)); + /* did we just finish the large file that was downloading? */ + if (large_file != NULL && finished_payload == large_file) { + large_file = NULL; + } + updated = 1; } } else { -- 2.43.1