[pacman-dev] [PATCH] RFC: Remove "duplicate" download error messages
This is a very incomplete and hacky patch... But it does do this:

allan@mando ~/arch/code/pacman (patchqueue)
pacman -Syy
:: Synchronizing package databases...
 core                 131.1 KiB  47.0 KiB/s 00:03 [######################] 100%
 extra               1645.7 KiB   490 KiB/s 00:03 [######################] 100%
 community               5.4 MiB   161 KiB/s 00:34 [######################] 100%
error: failed retrieving file 'extra.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'community.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404

allan@mando ~/arch/code/pacman (patchqueue)
sudo ./build/pacman -Syy
[sudo] password for allan:
:: Synchronizing package databases...
 core                 131.1 KiB  51.2 KiB/s 00:03 [######################] 100%
 extra               1645.7 KiB   494 KiB/s 00:03 [######################] 100%
 community               5.4 MiB   172 KiB/s 00:32 [######################] 100%
error: failed retrieving file(s) from allanmcrae.com : The requested URL returned error: 404
There are a few things to note:

1) We lose the file names for the failed downloads, and we lose
   information about how many downloads failed.
2) I only changed the messages for download failures so far. So (e.g.)
   having a file exceed its expected download size will still report the
   file name.
3) Debug output still gives file names.
4) Note the "error" in the messages above, yet pacman was still
   successful as I have an actual mirror configured. I guess this (and
   others?) should be changed to a warning. We get an overall error
   message if there is complete failure anyway.
5) I have been super lazy for this PoC, switching
   alpm_list_remove_dupes() to use string comparisons rather than
   pointer comparisons. This will need to be fixed.

Things to discuss:

- Do we need to report the number of failures from a server? Currently
  1 and 200 failures are treated the same. This involves doing more than
  a dedup of the error list.
- Converting errors to warnings. They are clearly not errors until they
  are... Is the final line enough:
  error: failed to synchronize all databases (failed to retrieve some files)

---
 lib/libalpm/alpm_list.c |  2 +-
 lib/libalpm/dload.c     | 12 ++++++------
 src/pacman/callback.c   |  7 +++++--
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/lib/libalpm/alpm_list.c b/lib/libalpm/alpm_list.c
index 5a684181..bc243898 100644
--- a/lib/libalpm/alpm_list.c
+++ b/lib/libalpm/alpm_list.c
@@ -338,7 +338,7 @@ alpm_list_t SYMEXPORT *alpm_list_remove_dupes(const alpm_list_t *list)
 	const alpm_list_t *lp = list;
 	alpm_list_t *newlist = NULL;
 	while(lp) {
-		if(!alpm_list_find_ptr(newlist, lp->data)) {
+		if(!alpm_list_find_str(newlist, lp->data)) {
 			if(alpm_list_append(&newlist, lp->data) == NULL) {
 				alpm_list_free(newlist);
 				return NULL;
diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c
index 0d0936c7..702bd059 100644
--- a/lib/libalpm/dload.c
+++ b/lib/libalpm/dload.c
@@ -421,8 +421,8 @@ static int curl_check_finished_download(CURLM *curlm, CURLMsg *msg,
 			snprintf(payload->error_buffer, sizeof(payload->error_buffer),
 					"The requested URL returned error: %ld", payload->respcode);
 			_alpm_log(handle, ALPM_LOG_ERROR,
-					_("failed retrieving file '%s' from %s : %s\n"),
-					payload->remote_name, hostname, payload->error_buffer);
+					_("failed retrieving file(s) from %s : %s\n"),
+					hostname, payload->error_buffer);
 		}
 		if(curl_retry_next_server(curlm, curl, payload) == 0) {
 			return 2;
@@ -446,8 +446,8 @@ static int curl_check_finished_download(CURLM *curlm, CURLMsg *msg,
 			payload->unlink_on_fail = 1;
 			handle->pm_errno = ALPM_ERR_SERVER_BAD_URL;
 			_alpm_log(handle, ALPM_LOG_ERROR,
-					_("failed retrieving file '%s' from %s : %s\n"),
-					payload->remote_name, hostname, payload->error_buffer);
+					_("failed retrieving file(s) from %s : %s\n"),
+					hostname, payload->error_buffer);
 			if(curl_retry_next_server(curlm, curl, payload) == 0) {
 				return 2;
 			} else {
@@ -461,8 +461,8 @@ static int curl_check_finished_download(CURLM *curlm, CURLMsg *msg,
 		if(!payload->errors_ok) {
 			handle->pm_errno = ALPM_ERR_LIBCURL;
 			_alpm_log(handle, ALPM_LOG_ERROR,
-					_("failed retrieving file '%s' from %s : %s\n"),
-					payload->remote_name, hostname, payload->error_buffer);
+					_("failed retrieving file(s) from %s : %s\n"),
+					hostname, payload->error_buffer);
 		} else {
 			_alpm_log(handle, ALPM_LOG_DEBUG,
 					"failed retrieving file '%s' from %s : %s\n",
diff --git a/src/pacman/callback.c b/src/pacman/callback.c
index 40973340..d07df816 100644
--- a/src/pacman/callback.c
+++ b/src/pacman/callback.c
@@ -212,9 +212,12 @@ static void fill_progress(const int bar_percent, const int disp_percent,
 }
 
 static void flush_output_list(void) {
-	alpm_list_t *i = NULL;
+	alpm_list_t *i = NULL, *rmdup;
 	fflush(stdout);
-	for(i = output; i; i = i->next) {
+
+	rmdup = alpm_list_remove_dupes(output);
+
+	for(i = rmdup; i; i = i->next) {
 		fputs((const char *)i->data, stderr);
 	}
 	fflush(stderr);
--
2.30.1
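(As a possible alternative to note 5 above: a minimal sketch of doing the dedup purely in the front-end, leaving alpm_list_remove_dupes() pointer-based. It assumes the callback.c context of the hunk above, i.e. `output` is an alpm_list_t of message strings, and uses only existing alpm_list API: alpm_list_find_str(), alpm_list_append(), alpm_list_free().)

static void flush_output_list(void)
{
	alpm_list_t *i, *seen = NULL;
	fflush(stdout);
	for(i = output; i; i = i->next) {
		/* alpm_list_find_str() compares with strcmp(), so each distinct
		 * message is printed only once; the library's pointer-based
		 * alpm_list_remove_dupes() stays untouched */
		if(!alpm_list_find_str(seen, i->data)) {
			fputs((const char *)i->data, stderr);
			/* remember the message; if this allocation fails we may
			 * print a duplicate later, which is harmless */
			alpm_list_append(&seen, i->data);
		}
	}
	fflush(stderr);
	alpm_list_free(seen); /* frees only the list cells, not the strings */
	/* ... remainder of the function unchanged ... */
}

The trade-off is an O(n^2) scan in the front-end, but the output list is short and libalpm's public API is left alone.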
On 03/08/21 at 02:12pm, Allan McRae wrote:
This is a very incomplete and hacky patch... But it does do this:
allan@mando ~/arch/code/pacman (patchqueue)
pacman -Syy
:: Synchronizing package databases...
 core                 131.1 KiB  47.0 KiB/s 00:03 [######################] 100%
 extra               1645.7 KiB   490 KiB/s 00:03 [######################] 100%
 community               5.4 MiB   161 KiB/s 00:34 [######################] 100%
error: failed retrieving file 'extra.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'community.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
allan@mando ~/arch/code/pacman (patchqueue)
sudo ./build/pacman -Syy
[sudo] password for allan:
:: Synchronizing package databases...
 core                 131.1 KiB  51.2 KiB/s 00:03 [######################] 100%
 extra               1645.7 KiB   494 KiB/s 00:03 [######################] 100%
 community               5.4 MiB   172 KiB/s 00:32 [######################] 100%
error: failed retrieving file(s) from allanmcrae.com : The requested URL returned error: 404
There are a few things to note:

1) We lose the file names for the failed downloads, and we lose
   information about how many downloads failed.
2) I only changed the messages for download failures so far. So (e.g.)
   having a file exceed its expected download size will still report the
   file name.
3) Debug output still gives file names.
4) Note the "error" in the messages above, yet pacman was still
   successful as I have an actual mirror configured. I guess this (and
   others?) should be changed to a warning. We get an overall error
   message if there is complete failure anyway.
5) I have been super lazy for this PoC, switching
   alpm_list_remove_dupes() to use string comparisons rather than
   pointer comparisons. This will need to be fixed.

Things to discuss:

- Do we need to report the number of failures from a server? Currently
  1 and 200 failures are treated the same. This involves doing more than
  a dedup of the error list.
- Converting errors to warnings. They are clearly not errors until they
  are... Is the final line enough:
  error: failed to synchronize all databases (failed to retrieve some files)
My main problem with this approach is that alpm is now spewing a bunch of identical errors to front-ends, so unless a front-end dedups them the same way, we're actually making things worse. We could dedup in alpm, but then I don't think there's really an obvious place to actually send the message. Sending it at the end makes it impossible for the user to infer that they might have a bad mirror and should cancel the download and switch. Sending it on the first occurrence does the same because they don't know if it's just one file or a recurring problem.

What about actually removing a bad mirror instead of deduplicating the resulting messages? Maybe start using the next mirror for subsequent payloads after n failures or immediately for things like an unresolvable host?

If we do make the errors more generic, I think turning them into warnings and relying on the final "failed to retrieve some files" to indicate actual failure is fine.
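(A rough, hypothetical sketch of that per-server bookkeeping. None of these names exist in dload.c; SERVER_ERROR_LIMIT, server_error_t and the helpers are made up for illustration, and only alpm_list_t/alpm_list_append() are real libalpm API.)

#include <stdlib.h>
#include <string.h>
#include "alpm_list.h"

#define SERVER_ERROR_LIMIT 3  /* arbitrary threshold for this sketch */

typedef struct server_error_t {
	char *server;  /* key: server URL or hostname */
	int errors;    /* failures seen so far for this server */
} server_error_t;

/* find the counter for a server, creating it on first failure */
static server_error_t *server_error_get(alpm_list_t **errors, const char *server)
{
	alpm_list_t *i;
	server_error_t *e;
	for(i = *errors; i; i = i->next) {
		e = i->data;
		if(strcmp(e->server, server) == 0) {
			return e;
		}
	}
	if((e = calloc(1, sizeof(*e))) == NULL) {
		return NULL;
	}
	if((e->server = strdup(server)) == NULL) {
		free(e);
		return NULL;
	}
	alpm_list_append(errors, e);
	return e;
}

/* record a failed download from `server`; returns 1 once the server
 * should be skipped for all subsequent payloads */
static int server_register_failure(alpm_list_t **errors, const char *server)
{
	server_error_t *e = server_error_get(errors, server);
	return e != NULL && ++e->errors >= SERVER_ERROR_LIMIT;
}

An unresolvable host could bump the count straight to the limit so that server is dropped immediately rather than after n failures.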
On 03/14/21 at 09:40pm, Andrew Gregory wrote: ...
What about actually removing a bad mirror instead of deduplicating the resulting messages? Maybe start using the next mirror for subsequent payloads after n failures or immediately for things like an unresolvable host?
Took a quick stab at a first draft of this on my dload-server-skip branch. With the following config:

[core]
Server = http://allanmcrae1.com/
Server = http://allanmcrae1.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Include = /etc/pacman.d/mirrorlist

A failure limit of 3 gives the following output:

:: Synchronizing package databases...
 core                 129.9 KiB   135 KiB/s 00:01 [###################################] 100%
error: failed retrieving file 'core.db' from allanmcrae1.com : Could not resolve host: allanmcrae1.com
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404

It keys the error count off of the full server URL; we might want to switch to just the hostname to avoid repeating errors across db-specific servers. It also needs a warning/info message when it starts ignoring a server.

apg
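(For that missing warning, something along these lines in the download error path might do. This is a hypothetical snippet, not taken from the dload-server-skip branch: server_register_failure() and handle->server_errors are the made-up names from the earlier sketch, while _alpm_log(), ALPM_LOG_WARNING and _() are existing libalpm interfaces.)

	/* hypothetical: handle->server_errors would be a per-handle list of
	 * per-server failure counters; once the limit is crossed, warn and
	 * stop handing this server out to new payloads (downloads already in
	 * flight on it may still report their own errors) */
	if(server_register_failure(&handle->server_errors, hostname)) {
		_alpm_log(handle, ALPM_LOG_WARNING,
				_("too many download failures from %s, skipping this server\n"),
				hostname);
	}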
On 19/3/21 6:46 am, Andrew Gregory wrote:
On 03/14/21 at 09:40pm, Andrew Gregory wrote: ...
What about actually removing a bad mirror instead of deduplicating the resulting messages? Maybe start using the next mirror for subsequent payloads after n failures or immediately for things like an unresolvable host?
Took a quick stab at a first draft of this on my dload-server-skip branch. With the following config:
[core]
Server = http://allanmcrae1.com/
Server = http://allanmcrae1.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Include = /etc/pacman.d/mirrorlist
A failure limit of 3 gives the following output:
:: Synchronizing package databases...
 core                 129.9 KiB   135 KiB/s 00:01 [###################################] 100%
error: failed retrieving file 'core.db' from allanmcrae1.com : Could not resolve host: allanmcrae1.com
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
It keys the error count off of the full server URL; we might want to switch to just the hostname to avoid repeating errors across db-specific servers. It also needs a warning/info message when it starts ignoring a server.
I had a look at the branch. I like the approach.

I'd suggest removing the host not found and 404 errors (and others?) and just printing a warning when a server is removed from the list. I think just using the hostname rather than the full server URL would be better.

I'm going to be largely unavailable until after Easter, but will try and look at least briefly at anything submitted in that time.

Allan
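(If the count is keyed on the hostname, something like this could derive the key from a configured server URL. This is a hypothetical helper, not part of dload.c; it only relies on curl's URL API, assuming curl 7.62+ is available alongside the existing curl usage.)

#include <stdlib.h>
#include <string.h>
#include <curl/curl.h>

/* return a malloc'd hostname extracted from a server URL, or NULL */
static char *server_hostname(const char *server_url)
{
	CURLU *u = curl_url();
	char *curl_host = NULL, *host = NULL;
	if(u == NULL) {
		return NULL;
	}
	if(curl_url_set(u, CURLUPART_URL, server_url, 0) == CURLUE_OK
			&& curl_url_get(u, CURLUPART_HOST, &curl_host, 0) == CURLUE_OK) {
		host = strdup(curl_host);  /* copy so the caller can use free() */
		curl_free(curl_host);
	}
	curl_url_cleanup(u);
	return host;
}

For example, server_hostname("http://allanmcrae.com/core/os/x86_64") would return "allanmcrae.com", so all the db-specific URLs for one mirror collapse onto a single counter.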
On Fri, 19 Mar 2021 at 12:22, Allan McRae <allan@archlinux.org> wrote:
On 19/3/21 6:46 am, Andrew Gregory wrote:
On 03/14/21 at 09:40pm, Andrew Gregory wrote: ...
What about actually removing a bad mirror instead of deduplicating the resulting messages? Maybe start using the next mirror for subsequent payloads after n failures or immediately for things like an unresolvable host?
Took a quick stab at a first draft of this on my dload-server-skip branch. With the following config:
[core]
Server = http://allanmcrae1.com/
Server = http://allanmcrae1.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Server = http://allanmcrae.com/
Include = /etc/pacman.d/mirrorlist
A failure limit of 3 gives the following output:
:: Synchronizing package databases...
 core                 129.9 KiB   135 KiB/s 00:01 [###################################] 100%
error: failed retrieving file 'core.db' from allanmcrae1.com : Could not resolve host: allanmcrae1.com
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
error: failed retrieving file 'core.db' from allanmcrae.com : The requested URL returned error: 404
It keys the error count off of the full server URL; we might want to switch to just the hostname to avoid repeating errors across db-specific servers. It also needs a warning/info message when it starts ignoring a server.
I had a look at the branch. I like the approach.
I'd suggest removing the host not found and 404 errors (and others?) and just printing a warning when a server is removed from the list. I think just using the hostname rather than the full server URL would be better.
IMHO the separate error messages - host/404/other - are rather helpful, since they clearly highlight the problem. In particular, they distinguish the server being unreachable (internet link went down, broken DNS) from a file delivery issue (broken mirror, temporary problem).

-Emil