[pacman-dev] [PATCH 0/1] libalpm: fix resuming after HTTP error
Hi, there. When downloading packages, the libalpm may save a body of a unsuccessful HTTP response (e.g. 404). The saved part will then be treated as a partially download package, from where the download will be resumed, resulting in a corrupted package. Steps to reproduce: 1. Prepend a invalid mirror to mirrorlist, say http://example.com/$repo/os/$arch 2. Pick any package, say inkscape 3. (Re-)Install the chosen package with `pacman -S` 4. Pacman gets a HTTP 404 error when trying the newly added "mirror", saving its body as the partially downloaded package 5. Pacman retries the subsequent real mirrors, resuming its download 6. At last, pacman complains with "File ... corrupted (invalid or corrupted package (PGP signature))" This patch fixs the problem by letting curl fails on HTTP error, preventing it from accepting such an invalid HTTP body. Regards, Hung-I Wang --- Hung-I Wang (1): libalpm: fix resuming after HTTP error lib/libalpm/dload.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) -- 2.31.1
`curl_retry_next_server` tries to resume a download whenever possible (introduced in 8bf17b2) even if its preceding HTTP request returns an error (with status code >= 400, e.g. 404 when a mirror is only partially synced). It may result in a corrupted package if the preceding HTTP response with error carries a non-empty body, which is then written to a tempfile as if it is part of the package binary. By activating `CURLOPT_FAILONERROR`, the tempfile won't be written unless the HTTP response indicates a successful status. Signed-off-by: Hung-I Wang <whygowe@gmail.com> --- lib/libalpm/dload.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c index 2d8b4d6d..faa3d9f2 100644 --- a/lib/libalpm/dload.c +++ b/lib/libalpm/dload.c @@ -338,6 +338,7 @@ static void curl_set_handle_opts(CURL *curl, struct dload_payload *payload) curl_easy_setopt(curl, CURLOPT_TCP_KEEPINTVL, 60L); curl_easy_setopt(curl, CURLOPT_HTTPAUTH, CURLAUTH_ANY); curl_easy_setopt(curl, CURLOPT_PRIVATE, (void *)payload); + curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1); _alpm_log(handle, ALPM_LOG_DEBUG, "%s: url is %s\n", payload->remote_name, payload->fileurl); @@ -427,7 +428,8 @@ static int curl_retry_next_server(CURLM *curlm, CURL *curl, struct dload_payload len = strlen(server) + strlen(payload->filepath) + 2; MALLOC(payload->fileurl, len, RET_ERR(handle, ALPM_ERR_MEMORY, -1)); snprintf(payload->fileurl, len, "%s/%s", server, payload->filepath); - + _alpm_log(handle, ALPM_LOG_DEBUG, "%s: url is %s\n", + payload->remote_name, payload->fileurl); fflush(payload->localf); @@ -496,6 +498,7 @@ static int curl_check_finished_download(CURLM *curlm, CURLMsg *msg, /* was it a success? */ switch(curlerr) { case CURLE_OK: + case CURLE_HTTP_RETURNED_ERROR: /* get http/ftp response code */ _alpm_log(handle, ALPM_LOG_DEBUG, "%s: response code %ld\n", payload->remote_name, payload->respcode); -- 2.31.1
On 2/6/21 9:44 pm, Hung-I Wang wrote:
`curl_retry_next_server` tries to resume a download whenever possible (introduced in 8bf17b2) even if its preceding HTTP request returns an error (with status code >= 400, e.g. 404 when a mirror is only partially synced).
It may result in a corrupted package if the preceding HTTP response with error carries a non-empty body, which is then written to a tempfile as if it is part of the package binary.
By activating `CURLOPT_FAILONERROR`, the tempfile won't be written unless the HTTP response indicates a successful status.
Thanks - that option is much more simple than the approaches we had implemented.
Signed-off-by: Hung-I Wang <whygowe@gmail.com> --- lib/libalpm/dload.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c index 2d8b4d6d..faa3d9f2 100644 --- a/lib/libalpm/dload.c +++ b/lib/libalpm/dload.c @@ -338,6 +338,7 @@ static void curl_set_handle_opts(CURL *curl, struct dload_payload *payload) curl_easy_setopt(curl, CURLOPT_TCP_KEEPINTVL, 60L); curl_easy_setopt(curl, CURLOPT_HTTPAUTH, CURLAUTH_ANY); curl_easy_setopt(curl, CURLOPT_PRIVATE, (void *)payload); + curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1);
_alpm_log(handle, ALPM_LOG_DEBUG, "%s: url is %s\n", payload->remote_name, payload->fileurl); @@ -427,7 +428,8 @@ static int curl_retry_next_server(CURLM *curlm, CURL *curl, struct dload_payload len = strlen(server) + strlen(payload->filepath) + 2; MALLOC(payload->fileurl, len, RET_ERR(handle, ALPM_ERR_MEMORY, -1)); snprintf(payload->fileurl, len, "%s/%s", server, payload->filepath); - + _alpm_log(handle, ALPM_LOG_DEBUG, "%s: url is %s\n", + payload->remote_name, payload->fileurl);
fflush(payload->localf);
@@ -496,6 +498,7 @@ static int curl_check_finished_download(CURLM *curlm, CURLMsg *msg, /* was it a success? */ switch(curlerr) { case CURLE_OK:
Can break here
+ case CURLE_HTTP_RETURNED_ERROR: /* get http/ftp response code */ _alpm_log(handle, ALPM_LOG_DEBUG, "%s: response code %ld\n", payload->remote_name, payload->respcode);
and here we would no longer need to check for respcode>=400
On 06/02/21 at 07:44pm, Hung-I Wang wrote:
`curl_retry_next_server` tries to resume a download whenever possible (introduced in 8bf17b2) even if its preceding HTTP request returns an error (with status code >= 400, e.g. 404 when a mirror is only partially synced).
It may result in a corrupted package if the preceding HTTP response with error carries a non-empty body, which is then written to a tempfile as if it is part of the package binary.
By activating `CURLOPT_FAILONERROR`, the tempfile won't be written unless the HTTP response indicates a successful status.
Signed-off-by: Hung-I Wang <whygowe@gmail.com> --- lib/libalpm/dload.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
See 7a5e41925f72d838eaa611427e5ae89b1f57215f: dload: avoid using CURLOPT_FAILONERROR Use of this flag causes connections to be closed on 404s -- a common occurrence when your config sets DatabaseOptional. Handle the error gracefully, so that the connection can be reused. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Allan McRae <allan@archlinux.org>
On 3/6/21 12:07 am, Andrew Gregory wrote:
On 06/02/21 at 07:44pm, Hung-I Wang wrote:
`curl_retry_next_server` tries to resume a download whenever possible (introduced in 8bf17b2) even if its preceding HTTP request returns an error (with status code >= 400, e.g. 404 when a mirror is only partially synced).
It may result in a corrupted package if the preceding HTTP response with error carries a non-empty body, which is then written to a tempfile as if it is part of the package binary.
By activating `CURLOPT_FAILONERROR`, the tempfile won't be written unless the HTTP response indicates a successful status.
Signed-off-by: Hung-I Wang <whygowe@gmail.com> --- lib/libalpm/dload.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
See 7a5e41925f72d838eaa611427e5ae89b1f57215f:
dload: avoid using CURLOPT_FAILONERROR
Use of this flag causes connections to be closed on 404s -- a common occurrence when your config sets DatabaseOptional. Handle the error gracefully, so that the connection can be reused.
Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Allan McRae <allan@archlinux.org>
Ah - thanks. Will stick with the patch I submitted then. Allan
participants (3)
-
Allan McRae
-
Andrew Gregory
-
Hung-I Wang