[pacman-dev] [PATCH] Timeout when download problems occur
We already set fetchTimeout to 10 seconds, but this does not apply for some
of the connection setup steps such as connect(). This left us hanging in a
few places where we would normally be stuck in a system call for an
indeterminate amount of time.

This addresses FS#15369, which has grown to include and duplicate several
other bugs and failures such as specifying a non-existent FTP server, DSL
disconnections, etc.

Signed-off-by: Dan McGee <dan@archlinux.org>
---
Anyone that wants to look this over, it was pretty simple and kills one of
our long-lingering issues before a 3.5.0 release. The output isn't
necessarily beautiful, but it at least doesn't hang forever now.

I tested this by adding a line like
Server = ftp://192.168.4.4/$repo/os/$arch
to my /etc/pacman.d/mirrorlist file, and then running pacman -Sy.

$ sudo ./src/pacman/pacman -Sy
Password:
:: Synchronizing package databases...
 testing is up to date
error: failed retrieving file 'core.db' from 192.168.4.4 : Interrupted system call
 core is up to date
error: failed retrieving file 'extra.db' from 192.168.4.4 : Interrupted system call
 extra is up to date
 community-testing is up to date
error: failed retrieving file 'multilib.db' from 192.168.4.4 : Interrupted system call
 multilib is up to date
error: failed retrieving file 'community.db' from 192.168.4.4 : Interrupted system call
 community is up to date

-Dan

 lib/libalpm/dload.c |   21 +++++++++++++++++----
 1 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c
index 7a98eb1..44ff17c 100644
--- a/lib/libalpm/dload.c
+++ b/lib/libalpm/dload.c
@@ -88,10 +88,10 @@ static const char *gethost(struct url *fileurl)
 int dload_interrupted;
 static void inthandler(int signum)
 {
-	dload_interrupted = 1;
+	dload_interrupted = signum;
 }

-#define check_stop() if(dload_interrupted) { ret = -1; goto cleanup; }
+#define check_stop() do { alarm(0); if(dload_interrupted) { ret = -1; goto cleanup; } } while(0)
 enum sighandlers { OLD = 0, NEW = 1 };

 static int download_internal(const char *url, const char *localpath,
@@ -102,7 +102,7 @@ static int download_internal(const char *url, const char *localpath,
 	off_t dl_thisfile = 0;
 	ssize_t nread = 0;
 	char *tempfile, *destfile, *filename;
-	struct sigaction sig_pipe[2], sig_int[2];
+	struct sigaction sig_pipe[2], sig_int[2], sig_alrm[2];
 	off_t local_size = 0;
 	time_t local_time = 0;

@@ -169,6 +169,12 @@ static int download_internal(const char *url, const char *localpath,
 	sigaction(SIGINT, NULL, &sig_int[OLD]);
 	sigaction(SIGINT, &sig_int[NEW], NULL);

+	sig_alrm[NEW].sa_handler = &inthandler;
+	sigemptyset(&sig_alrm[NEW].sa_mask);
+	sig_alrm[NEW].sa_flags = 0;
+	sigaction(SIGALRM, NULL, &sig_alrm[OLD]);
+	sigaction(SIGALRM, &sig_alrm[NEW], NULL);
+
 	/* NOTE: libfetch does not reset the error code, be sure to do it before
 	 * calls into the library */

@@ -184,7 +190,9 @@ static int download_internal(const char *url, const char *localpath,
 	 * trouble in trying to do both size and "if-modified-since" logic in a
 	 * non-stat request, so avoid it. */
 	fetchLastErrCode = 0;
+	alarm(fetchTimeout);
 	if(fetchStat(fileurl, &ust, "") == -1) {
+		alarm(0);
 		pm_errno = PM_ERR_LIBFETCH;
 		_alpm_log(PM_LOG_ERROR, _("failed retrieving file '%s' from %s : %s\n"),
 				filename, gethost(fileurl), fetchLastErrString);
@@ -211,6 +219,7 @@ static int download_internal(const char *url, const char *localpath,
 	}

 	fetchLastErrCode = 0;
+	alarm(fetchTimeout);
 	dlf = fetchGet(fileurl, "");
 	check_stop();

@@ -252,6 +261,7 @@ static int download_internal(const char *url, const char *localpath,
 		handle->dlcb(filename, 0, ust.size);
 	}

+	alarm(fetchTimeout);
 	while((nread = fetchIO_read(dlf, buffer, PM_DLBUF_LEN)) > 0) {
 		check_stop();
 		size_t nwritten = 0;
@@ -268,7 +278,9 @@ static int download_internal(const char *url, const char *localpath,
 		if(handle->dlcb) {
 			handle->dlcb(filename, dl_thisfile, ust.size);
 		}
+		alarm(fetchTimeout);
 	}
+	alarm(0);

 	/* did the transfer complete normally? */
 	if (nread == -1) {
@@ -333,10 +345,11 @@ cleanup:
 	fetchFreeURL(fileurl);

 	/* restore the old signal handlers */
+	sigaction(SIGALRM, &sig_alrm[OLD], NULL);
 	sigaction(SIGINT, &sig_int[OLD], NULL);
 	sigaction(SIGPIPE, &sig_pipe[OLD], NULL);
 	/* if we were interrupted, trip the old handler */
-	if(dload_interrupted) {
+	if(dload_interrupted == SIGINT) {
 		raise(SIGINT);
 	}
--
1.7.4
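For readers unfamiliar with the pattern the patch relies on, here is a minimal
standalone sketch, not pacman code: the identifiers are invented and a blocking
read() from stdin stands in for connect()/fetchStat(). The point it illustrates
is the same as above: a handler installed via sigaction() with sa_flags = 0 (no
SA_RESTART) only records which signal fired, alarm() bounds the blocking call,
and when SIGALRM or SIGINT arrives the call returns -1 with errno set to EINTR
("Interrupted system call", as seen in the output above).

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t interrupted = 0;

static void handler(int signum)
{
	/* like inthandler() in dload.c: just record which signal arrived */
	interrupted = signum;
}

int main(void)
{
	struct sigaction sa;
	char buf[64];
	ssize_t n;

	sa.sa_handler = handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;                /* no SA_RESTART: a blocked call fails with EINTR */
	sigaction(SIGALRM, &sa, NULL);
	sigaction(SIGINT, &sa, NULL);

	alarm(10);                      /* arm the timeout before the blocking call */
	n = read(STDIN_FILENO, buf, sizeof(buf));   /* stand-in for connect()/fetchStat() */
	alarm(0);                       /* always cancel the pending alarm afterwards */

	if(n == -1 && errno == EINTR) {
		if(interrupted == SIGALRM) {
			fprintf(stderr, "timed out: %s\n", strerror(EINTR));
		} else if(interrupted == SIGINT) {
			fprintf(stderr, "interrupted by user\n");  /* caller could re-raise SIGINT */
		}
		return 1;
	}
	printf("read %zd bytes\n", n);
	return 0;
}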
On 16/02/11 09:53, Dan McGee wrote:
We already set fetchTimeout to 10 seconds, but this does not apply for some of the connection setup steps such as connect(). This left us hanging in a few places where we would normally be stuck in a system call for an indeterminate amount of time.
This addresses FS#15369, which has grown to include and duplicate several other bugs and failures such as specifying a non-existent FTP server, DSL disconnections, etc.
Signed-off-by: Dan McGee <dan@archlinux.org>
---
I tested this by the "pull out the internet cable" method. Before this patch, I got a timeout. After the patch, I got a freeze that I can't Ctrl+C out of:
sudo ./src/pacman/pacman -Sy
Password:
:: Synchronizing package databases...
 kernel64 is up to date
 testing is up to date
 core is up to date
 extra is up to date
 community-testing is up to date
^C^C^C^C^C^C^C^C    228.5K  118.2K/s 00:00:01 [###########-----------]  53%
Allan
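For reference when reading the report above, this is roughly the shape the
transfer loop takes with the patch applied, distilled into a standalone sketch
(identifiers simplified; read() from stdin stands in for fetchIO_read(), and
the 10-second timeout stands in for fetchTimeout). The alarm is re-armed on
every iteration, so a read that stalls is intended to be broken out of within
the timeout rather than blocking indefinitely.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t interrupted = 0;

static void handler(int signum)
{
	/* only record the signal; the decision to stop is made outside the handler */
	interrupted = signum;
}

int main(void)
{
	struct sigaction sa;
	const unsigned timeout = 10;        /* stand-in for fetchTimeout */
	char buf[4096];
	ssize_t nread;

	sa.sa_handler = handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;                    /* no SA_RESTART: a blocked read() returns EINTR */
	sigaction(SIGALRM, &sa, NULL);
	sigaction(SIGINT, &sa, NULL);

	alarm(timeout);                     /* bound the first read */
	while((nread = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
		alarm(0);                       /* check_stop() equivalent: cancel the alarm... */
		if(interrupted) {               /* ...and stop if a signal was recorded */
			break;
		}
		fwrite(buf, 1, (size_t)nread, stdout);
		alarm(timeout);                 /* re-arm before the next read */
	}
	alarm(0);

	if(nread == -1 && errno == EINTR) {
		fprintf(stderr, "transfer aborted by signal %d\n", (int)interrupted);
		return 1;
	}
	return 0;
}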