[pacman-dev] [PATCH] download: major refactor to address lingering issues

Sun Nov 15 08:23:31 EST 2009

On Thu, Nov 12, 2009 at 9:09 AM, Dan McGee <dan at archlinux.org> wrote:
> Sorry for this being such a huge patch, but I believe it is necessary for
> quite a few reasons which I will attempt to explain herein. I've been
> mulling this over for a while, but wasn't super happy with making the
> download interface more complex. Instead, if we carefully order things in
> the internal download code, we can actually make the interface simpler.
>
> 1. FS#15657 - This involves `name.db.tar.gz.part` files being left around the
> filesystem, and then causing all sorts of issues when someone attempts to
> rerun the operation they canceled. We need to ensure that if we resume a
> download, we are resuming it on exactly the same file; if we cannot be
> almost postive of that then we need to start over.
>

That's quite different to the simple solution we had before : simply
disable resuming for repo db, which are quite small anyway :)

> 2. http://www.mail-archive.com/pacman-dev@archlinux.org/msg03536.html - Here
> we have a lighttpd bug to ruin the day. If we send both a Range: header and
> If-Modified-Since: header across the wire in a GET request, lighttpd doesn't
> do what we want in several cases. If the file hadn't been modified, it
> returns a '304 Not Modified' instead of a '206 Partial Content'. We need to
> do a stat (e.g. HEAD in HTTP terms) operation here, and the proceed
> accordingly based off the values we get back from it.
>

These two issues look completely different.
The link you gave was about If-Modified-Since not working correctly,
and lighttpd not returning '304 Not Modified' because of some
time/timezone issues.
Besides, this should have been fixed in 1.4.24 which is in arch now.
I guess you should report your issue to lighttpd :)

>From libfetch point of view, doing a Stat is fine.
What seemed to cause issue for some users is doing a Get right away,
and cancel when mtimes matched.
See 6f97842ab22eb50fdc689e8aa2e95688d015fa74

> 3. The mtime stuff was rather ugly, and relied on the called function to
> write back to a passed in reference, which isn't the greatest. Instead, use
> the power of the filesystem to contain this info. Every file downloaded
> internally is now carefully timestamped with the remote file time. This
> should allow the resume logic to work. In order to guarantee this, we need
> to implement a signal handler that catches interrupts, notifies the running
> code, and causes it to set the mtimes on the file. It then rethrows the
> signal so the pacman signal handler (or any frontend) works as expected.
>
> 4. We did a lot of funky stuff in trying to track the DB last modified time.
> It is a lot easier to just keep the downloaded DB file around and track the
> time on that rather than in a funky dot file. It also kills a lot of code.
>

I always find signal handling quite tricky, but yeah, the idea of
using the file itself to track mtime rather than a dot file sounds
good.
Also this indeed makes resuming much safer.

I was just thinking about backward compatibility (old downloaded file
and new download code), but I think it is fine :
- for already downloaded packages , the download code is not called
- for partially downloaded packages, they will be discarded and the
download will start over. But if you still have partially downloaded
packages when upgrading pacman, they are probably old and outdated.

> 5. For GPG verification of the databases down the road, we are going to need
> the DB file around for at least a short bit of time anyway, so this gets us
> closer to that.
>

Well, I do not think this would have been a problem. We probably want
to do gpg verification right after downloading, and before extracting
(and then removing) the file, don't we ?

Anyway, I did not have big issues with the old code, but I do not have
any with the new one either. Code review and testing did not reveal
any issues.
And the new one does have several advantages so it's all good.