On Thu, Nov 12, 2009 at 2:09 AM, Dan McGee <dan@archlinux.org> wrote:
Sorry for this being such a huge patch, but I believe it is necessary for quite a few reasons which I will attempt to explain herein. I've been mulling this over for a while, but wasn't super happy with making the download interface more complex. Instead, if we carefully order things in the internal download code, we can actually make the interface simpler.
1. FS#15657 - This involves `name.db.tar.gz.part` files being left around the filesystem, and then causing all sorts of issues when someone attempts to rerun the operation they canceled. We need to ensure that if we resume a download, we are resuming it on exactly the same file; if we cannot be almost postive of that then we need to start over.
2. http://www.mail-archive.com/pacman-dev@archlinux.org/msg03536.html - Here we have a lighttpd bug to ruin the day. If we send both a Range: header and If-Modified-Since: header across the wire in a GET request, lighttpd doesn't do what we want in several cases. If the file hadn't been modified, it returns a '304 Not Modified' instead of a '206 Partial Content'. We need to do a stat (e.g. HEAD in HTTP terms) operation here, and the proceed accordingly based off the values we get back from it.
3. The mtime stuff was rather ugly, and relied on the called function to write back to a passed in reference, which isn't the greatest. Instead, use the power of the filesystem to contain this info. Every file downloaded internally is now carefully timestamped with the remote file time. This should allow the resume logic to work. In order to guarantee this, we need to implement a signal handler that catches interrupts, notifies the running code, and causes it to set the mtimes on the file. It then rethrows the signal so the pacman signal handler (or any frontend) works as expected.
4. We did a lot of funky stuff in trying to track the DB last modified time. It is a lot easier to just keep the downloaded DB file around and track the time on that rather than in a funky dot file. It also kills a lot of code.
5. For GPG verification of the databases down the road, we are going to need the DB file around for at least a short bit of time anyway, so this gets us closer to that.
Signed-off-by: Dan McGee <dan@archlinux.org> --- lib/libalpm/alpm.h | 8 +-- lib/libalpm/be_files.c | 96 ++------------------------- lib/libalpm/dload.c | 177 +++++++++++++++++++++++++++++++++--------------- lib/libalpm/dload.h | 4 +- src/pacman/pacman.c | 11 +++- 5 files changed, 141 insertions(+), 155 deletions(-)
So yes, I know this patch is huge, but I'm really really *really* hoping someone (or a few people!) would like to review it. Xavier, I know you asked me about this stuff a while back, and I finally got back around to it and it is in a much more working form. It is hopefully well commented too. I should note this went through about 3 distinct iterations locally tonight before this final version came about. Function arguments were dropped, added, and dropped again; calls into libfetch were reordered and reworked and changed, etc. It took a good amount of time to get it to work for all of the various partial download cases (both DB *and* packages, yes, it actually works!) as well as keeping the force flag working and just downloading in the most straightforward start-to-end case. Anyway, it may be a tad easier to look at it here: http://code.toofishes.net/cgit/dan/pacman.git/commit/?h=download&id=8a15e0cbebaff0c34d2d837b37dd6279cff44e32 or in a side by side diff format: http://code.toofishes.net/cgit/dan/pacman.git/commit/?h=download&id=8a15e0cbebaff0c34d2d837b37dd6279cff44e32&ss=1 Thanks guys! And don't ask why I am up at 2 AM my time working on this, it feels like the old days of hacking again. :) -Dan