[pacman-dev] [PATCH] dload: handle irregular URLs

Dave Reisner d at falconindy.com
Sat Jun 11 17:56:02 EDT 2011


On Sat, Jun 11, 2011 at 03:16:42PM -0400, Dave Reisner wrote:
> URLs might end with a slash and follow redirects, or could be
> generated by a script such as /getpkg.php?id=12345. In both cases, we
> may have a better filename to write to, taken from either the
> Content-Disposition header or the effective URL.
> 
> Specific to the first case, we write to a temporary file of the format
> 'alpmtmp.XXXXXX', where XXXXXX is randomized by mkstemp(3). Since this
> is a randomly generated filename, we cannot support resuming, and the
> file is unlinked in the event of an interrupt.
> 
> We also run into the possibility of changing the filename out from
> under alpm on a -U operation, so callers of _alpm_download can
> optionally pass a pointer to a char * to be filled in by
> curl_download_internal with the actual filename we wrote to. Any sync
> operation will pass a NULL pointer here, as we rely on specific names
> for packages from a mirror.
> 
> Fixes FS#22645.
> 
> Signed-off-by: Dave Reisner <d at falconindy.com>
> ---
> There's one hack in here that I'm not happy with, so feedback is
> welcome. Basically, the signature filename is guaranteed to be
> non-empty, because we simply append '.sig' to the URL. In the case of
> a URL ending with a /, this results in downloading a file called
> '.sig'. This patch hardcodes that edge case and forces the use of a
> temporary file. It's not ideal, but it works.
> 
> Any other comments are also, of course, welcome.
> 
> d

Hurray, I'm replying to myself. So, there are a few things that have
come up...

1) We're already setting a callback to read and parse the headers. If
the Content-Disposition header exists, we can check whether the file
already exists and abort the transfer. However, this puts us in an
awkward position when the pre-existing file is incomplete. We can't
resume that file, because the request on the wire is asking for the
start of the file. I think in this case the front end would claim the
file is corrupt and ask the user to delete it.

2) Florian brought up the idea that we can resume a transfer when we
need a tempfile by naming the tempfile after a hash of the URL. In
theory, this seems sound, but it opens up issues if we ever decide to
allow downloading packages as non-root, because we'd be creating the
tempfile ourselves, which leaves us prone to a symlink attack as well
as an unlikely race condition.
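To make the tradeoff concrete, a rough sketch of what the hashed name
could look like -- the hash and the naming scheme here are purely
illustrative:

#include <stdio.h>

/* derive a stable temp name from the URL so a partial download could
 * be resumed later; djb2 is just a stand-in for whatever hash we'd
 * actually pick */
static unsigned long hash_url(const char *url)
{
    unsigned long hash = 5381;
    int c;

    while((c = (unsigned char)*url++) != 0) {
        hash = hash * 33 + c;
    }
    return hash;
}

static int tempfile_from_url(char *buf, size_t buflen,
        const char *destdir, const char *url)
{
    /* e.g. /var/cache/pacman/pkg/alpmtmp.1a2b3c4d.part */
    int len = snprintf(buf, buflen, "%s/alpmtmp.%08lx.part",
            destdir, hash_url(url));
    return (len > 0 && (size_t)len < buflen) ? 0 : -1;
}

The catch is exactly what's described above: because the name is
predictable, we lose the O_EXCL guarantee that mkstemp(3) gives us and
have to create the file ourselves, which is where the symlink and race
concerns come from.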

3) I think that in the case of #1 with redirects, we need to do
something similar, and parse the Location header to grab the filename
when the response code is <300.
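Rough shape of that, as a companion to the parse_headers() sketch
under #1 (same headers apply, and again the name is made up):

/* if this header line is a Location: header, return a copy of its
 * value, otherwise NULL; the caller keeps the last non-NULL result */
static char *parse_location(const char *line)
{
    const char *loc;

    if(strncasecmp(line, "Location:", 9) != 0) {
        return NULL;
    }
    loc = line + 9;
    while(*loc == ' ') {
        loc++;
    }
    return strndup(loc, strcspn(loc, "\r\n"));
}

The basename of the last Location we saw (or, just as well, of
CURLINFO_EFFECTIVE_URL once the transfer finishes) would then become
the filename.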

4) I have no pants.

d

