[pacman-dev] bug in pacman 3.3.2 retrieving of repos?
Hello List,
pacman 3.3.2 with the libfetch backend doesn't retrieve an updated repo via http here (tested with "touch repo.db.tar.gz" on the http server).
I checked the http headers with curl just to be sure that my setup isn't f***ed up in some way, and they are updated just fine after every touch on the file, so it has to be a problem with libfetch or pacman itself.
A local revert to pacman 3.3.0 (no build changes or libfetch changes, and the exact same build environment) fixes the problem here, so it looks like a messed-up usage of libfetch in the pacman 3.3.2 release.
Can someone else reproduce this behaviour?
Because I have no clue how to use libfetch the right way without reading the documentation and analysing the code in pacman afterwards, I hope someone else wants to hunt it down.
Marc
On Mon, Oct 12, 2009 at 7:14 AM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
Can someone else reproduce this behaviour?
No. I can't reproduce this *at all*, anywhere.
This is one of those emails that is slightly aggravating. You tried to generalize the problem so much, assuming everyone else saw it, that you didn't include anything to even reproduce it!
What mirror? What repo? What does it look like? What commands did you run? Did you try with '--debug' to see if any helpful messages showed up there during the download?
-Dan
Hello Dan,
On Monday, 12.10.2009, at 07:20 -0500, Dan McGee wrote:
This is one of those emails that is slightly aggravating. You tried to generalize the problem so much, assuming everyone else saw it, that you didn't include anything to even reproduce it!
Sorry, I'm not a native English writer and I often lose some details in translation because of that; I'll try to clarify below.
What mirror? What repo? What does it look like? What commands did you run? Did you try with '--debug' to see if any helpful messages showed up there during the download?
It is a local mirror served on lighttpd.
Pacman 3.3.2 doesn't retrieve a repo tarball on "pacman -Sy" after a regular update (with repo-add or repo-remove), even if its size changed or it was touched several times with "touch repo.db.tar.gz" on the http server.
The mtime exposed in the http response headers would be the information on which pacman's libfetch download determines whether a repo was updated and should be downloaded, right?
I checked with "curl -v http://server/repo.db.tar.gz" whether the timestamps exposed via the http headers are updated after every repo update or touch on the repo file, which was the case.
Steps to reproduce:
client$ pacman -V    # is 3.3.2
# to get the actual version before the following touch
client$ pacman -Syy
# update mtime on repo
server$ touch repo.db.tar.gz
# this is where it fails
client$ pacman -Sy
# it doesn't download repo and debug says:
# debug: mtimes are identical, skipping repo.db.tar.gz
# local downgrade to pacman 3.3.0
client$ pacman -U pacman-3.3.0-i686.pkg.tar.xz
client$ pacman -V    # is 3.3.0
# works just fine, mtime updates are spotted
client$ pacman -Sy
# debug: sync: new mtime for repo: 1255352714
After the pacman -Sy calls I checked the http response headers and they stayed the same.
I also deactivated the ShowSize and TotalDownload options in pacman.conf for this test, but they look unrelated, as nothing changes without these non-default options.
I checked 3.3.1 as well, and it looks like the problem I'm facing was introduced with it.
Well, it could be a local problem with my environment of course, but shouldn't 3.3.0 suffer from the same problem then? As there was some fiddling with the libfetch download code in pacman in versions 3.3.1 and 3.3.2, I thought it might be related to that.
Marc
On Monday, 12.10.2009, at 16:12 +0200, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
--8<--
What mirror? What repo? What does it look like? What commands did you run? Did you try with '--debug' to see if any helpful messages showed up there during the download?
It is a local mirror served on lighttpd.
s/local mirror/local repo/
Marc
On Mon, Oct 12, 2009 at 4:12 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
Well, it could be a local problem with my environment of course, but shouldn't 3.3.0 suffer from the same problem then? As there was some fiddling with the libfetch download code in pacman in versions 3.3.1 and 3.3.2, I thought it might be related to that.
I already answered this question, see my previous mail.
And from the libfetch man page:
If the `i' (if-modified-since) flag is specified, the library will try to fetch the content only if it is newer than last_modified. For HTTP an If-Modified-Since HTTP header is sent. For FTP a MTDM command is sent first and compared locally. For FILE the source file is compared.
So the http header you need to check is If-Modified-Since.
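A minimal sketch of the conditional request that the man page excerpt above describes, written in Python for illustration only. It is not pacman's or libfetch's code: the function names `conditional_headers` and `should_download` and the parameter `local_mtime` are hypothetical. It only shows how a stored mtime might be turned into an If-Modified-Since header and how the reply status would then be read.

```python
from email.utils import formatdate


def conditional_headers(local_mtime):
    """Build the extra request headers for a conditional GET.

    local_mtime: epoch seconds recorded for the local copy, or None
    when nothing has been downloaded yet (hypothetical parameter).
    """
    if local_mtime is None:
        return {}
    # HTTP dates are expressed in GMT, e.g. "Mon, 12 Oct 2009 13:05:14 GMT".
    return {"If-Modified-Since": formatdate(local_mtime, usegmt=True)}


def should_download(status):
    """Interpret the status code of the reply to a conditional GET."""
    if status == 304:  # Not Modified: the local copy is still current
        return False
    if status == 200:  # OK: a body follows and replaces the local copy
        return True
    raise ValueError(f"unexpected HTTP status: {status}")


# 1255352714 is the mtime from the debug output quoted in this thread.
print(conditional_headers(1255352714))
print(should_download(304), should_download(200))
```

With this scheme the client trusts the server's 304 answer, so a server that answers 304 for a changed file would make the client skip the download.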
On Monday, 12.10.2009, at 16:36 +0200, Xavier wrote:
I already answered this question, see my previous mail. And from libfetch man page : If the `i' (if-modified-since) flag is specified, the library will try to fetch the content only if it is newer than last_modified. For HTTP an If-Modified-Since HTTP header is sent. For FTP a MTDM command is sent first and compared locally. For FILE the source file is compared.
So the http header you need to check is If-Modified-Since.
I had the mail in response to Dan open during the whole testing and spotted your mail after I sent it.
I checked the lighttpd behaviour again:
1. try: Last-Modified: Mon, 12 Oct 2009 13:39:50 GMT
a touch on the file later on
2. try: Last-Modified: Mon, 12 Oct 2009 14:41:48 GMT
Looks sane.
It might be that lighttpd's headers differ from Apache httpd's. The only difference is that lighttpd sends the local time header "Date" after the "Last-Modified" header. Could it be a bug regarding the ordering of header lines in libfetch?
I have no gdb ready on this box, but will look into it some more tomorrow when I get to it.
Marc
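The check above can be sketched in a few lines of Python: pull Last-Modified out of two captured header blocks and confirm that the second value is later. This is an illustrative sketch, not libfetch's parser. The helper name `last_modified_epoch` is hypothetical, and the sample header blocks are made up apart from the two Last-Modified values reported above. The lookup is by header name, so the position of the "Date" header does not matter here.

```python
from email.utils import parsedate_to_datetime


def last_modified_epoch(raw_headers):
    """Return Last-Modified from a raw header block as epoch seconds.

    Returns None if the header is absent. The header is found by name,
    so its position relative to other headers is irrelevant.
    """
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "last-modified":
            return int(parsedate_to_datetime(value.strip()).timestamp())
    return None


# Illustrative captures; only the Last-Modified values come from the thread.
first_try = (
    "HTTP/1.1 200 OK\r\n"
    "Last-Modified: Mon, 12 Oct 2009 13:39:50 GMT\r\n"
    "Date: Mon, 12 Oct 2009 13:40:00 GMT\r\n"
)
second_try = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 12 Oct 2009 14:42:00 GMT\r\n"
    "Last-Modified: Mon, 12 Oct 2009 14:41:48 GMT\r\n"
)

before = last_modified_epoch(first_try)
after = last_modified_epoch(second_try)
print(before, after, after > before)
```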
On Mon, Oct 12, 2009 at 4:54 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
Might be that lighttpd headers are distinct from apache httpd. The only change is that lighttpd sends the local time header "Date" after the "Last-Modified" header. Could it be a bug regarding ordering of Header lines in libfetch?
Have no gdb ready on this box, but will look into it some more tomorrow when i get to it.
Marc
The purpose of this feature is to allow efficient updates of cached information with a minimum amount of transaction overhead.
Note: The Range request-header field modifies the meaning of If-Modified-Since; see section 14.35 for full details.
Note: If-Modified-Since times are interpreted by the server, whose clock might not be synchronized with the client.
Note: When handling an If-Modified-Since header field, some servers will use an exact date comparison function, rather than a less-than function, for deciding whether to send a 304 (Not Modified) response. To get best results when sending an If-Modified-Since header field for cache validation, clients are advised to use the exact date string received in a previous Last-Modified header field whenever possible.
Note: If a client uses an arbitrary date in the If-Modified-Since header instead of a date taken from the Last-Modified header for the same request, the client should be aware of the fact that this date is interpreted in the server's understanding of time. The client should consider unsynchronized clocks and rounding problems due to the different encodings of time between the client and server. This includes the possibility of race conditions if the document has changed between the time it was first requested and the If-Modified-Since date of a subsequent request, and the possibility of clock-skew-related problems if the If-Modified-Since date is derived from the client's clock without correction to the server's clock. Corrections for different time bases between client and server are at best approximate due to network latency.
Are your client and server on two different boxes? If so, check their clocks.
Anyway, we are using ust.mtime from libfetch. I would think this comes from the http Last-Modified header and thus should not cause problems. But I am not sure. I am still trying to understand how this stuff works :)
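The difference between the "exact date comparison" and the "less-than" style mentioned in the notes above can be sketched as follows. This is an illustration of the two comparison styles only, not the logic of lighttpd or any other server; the function name `is_not_modified` and its `exact` flag are hypothetical. The dates are the two Last-Modified values reported earlier in this thread.

```python
from email.utils import parsedate_to_datetime


def is_not_modified(last_modified, if_modified_since, exact=False):
    """Decide whether a server of the given style would answer 304.

    exact=True  : 304 only if both dates denote the same instant.
    exact=False : 304 if the file is not newer than the client's date.
    """
    file_time = parsedate_to_datetime(last_modified)
    client_time = parsedate_to_datetime(if_modified_since)
    if exact:
        return file_time == client_time
    return file_time <= client_time


old = "Mon, 12 Oct 2009 13:39:50 GMT"
new = "Mon, 12 Oct 2009 14:41:48 GMT"

# File touched after the client's copy: both styles send the file again.
print(is_not_modified(new, old), is_not_modified(new, old, exact=True))
# Client sends a later, arbitrary date: the two styles disagree.
print(is_not_modified(old, new), is_not_modified(old, new, exact=True))
```

Sending back the exact Last-Modified string received earlier, as the notes advise, gives the same answer under both styles.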
On Monday, 12.10.2009, at 17:06 +0200, Xavier wrote:
Are your client and server on two different boxes? If so, check their clocks.
They are on different hosts and different carrier networks. They get time updates via ntp from the same source. System times are sane on these boxes.
Anyway we are using ust.mtime from libfetch. I would think this comes from http Last-Modified and thus should not cause problems. But I am not sure. I am still trying to understand how this stuff works :)
Thanks Xavier, this information was the missing piece of the puzzle for me.
I sniffed the http headers with tshark and checked what pacman asks for in its request headers and what lighttpd sends back to pacman in response.
This is definitely a lighttpd bug. It answers with a "304 Not Modified" response even if the file was modified, because of some missing checks.
Sorry for the noise.
If anybody is interested, this is the corresponding fix for lighttpd-1.4.23, which will be in 1.4.24 when it gets released:
http://redmine.lighttpd.net/projects/lighttpd/repository/revisions/2643/diff/branches/lighttpd-1.4.x/src/http-header-glue.c?format=diff&rev_to=2408
Marc
On Mon, Oct 12, 2009 at 6:00 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
If anybody is interested, this is the corresponding fix for lighttpd-1.4.23 which will be in 1.4.24 when it gets released:
Ah cool, glad to know it's already figured out :) Wasn't it this bug: http://redmine.lighttpd.net/issues/2047
The patch linked there looks like a subset of your link above. But that makes sense, because the revision fixing the issue seems to be 2608 or 2609, and your link above is a diff between 2408 and 2643.
And don't worry about the noise; it's interesting to know that, since 3.3.1, pacman doesn't play nicely with lighttpd-1.4.23 because of a bug.
On Monday, 12.10.2009, at 18:15 +0200, Xavier wrote:
Ah cool, glad to know it's already figured out :) Wasn't it this bug : http://redmine.lighttpd.net/issues/2047
Yes, that's the one.
The patch linked there looks like a subset of your link above. But that makes sense, because the revision fixing the issue seems to be 2608 or 2609, and your link above is a diff between 2408 and 2643.
And don't worry about the noise, it's interesting to know that pacman doesn't play nicely since 3.3.1 with lighttpd-1.4.23 because of a bug.
Got that, thanks for your help with this problem.
By the way, I'll post the repo-delta-clean stuff sometime this week, once I get around to running a final set of tests.
Marc
On Mon, Oct 12, 2009 at 11:32 AM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
Got that, thanks for your help with this problem.
Good work tracking this down, guys. Sorry if I sounded a bit off-putting this morning; I'm just not a fan of bug reports without details, and I've been seeing a lot of those lately (mostly at my real job). :)
-Dan
On Mon, Oct 12, 2009 at 2:14 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
A local revert to pacman 3.3.0 (no build-changes or libfetch changes and the exact same build-environment) fixes this problem here so it looks like a messed up usage of libfetch in pacman 3.3.2 release.
There was definitely a change between 3.3.0 and 3.3.1:
http://projects.archlinux.org/?p=pacman.git;a=commitdiff;h=6f97842ab22eb50fd...
Just to be sure, you could try reverting just that commit on 3.3.2.
But then, it seems to work for most users, using all the official mirrors, so I am not sure what is wrong.
Maybe your http server is broken and does not handle/report Last-Modified correctly. I believe you could check this header using wireshark or another network analyzer.
You could also try using a different http server.
participants (3)
- Dan McGee
- Marc - A. Dahlhaus [ Administration | Westermann GmbH ]
- Xavier