[pacman-dev] [PATCH] libalpm: Do a sanity check before manipulating final DB URL
The change in commit 9d96bed9d6b57 causes download errors for the .db.sig file in case the final URL for the .db file contains query strings or other unexpected stuff. This commit isn't intended to be a total solution, but it should eliminate the problem in the most obvious cases. --- Please comment. Does anyone know if there's a case when the final URL doesn't end in ".db", but it's still reasonable to append ".sig" to it? Is there a case when the final URL ends with ".db.tar.xz", for example? lib/libalpm/be_sync.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 20130dc..f30698e 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -241,20 +241,27 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) unlink(sigpath); free(sigpath); - /* if we downloaded a DB, we want the .sig from the same server - - this information is only available from the internal downloader */ - if(handle->fetchcb == NULL) { + + /* check if the final URL from internal downloader looks reasonable */ + if(final_db_url != NULL) { + if(strlen(final_db_url) < 3 || strcmp(final_db_url + strlen(final_db_url) - 3, ".db") != 0) { + final_db_url = NULL; + } + } + + /* if we downloaded a DB, we want the .sig from the same server */ + if(final_db_url != NULL) { /* print final_db_url into a buffer (leave space for .sig) */ len = strlen(final_db_url) + 5; } else { - /* print server + filename into a buffer (leave space for .sig) */ + /* print server + filename into a buffer (leave space for .db.sig) */ len = strlen(server) + strlen(db->treename) + 9; } /* TODO fix leak syncpath and umask unset */ MALLOC(payload.fileurl, len, RET_ERR(handle, ALPM_ERR_MEMORY, -1)); - if(handle->fetchcb == NULL) { + if(final_db_url != NULL) { snprintf(payload.fileurl, len, "%s.sig", final_db_url); } else { snprintf(payload.fileurl, len, "%s/%s.db.sig", server, db->treename); -- 2.3.5 -- David Macek
On 13.04.2015 21:33, David Macek wrote:
diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 20130dc..f30698e 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -241,20 +241,27 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) unlink(sigpath); free(sigpath);
- /* if we downloaded a DB, we want the .sig from the same server - - this information is only available from the internal downloader */ - if(handle->fetchcb == NULL) { + + /* check if the final URL from internal downloader looks reasonable */ + if(final_db_url != NULL) { + if(strlen(final_db_url) < 3 || strcmp(final_db_url + strlen(final_db_url) - 3, ".db") != 0) { + final_db_url = NULL; + } + } + + /* if we downloaded a DB, we want the .sig from the same server */ + if(final_db_url != NULL) { /* print final_db_url into a buffer (leave space for .sig) */ len = strlen(final_db_url) + 5; } else { - /* print server + filename into a buffer (leave space for .sig) */ + /* print server + filename into a buffer (leave space for .db.sig) */ len = strlen(server) + strlen(db->treename) + 9;
Comment changed, but not the code? If you know where that + 9 comes from it might be a good idea to replace it with strlen("whatever") which the compile should optimize out later, but writing it this way makes the whole thing a lot clearer. I'll let someone else comment on the rest of the patch.
On 16. 4. 2015 21:36, Florian Pritz wrote:
On 13.04.2015 21:33, David Macek wrote:
+ /* if we downloaded a DB, we want the .sig from the same server */ + if(final_db_url != NULL) { /* print final_db_url into a buffer (leave space for .sig) */ len = strlen(final_db_url) + 5; } else { - /* print server + filename into a buffer (leave space for .sig) */ + /* print server + filename into a buffer (leave space for .db.sig) */ len = strlen(server) + strlen(db->treename) + 9;
Comment changed, but not the code?
The comment was copied when the code was split into the two cases and the parenthesized part apparently wasn't correctly updated.
If you know where that + 9 comes from it might be a good idea to replace it with strlen("whatever") which the compile should optimize out later,
I didn't know gcc could do that, thanks for the tip.
but writing it this way makes the whole thing a lot clearer.
First off, isn't this change better suited for a separate patch? If yes -- in order to speed up the potential merge -- the comment change can be removed from this patch and we can talk about the lengths in the context of another patch. The 9 there is for "/", ".db.sig", and the null terminator. Would this be a good way of re-writing the line? strlen(server) + strlen("/") + strlen(db->treename) + strlen(".db.sig") + 1; Warning: The following ideas might be too academic for this situation. Since the strlen's are just duplicating the information contained in the format string, wouldn't it be better to use snsprintf to count the buffer length? I imagine the flow would be like this: format = "..."; len = snprintf(NULL, 0, format, ...) + 1; MALLOC(payload.fileurl, len, RET_ERR(handle, ALPM_ERR_MEMORY, -1)); snprintf(payload.fileurl, len, format, ...); Only I don't know how to make it work nice with the conditional code. Consider this: format = final_db_url != NULL ? "%1$s.sig" : "%2$s/%3$s.db.sig"; len = snprintf(NULL, 0, format, final_db_url, server, db->treename) + 1; MALLOC(payload.fileurl, len, RET_ERR(handle, ALPM_ERR_MEMORY, -1)); snprintf(payload.fileurl, len, format, final_db_url, server, db->treename); This code is almost DRY, but various sources say that skipping parameters in *printf is bad, so that doesn't work either. Ideas? -- David Macek
On 17/04/15 06:34, David Macek wrote:
On 16. 4. 2015 21:36, Florian Pritz wrote:
On 13.04.2015 21:33, David Macek wrote:
+ /* if we downloaded a DB, we want the .sig from the same server */ + if(final_db_url != NULL) { /* print final_db_url into a buffer (leave space for .sig) */ len = strlen(final_db_url) + 5; } else { - /* print server + filename into a buffer (leave space for .sig) */ + /* print server + filename into a buffer (leave space for .db.sig) */ len = strlen(server) + strlen(db->treename) + 9;
Comment changed, but not the code?
The comment was copied when the code was split into the two cases and the parenthesized part apparently wasn't correctly updated.
If you know where that + 9 comes from it might be a good idea to replace it with strlen("whatever") which the compile should optimize out later,
I didn't know gcc could do that, thanks for the tip.
but writing it this way makes the whole thing a lot clearer.
First off, isn't this change better suited for a separate patch? If yes -- in order to speed up the potential merge -- the comment change can be removed from this patch and we can talk about the lengths in the context of another patch.
Just adjust the comment in a separate patch.
The 9 there is for "/", ".db.sig", and the null terminator. Would this be a good way of re-writing the line?
strlen(server) + strlen("/") + strlen(db->treename) + strlen(".db.sig") + 1;
Just make the comment "leave space for separator and .db.sig" A
The change in commit 9d96bed9d6b57 causes download errors for the .db.sig file in case the final URL for the .db file contains query strings or other unexpected stuff. This commit isn't intended to be a total solution, but it should eliminate the problem in the most obvious cases. --- lib/libalpm/be_sync.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 20130dc..606c4a0 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -241,9 +241,16 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) unlink(sigpath); free(sigpath); - /* if we downloaded a DB, we want the .sig from the same server - - this information is only available from the internal downloader */ - if(handle->fetchcb == NULL) { + + /* check if the final URL from internal downloader looks reasonable */ + if(final_db_url != NULL) { + if(strlen(final_db_url) < 3 || strcmp(final_db_url + strlen(final_db_url) - 3, ".db") != 0) { + final_db_url = NULL; + } + } + + /* if we downloaded a DB, we want the .sig from the same server */ + if(final_db_url != NULL) { /* print final_db_url into a buffer (leave space for .sig) */ len = strlen(final_db_url) + 5; } else { @@ -254,7 +261,7 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) /* TODO fix leak syncpath and umask unset */ MALLOC(payload.fileurl, len, RET_ERR(handle, ALPM_ERR_MEMORY, -1)); - if(handle->fetchcb == NULL) { + if(final_db_url != NULL) { snprintf(payload.fileurl, len, "%s.sig", final_db_url); } else { snprintf(payload.fileurl, len, "%s/%s.db.sig", server, db->treename); -- 2.3.5 -- David Macek
Fix comment to better explain the magic constant used when allocating a buffer for ".db.sig" URL. --- lib/libalpm/be_sync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 606c4a0..ea979e6 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -254,7 +254,7 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) /* print final_db_url into a buffer (leave space for .sig) */ len = strlen(final_db_url) + 5; } else { - /* print server + filename into a buffer (leave space for .sig) */ + /* print server + filename into a buffer (leave space for separator and .db.sig) */ len = strlen(server) + strlen(db->treename) + 9; } -- 2.3.5 -- David Macek
On Thu, Apr 16, 2015 at 9:36 PM, Florian Pritz <bluewind@xinu.at> wrote:
Comment changed, but not the code? If you know where that + 9 comes from it might be a good idea to replace it with strlen("whatever") which the compile should optimize out later, but writing it this way makes the whole thing a lot clearer.
Do the pacman developers trust the compiler with optimizing strlen()? Because there's sizeof which does a perfectly predictable (off-by-one) job using but an operator for measuring the length of a string constant. Furthermore it does so without implying the compiler would sidestep the standard library's work in a way or another... Who says you can't have a libc "obscurelibc" that does something else there? A preprocessor macro branching on (#x == '"') or (sizeof(x) == sizeof(char*)) would be the less beautiful alternative, but I think I'm risking to be yelled back at that way. :-) cheers! mar77i
On 19/04/15 00:25, Martti Kühne wrote:
On Thu, Apr 16, 2015 at 9:36 PM, Florian Pritz <bluewind@xinu.at> wrote:
Comment changed, but not the code? If you know where that + 9 comes from it might be a good idea to replace it with strlen("whatever") which the compile should optimize out later, but writing it this way makes the whole thing a lot clearer.
Do the pacman developers trust the compiler with optimizing strlen()?
Yes - we let compliers do their job, which they are remarkably good at.
Because there's sizeof which does a perfectly predictable (off-by-one) job using but an operator for measuring the length of a string constant. Furthermore it does so without implying the compiler would sidestep the standard library's work in a way or another... Who says you can't have a libc "obscurelibc" that does something else there?
Many standards ensure that strlen() does what it should.
A preprocessor macro branching on (#x == '"') or (sizeof(x) == sizeof(char*)) would be the less beautiful alternative, but I think I'm risking to be yelled back at that way. :-)
The risk is strong!
On 18/04/15 01:31, David Macek wrote:
The change in commit 9d96bed9d6b57 causes download errors for the .db.sig file in case the final URL for the .db file contains query strings or other unexpected stuff. This commit isn't intended to be a total solution, but it should eliminate the problem in the most obvious cases. --- lib/libalpm/be_sync.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 20130dc..606c4a0 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -241,9 +241,16 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) unlink(sigpath); free(sigpath);
- /* if we downloaded a DB, we want the .sig from the same server - - this information is only available from the internal downloader */ - if(handle->fetchcb == NULL) { + + /* check if the final URL from internal downloader looks reasonable */ + if(final_db_url != NULL) { + if(strlen(final_db_url) < 3 || strcmp(final_db_url + strlen(final_db_url) - 3, ".db") != 0) { + final_db_url = NULL; + } + } + + /* if we downloaded a DB, we want the .sig from the same server */ + if(final_db_url != NULL) {
I am fairly certain this is OK... but there is a nagging feeling that I am missing something in the change: - if(handle->fetchcb == NULL) { + if(final_db_url != NULL) { @Dave: any chance you could take a very quick glance at this?
/* print final_db_url into a buffer (leave space for .sig) */ len = strlen(final_db_url) + 5; } else { @@ -254,7 +261,7 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) /* TODO fix leak syncpath and umask unset */ MALLOC(payload.fileurl, len, RET_ERR(handle, ALPM_ERR_MEMORY, -1));
- if(handle->fetchcb == NULL) { + if(final_db_url != NULL) { snprintf(payload.fileurl, len, "%s.sig", final_db_url); } else { snprintf(payload.fileurl, len, "%s/%s.db.sig", server, db->treename);
On 12/05/15 13:56, Allan McRae wrote:
On 18/04/15 01:31, David Macek wrote:
The change in commit 9d96bed9d6b57 causes download errors for the .db.sig file in case the final URL for the .db file contains query strings or other unexpected stuff. This commit isn't intended to be a total solution, but it should eliminate the problem in the most obvious cases. --- lib/libalpm/be_sync.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 20130dc..606c4a0 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -241,9 +241,16 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db) unlink(sigpath); free(sigpath);
- /* if we downloaded a DB, we want the .sig from the same server - - this information is only available from the internal downloader */ - if(handle->fetchcb == NULL) { + + /* check if the final URL from internal downloader looks reasonable */ + if(final_db_url != NULL) { + if(strlen(final_db_url) < 3 || strcmp(final_db_url + strlen(final_db_url) - 3, ".db") != 0) { + final_db_url = NULL; + } + } + + /* if we downloaded a DB, we want the .sig from the same server */ + if(final_db_url != NULL) {
I am fairly certain this is OK... but there is a nagging feeling that I am missing something in the change:
- if(handle->fetchcb == NULL) { + if(final_db_url != NULL) {
@Dave: any chance you could take a very quick glance at this?
Ignore that - the patch is fine. A
participants (4)
-
Allan McRae
-
David Macek
-
Florian Pritz
-
Martti Kühne