[pacman-dev] MD5/SHA* why?
I asked this question a while ago about makepkg now I'm asking about pacman... why do we need support for multiple checksum types? What's wrong with md5? Andrew
On Tuesday, 3 July 2007 21:40:17, Andrew Fyfe wrote:
I asked this question a while ago about makepkg now I'm asking about pacman... why do we need support for multiple checksum types? What's wrong with md5?
Andrew
It's broken ;) That may not matter much when it comes to package corruption checks, but it has certainly already been proven crackable. Also, I guess it wouldn't hurt to use both; most modern CPUs can handle it. And, when all else fails, there's the ground statement - everyone else's doing it! ;-) And by everyone else I mean ports/pkgsrc, as they are the only other package management systems I use.
Cheers, //m.
--
Mateusz Jędrasik <m.jedrasik@gmail.com> tel. +48(79)022-9393, +48(51)69-444-90 http://imachine.szklo.eu.org
On 7/3/07, Mateusz Jedrasik <m.jedrasik@gmail.com> wrote:
On Tuesday, 3 July 2007 21:40:17, Andrew Fyfe wrote:
I asked this question a while ago about makepkg now I'm asking about pacman... why do we need support for multiple checksum types? What's wrong with md5?
The problem with MD5 (and recently SHA1) is that you can find collisions relatively quickly on a powerful machine (under a day in some cases). Thus if you found the correct collision that actually was a valid tarball, that had valid files in it, and one of those files had something malicious in it, you would be in trouble. I mean, the chances are close to zero, but md5 has gotten a lot of press on how "crackable" it is. SHA1 is crackable as well, though not as easily.

Now put BOTH sums in your PKGBUILD. Now some third party would have to find all the collisions for MD5 and SHA1, make sure they create the same sums as those in the package, and then they would have to see if that was even any data that could be used for something malicious.

I suggest using both MD5 and SHA1. I seriously doubt there is a single situation where this would not be enough for validating the package.

Though I think we should move to signing our packages, so we actually have security along with validation...

// codemac
--
. : [ + carpe diem totus tuus + ] : .
2007/7/3, Jeff Mickey <jeff@archlinux.org>:
The problem with MD5 (and recently SHA1) is that you can find collisions relatively quickly on a powerful machine (under a day in some cases). Thus if you found the correct collision that actually was a valid tarball, that had valid files in it, and one of those files had something malicious in it, you would be in trouble. I mean, the chances are close to zero, but md5 has gotten a lot of press on how "crackable" it is. SHA1 is crackable as well, though not as easily.
Note what Jason said here: http://www.archlinux.org/pipermail/pacman-dev/2006-October/005990.html "Most of the ones I've seen talked about creating md5 collisions between two files, not creating a file with the same md5 as another file (there's a distinction)." Which of those two cases are the numbers you gave for? But even setting that aside, as you already said, it does look very unlikely that this could be exploited...
Now put BOTH sums in your PKGBUILD. Now some third party would have to find all the collisions for MD5 and SHA1, make sure they create the same sums as those in the package, and then they would have to see if that was even any data that could be used for something malicious.
I suggest using both MD5 and SHA1. I seriously doubt there is a single situation where this would not be enough for validating the package.
Heh, we already seriously doubt there is a single situation where MD5 wouldn't be enough, so what exactly does this add? If we are going to be completely paranoid, then why not use ONE algorithm that hasn't been cracked yet?
Though I think we should move to signing our packages, so we actually have security along with validation...
Now that's probably a better suggestion, and there is at least already a FR for it :) http://bugs.archlinux.org/task/5331
Mateusz Jedrasik wrote:
And, when all else fails, there's the ground statement - everyone else's doing it! ;-)
The worst excuse ever :p
Dan McGee wrote:
On 7/3/07, Andrew Fyfe <andrew@neptune-one.net> wrote:
I asked this question a while ago about makepkg now I'm asking about pacman... why do we need support for multiple checksum types? What's wrong with md5?
Frugalware switched to using sha1sums, so that is a big reason. If you can find a good reason to pull support for it, let us know, but as of right now I don't see a reason to remove it, even if we don't use it. Any other distribution using pacman may want to use it, so the option is there.
-Dan
My problem is more with the fact that we have 5 functions and 1 field in pmpkg_t for each checksum and we have to do

if checksum = x then x_foo();
else if checksum = y then y_foo();
else if checksum = z then z_foo();
fi

every time we want to do something with a checksum. For now I'll just treat it the same as makepkg... cause we can :D
Andrew
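The per-checksum if/else chain described above can be replaced by a table-driven dispatch. The following is a minimal sketch only, not pacman's code: `checksum_ops`, `find_checksum`, `backup_entry_len` and the two toy digest functions are hypothetical names invented for illustration, and the toy functions are stand-ins that do not compute real MD5 or SHA1. The length helper assumes the "filename, tab, hash, null byte" backup entry layout used in the patch later in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one checksum type (not part of libalpm). */
struct checksum_ops {
	const char *name;   /* tag used to select the algorithm */
	size_t hex_len;     /* length of the hex digest string */
	unsigned long (*digest)(const unsigned char *data, size_t len);
};

/* Toy stand-ins so the sketch compiles; these are NOT MD5 or SHA1. */
static unsigned long toy_sum(const unsigned char *data, size_t len)
{
	unsigned long acc = 0;
	size_t i;
	for(i = 0; i < len; i++) {
		acc += data[i];
	}
	return acc;
}

static unsigned long toy_xor(const unsigned char *data, size_t len)
{
	unsigned long acc = 0;
	size_t i;
	for(i = 0; i < len; i++) {
		acc ^= data[i];
	}
	return acc;
}

/* One table row per algorithm; adding a type means adding a row. */
static const struct checksum_ops checksum_table[] = {
	{ "md5",  32, toy_sum },
	{ "sha1", 40, toy_xor },
};

/* Look up an algorithm by name; returns NULL if it is unknown. */
static const struct checksum_ops *find_checksum(const char *name)
{
	size_t i;
	assert(name != NULL);
	for(i = 0; i < sizeof(checksum_table) / sizeof(checksum_table[0]); i++) {
		if(strcmp(checksum_table[i].name, name) == 0) {
			return &checksum_table[i];
		}
	}
	return NULL;
}

/* Bytes needed for a "filename<TAB>hash" entry plus the null byte. */
static size_t backup_entry_len(const char *filename,
		const struct checksum_ops *ops)
{
	assert(filename != NULL && ops != NULL);
	return strlen(filename) + 1 + ops->hex_len + 1;
}
```

With a table like this, callers would do one lookup and then call through `ops->digest`, so the type-specific branches live in a single place.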
Oh no, when reading the archives, I forgot to bookmark several important mails, and it took me a while to find this one again: http://www.archlinux.org/pipermail/pacman-dev/2006-October/006029.html So that's Judd's opinion on the matter: "I never pretended that md5 was for anything security-related. If we were trying for security, we would've gone straight to signed packages. The md5sum was added to make sure downloaded files weren't corrupt. I don't see the point of SHA1 if we're still using it/them for download validation. If we want security, then we might as well do it right." As for my opinion on this, it's exactly the same as Andrew's: it complicates the code for 0 benefit...
Xavier wrote:
Oh no, when reading the archives, I forgot to bookmark several important mails, and it took me a while to find this one again: http://www.archlinux.org/pipermail/pacman-dev/2006-October/006029.html So that's Judd's opinion on the matter: "I never pretended that md5 was for anything security-related. If we were trying for security, we would've gone straight to signed packages. The md5sum was added to make sure downloaded files weren't corrupt.
I don't see the point of SHA1 if we're still using it/them for download validation. If we want security, then we might as well do it right."
As for my opinion on this, it's exactly the same as Andrew's: it complicates the code for 0 benefit...
I fully agree with Judd's comment: using MD5 or SHA1 for security is plain stupid. All we want a checksum for is a basic check that the package we've downloaded isn't corrupt. What are the odds you could download a corrupt package with the same checksum as the valid package? My preference would be to stick with 1 checksum (preferably MD5, as that's what's mainly used in Arch at the moment), and remove the other to simplify the code.... K.I.S.S.
Andrew
On 7/4/07, Andrew Fyfe <andrew@neptune-one.net> wrote:
I fully agree with Judd's comment: using MD5 or SHA1 for security is plain stupid. All we want a checksum for is a basic check that the package we've downloaded isn't corrupt. What are the odds you could download a corrupt package with the same checksum as the valid package?
My preference would be to stick with 1 checksum (preferably MD5 as that's what's mainly used in Arch at the moment), and remove the other to simplify the code.... K.I.S.S.
Patches welcome for this. If anyone wants to start looking into package signing as well, then more power to you. I also dislike the fact that we have 3 different files for the md5 stuff- md5driver.c, md5.c, and md5.h. We should be able to move this code all into a C file and header, md5.c and md5.h, without difficulties. Make this a separate patch though. -Dan
On Wed, Jul 04, 2007 at 11:46:49PM -0400, Dan McGee wrote:
Patches welcome for this. If anyone wants to start looking into package signing as well, then more power to you.
I also dislike the fact that we have 3 different files for the md5 stuff- md5driver.c, md5.c, and md5.h. We should be able to move this code all into a C file and header, md5.c and md5.h, without difficulties. Make this a separate patch though.
I was the main person pushing for this and it was mostly for the malicious downloads. It's not the package downloading that I was worried about as much as the source tarballs. We use md5sums to make sure that the tarball we downloaded when building the package is the same as the tarball that the developer used when they built the package. If someone gets access to the upstream's server, we're using the md5sum to trust files over time.

I had long discussions with Aaron about this. He still wasn't convinced but added it because it didn't hurt. Eventually we decided that the best bet was to store source packages on the arch servers, because then we could trust those. That just hasn't happened yet.

Obviously there are way more people who think this is dumb than people who agree with me.

I wrote the original patch for this (makepkg only) during LinuxTag 2005 after JGC mentioned that BSD uses two hashes. I thought their reasoning was sound so I wanted to do it too.
Jason
On Thu, Jul 05, 2007 at 02:06:09PM -0700, Jason Chu wrote:
I was the main person pushing for this and it was mostly for the malicious downloads.
It's not the package downloading that I was worried about as much as the source tarballs. We use md5sums to make sure that the tarball we downloaded when building the package is the same as the tarball that the developer used when they built the package. If someone gets access to the upstream's server, we're using the md5sum to trust files over time.
Oh I see. But what I am really wondering is why combine two existing algorithms that have flaws instead of using one for which no flaw has been found yet? Isn't it both less secure and more complicated?
On Fri, Jul 06, 2007 at 12:20:00AM +0200, Xavier wrote:
Oh I see. But what I am really wondering is why combine two existing algorithms that have flaws instead of using one for which no flaw has been found yet? Isn't it both less secure and more complicated?
We are at an inroads in hashing algorithm theory. All the current hashing algorithms have flaws. It's also likely that any new hash algorithms will have flaws as well. If we just trusted md5s or sha1s, then it would be less secure and more complicated, but it is because we look at both md5s and sha1s *together* that things improve.

An analogy: think of two sheets with holes in them. You can look through each sheet and see the light on the other side, but if you lay the two sheets on top of each other, a lot less light is visible. Because we're considering both hashing algorithms, they cover some of each other's failings.

I'm all for making less complication though... maybe a more abstract hash API?
Jason
On Thu, Jul 05, 2007 at 03:42:42PM -0700, Jason Chu wrote:
We are at an inroads in hashing algorithm theory. All the current hashing algorithms have flaws. It's also likely that any new hash algorithms will have flaws as well.
Maybe the information I had is already outdated, since all this stuff moves pretty quickly :) What are the flaws of the SHA-224/256/384/512 hashes? See this for example: http://en.wikipedia.org/wiki/SHA-1#SHA_sizes Or are these the new algorithms? They could indeed have flaws as well, but still stay more secure than the current ones, even after flaws are found.
If we just trusted md5s or sha1s, then it would be less secure and more complicated, but it is because we look at both md5s and sha1s *together* that things improve.
I'm not convinced that 1) md5 or sha1 alone isn't secure enough (for our use case), or that 2) combining md5 and sha1 is better than e.g. SHA-256.
An analogy, think of two sheets with holes in them. You can look through each sheet and see the light on the other side, but if you lay the two sheets on top of each other a lot less light is visible. Because we're considering both hashing algorithms they cover some of the other's failings.
In that case, you move both holes so that they match (with padding) :) But yes, that's still the general case, not pacman's.
I'm all for making less complication though... maybe a more abstract hash API?
If we need to keep several hashing algorithms, I think this would be great.
On 7/5/07, Xavier <shiningxc@gmail.com> wrote:
Oh I see. But what I am really wondering is why combine two existing algorithms that have flaws instead of using one for which no flaw has been found yet? Isn't it both less secure and more complicated?
<offtopic>
Every possible hashing algorithm has flaws. However, they need to be exploitable for them to be of any use. Just because a flaw hasn't been found doesn't mean there isn't one. And I think the very important point was missed above in all these emails- creating a useful 'flaw' is not easy at all.

Let's first use the example of one hashing function, such as MD5. First, you have the original file's hash. In all flaw finding exercises, this is not the case- all they look for is for two hashes to match, not trying to match one to a preexisting hash. So we are already off to a hard challenge here. Say you are able to find some other junk data that hashes to the same value. Sorry- that is worthless. You need valid data.

Now add a second hash. You will need a *double* collision of data- one where *both* hashes are the same for the valid data and the malicious data. I dare to say impossible.
</offtopic>

One hash being more secure? Doubtful. Maybe about the same. One hash being less complicated? Do you like dealing with 80 character strings?
-Dan
On Thu, Jul 05, 2007 at 06:45:00PM -0400, Dan McGee wrote:
<offtopic> Every possible hashing algorithm has flaws. However, they need to be exploitable for them to be of any use. Just because a flaw hasn't been found doesn't mean there isn't one. And I think the very important point was missed above in all these emails- creating a useful 'flaw' is not easy at all.
What do you mean? A hashing algorithm is supposed to provide a certain level of security. When flaws are found, it means it no longer provides even that initial level, but a lower one; for example, finding collisions will require less computing power than was initially expected. Afaik, flaws were found for MD5, SHA-0 and SHA-1, but not for any of the SHA-* successors. Obviously, that doesn't mean none will be found in the future, but they are still probably better. Btw, Jeff already mentioned that exploiting it would be very hard: http://www.archlinux.org/pipermail/pacman-dev/2007-July/008675.html
Let's first use the example of one hashing function, such as MD5. First, you have the original file's hash. In all flaw finding exercises, this is not the case- all they look for is for two hashes to match, not trying to match one to a preexisting hash.
Jason already mentioned that in an old mail (that the usual case is looking for two hashes to match), and I gave a link to it in my answer to Jeff's mail here: http://www.archlinux.org/pipermail/pacman-dev/2007-July/008691.html And if the flaws that have been found for MD5 or SHA1 only concern collisions, then they shouldn't even affect our use case.
So we are already off to a hard challenge here. Say you are able to find some other junk data that hashes to the same value. Sorry- that is worthless. You need valid data.
See above for the link to Jeff's mail. I would agree too, but in the end I'm missing some technical knowledge, for example about what makes a valid tarball, and whether there is a way to somehow hide additional junk data or something. But anyway, I doubt this is the weakest point of Arch, or the first thing a potential attacker would try to exploit.
Now add a second hash. You will need a *double* collision of data- one where *both* hashes are the same for the valid data and the malicious data. I dare to say impossible. </offtopic>
So did we just find a new magic perfectly secure hash algorithm, better than all the existing ones, that no one has ever thought of before? Of course, it'll be more secure than using just one hash; the question is how it compares to another hash using the same number of bits (128 + 160), or even fewer, like SHA-256. I just did some searching, which seems to suggest combining SHA1 and MD5 is indeed a poor idea for a hash algorithm. But that's still without the constraint of having a valid tarball etc., and for finding collisions, not a second preimage. But again, since we have all these additional constraints, md5 or sha1 alone may very well already be highly secure, and as I said above, our use cases might be totally unaffected by the flaws that have been found (which apparently only concern collisions).
One hash being more secure? Doubtful. Maybe about the same.
I personally have no doubt :) They don't even have the same size. Suppose you use a hash of only 1 bit: the result for the original package is either 0 or 1. Then you pick any compromised package, and you have a 1/2 chance it'll match :) So the bigger, the more secure (as long as it doesn't have flaws).
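To put rough numbers on the 1-bit argument: for an idealized n-bit hash whose outputs are uniformly distributed, the chance that one given other file happens to match a fixed digest is 2^-n. The sketch below just evaluates that figure; `match_probability` is a hypothetical helper name, and the idealized model is an assumption that ignores any structural weakness in a real algorithm.

```c
#include <assert.h>

/* Chance that one arbitrary file matches a fixed digest of an
 * idealized hash with `bits` output bits, i.e. 2^-bits.
 * Hypothetical helper for illustration; repeated halving is exact
 * in a double for the sizes discussed here. */
static double match_probability(unsigned int bits)
{
	double p = 1.0;
	unsigned int i;
	assert(bits <= 1000); /* stay inside the normal range of a double */
	for(i = 0; i < bits; i++) {
		p *= 0.5;
	}
	return p;
}
```

Under that model, `match_probability(1)` is 0.5 (the 1-bit case), while 128 bits (the MD5 digest size) gives roughly 2.9e-39 and 160 bits (SHA1) roughly 6.8e-49.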
One hash being less complicated? Do you like dealing with 80 character strings?
As much as I like dealing with 2 hashes of 40 characters each, not to mention that it would make the code a bit cleaner in some places, because there is only one check to make instead of two. Anyway, I'm not suggesting we move to a stronger hash; it's actually the opposite: I'm suggesting we keep either md5 or sha1, unless someone can prove using only one isn't secure enough :)
There's no need for a second hashing algorithm. MD5 serves the purpose of
verifying that a package file hasn't been corrupted during download.

Signed-off-by: Andrew Fyfe <andrew@neptune-one.net>
---
 contrib/vimproject      |    2 -
 lib/libalpm/Makefile.am |    1 -
 lib/libalpm/add.c       |   37 +----
 lib/libalpm/alpm.h      |    3 -
 lib/libalpm/backup.c    |    6 +-
 lib/libalpm/be_files.c  |   11 +-
 lib/libalpm/package.c   |   65 -------
 lib/libalpm/package.h   |    2 -
 lib/libalpm/remove.c    |    1 -
 lib/libalpm/sha1.c      |  431 -----------------------------------------------
 lib/libalpm/sha1.h      |   72 --------
 lib/libalpm/sync.c      |   20 +--
 src/pacman/package.c    |   21 +--
 13 files changed, 22 insertions(+), 650 deletions(-)

diff --git a/contrib/vimproject b/contrib/vimproject
index f54c6c1..b9bd7a4 100644
--- a/contrib/vimproject
+++ b/contrib/vimproject
@@ -29,7 +29,6 @@ pacman=~/devel/pacman-lib CD=. flags=S {
 	provide.c
 	remove.c
 	server.c
-	sha1.c
 	sync.c
 	trans.c
 	util.c
@@ -50,7 +49,6 @@ pacman=~/devel/pacman-lib CD=. flags=S {
 	provide.h
 	remove.h
 	server.h
-	sha1.h
 	sync.h
 	trans.h
 	util.h
diff --git a/lib/libalpm/Makefile.am b/lib/libalpm/Makefile.am
index c5276f9..fd35426 100644
--- a/lib/libalpm/Makefile.am
+++ b/lib/libalpm/Makefile.am
@@ -29,7 +29,6 @@ libalpm_la_SOURCES = \
 	provide.h provide.c \
 	remove.h remove.c \
 	server.h server.c \
-	sha1.h sha1.c \
 	sync.h sync.c \
 	trans.h trans.c \
 	util.h util.c
diff --git a/lib/libalpm/add.c b/lib/libalpm/add.c
index e532304..7a6446c 100644
--- a/lib/libalpm/add.c
+++ b/lib/libalpm/add.c
@@ -42,7 +42,6 @@
 #include "error.h"
 #include "cache.h"
 #include "md5.h"
-#include "sha1.h"
 #include "log.h"
 #include "backup.h"
 #include "package.h"
@@ -336,7 +335,6 @@ static int extract_single_file(struct archive *archive,
 	char filename[PATH_MAX]; /* the actual file we're extracting */
 	int needbackup = 0, notouch = 0;
 	char *hash_orig = NULL;
-	int use_md5 = 0;
 	const int archive_flags = ARCHIVE_EXTRACT_OWNER |
 	                          ARCHIVE_EXTRACT_PERM |
 	                          ARCHIVE_EXTRACT_TIME;
@@ -485,10 +483,6 @@ static int extract_single_file(struct archive *archive,
 		/* case 5,8: don't need to do anything special */
 	}
 
-	if(strlen(newpkg->sha1sum) == 0) {
-		use_md5 = 1;
-	}
-
 	if(needbackup) {
 		char *tempfile = NULL;
 		char *hash_local = NULL, *hash_pkg = NULL;
@@ -516,15 +510,10 @@ static int extract_single_file(struct archive *archive,
 			return(1);
 		}
 
-		if(use_md5) {
-			hash_local = _alpm_MDFile(filename);
-			hash_pkg = _alpm_MDFile(tempfile);
-		} else {
-			hash_local = _alpm_SHAFile(filename);
-			hash_pkg = _alpm_SHAFile(tempfile);
-		}
+		hash_local = _alpm_MDFile(filename);
+		hash_pkg = _alpm_MDFile(tempfile);
 
-		/* append the new md5 or sha1 hash to it's respective entry
+		/* append the new md5 hash to it's respective entry
 		 * in newpkg's backup (it will be the new orginal) */
 		alpm_list_t *backups;
 		for(backups = alpm_pkg_get_backup(newpkg); backups;
@@ -534,14 +523,7 @@ static int extract_single_file(struct archive *archive,
 				return(0);
 			}
 			char *backup = NULL;
-			int backup_len = strlen(oldbackup) + 2; /* tab char and null byte */
-
-			if(use_md5) {
-				backup_len += 32; /* MD5s are 32 chars in length */
-			} else {
-				backup_len += 40; /* SHA1s are 40 chars in length */
-			}
-
+			int backup_len = strlen(oldbackup) + 34; /* tab char, null byte and MD5 (32 char) */
 			backup = malloc(backup_len);
 			if(!backup) {
 				RET_ERR(PM_ERR_MEMORY, -1);
@@ -673,21 +655,14 @@ static int extract_single_file(struct archive *archive,
 	for(b = alpm_pkg_get_backup(newpkg); b; b = b->next) {
 		char *backup = NULL, *hash = NULL;
 		char *oldbackup = alpm_list_getdata(b);
-		int backup_len = strlen(oldbackup) + 2; /* tab char and null byte */
+		int backup_len = strlen(oldbackup) + 34; /* tab char, null byte and MD5 (32 char) */
 
 		if(!oldbackup || strcmp(oldbackup, entryname) != 0) {
 			return(0);
 		}
 		_alpm_log(PM_LOG_DEBUG, "appending backup entry for %s", filename);
 
-		if(use_md5) {
-			backup_len += 32; /* MD5s are 32 chars in length */
-			hash = _alpm_MDFile(filename);
-		} else {
-			backup_len += 40; /* SHA1s are 40 chars in length */
-			hash = _alpm_SHAFile(filename);
-		}
-
+		hash = _alpm_MDFile(filename);
 		backup = malloc(backup_len);
 		if(!backup) {
 			RET_ERR(PM_ERR_MEMORY, -1);
diff --git a/lib/libalpm/alpm.h b/lib/libalpm/alpm.h
index 9e641f3..a700f75 100644
--- a/lib/libalpm/alpm.h
+++ b/lib/libalpm/alpm.h
@@ -187,7 +187,6 @@ typedef enum _pmpkghasarch_t {
 int alpm_pkg_load(const char *filename, pmpkg_t **pkg);
 int alpm_pkg_free(pmpkg_t *pkg);
 int alpm_pkg_checkmd5sum(pmpkg_t *pkg);
-int alpm_pkg_checksha1sum(pmpkg_t *pkg);
 char *alpm_fetch_pkgurl(const char *url);
 int alpm_pkg_vercmp(const char *ver1, const char *ver2);
 char *alpm_pkg_name_hasarch(const char *pkgname);
@@ -202,7 +201,6 @@ const char *alpm_pkg_get_buildtype(pmpkg_t *pkg);
 const char *alpm_pkg_get_installdate(pmpkg_t *pkg);
 const char *alpm_pkg_get_packager(pmpkg_t *pkg);
 const char *alpm_pkg_get_md5sum(pmpkg_t *pkg);
-const char *alpm_pkg_get_sha1sum(pmpkg_t *pkg);
 const char *alpm_pkg_get_arch(pmpkg_t *pkg);
 unsigned long alpm_pkg_get_size(pmpkg_t *pkg);
 unsigned long alpm_pkg_get_isize(pmpkg_t *pkg);
@@ -381,7 +379,6 @@ const char *alpm_conflict_get_ctarget(pmconflict_t *conflict);
 
 /* checksums */
 char *alpm_get_md5sum(char *name);
-char *alpm_get_sha1sum(char *name);
 
 /*
  * Errors
diff --git a/lib/libalpm/backup.c b/lib/libalpm/backup.c
index ffd4508..9cdb794 100644
--- a/lib/libalpm/backup.c
+++ b/lib/libalpm/backup.c
@@ -34,7 +34,7 @@
 #include "util.h"
 
 /* Look for a filename in a pmpkg_t.backup list. If we find it,
- * then we return the md5 or sha1 hash (parsed from the same line)
+ * then we return the md5 hash (parsed from the same line)
  */
 char *_alpm_needbackup(const char *file, const alpm_list_t *backup)
 {
@@ -46,7 +46,7 @@ char *_alpm_needbackup(const char *file, const alpm_list_t *backup)
 		return(NULL);
 	}
 
-	/* run through the backup list and parse out the md5 or sha1 hash for our file */
+	/* run through the backup list and parse out the md5 hash for our file */
 	for(lp = backup; lp; lp = lp->next) {
 		char *str = strdup(lp->data);
 		char *ptr;
@@ -59,7 +59,7 @@ char *_alpm_needbackup(const char *file, const alpm_list_t *backup)
 		}
 		*ptr = '\0';
 		ptr++;
-		/* now str points to the filename and ptr points to the md5 or sha1 hash */
+		/* now str points to the filename and ptr points to the md5 hash */
 		if(strcmp(file, str) == 0) {
 			char *hash = strdup(ptr);
 			FREE(str);
diff --git a/lib/libalpm/be_files.c b/lib/libalpm/be_files.c
index ea00563..b3d06a9 100644
--- a/lib/libalpm/be_files.c
+++ b/lib/libalpm/be_files.c
@@ -391,12 +391,6 @@ int _alpm_db_read(pmdb_t *db, pmpkg_t *info, pmdbinfrq_t inforeq)
 				}
 				_alpm_strtrim(tmp);
 				info->isize = atol(tmp);
-			} else if(!strcmp(line, "%SHA1SUM%")) {
-				/* SHA1SUM tag only appears in sync repositories,
-				 * not the local one. */
-				if(fgets(info->sha1sum, sizeof(info->sha1sum), fp) == NULL) {
-					goto error;
-				}
 			} else if(!strcmp(line, "%MD5SUM%")) {
 				/* MD5SUM tag only appears in sync repositories,
 				 * not the local one. */
@@ -607,10 +601,7 @@ int _alpm_db_write(pmdb_t *db, pmpkg_t *info, pmdbinfrq_t inforeq)
 			fprintf(fp, "%%ISIZE%%\n"
 							"%lu\n\n", info->isize);
 		}
-		if(info->sha1sum) {
-			fprintf(fp, "%%SHA1SUM%%\n"
-							"%s\n\n", info->sha1sum);
-		} else if(info->md5sum) {
+		if(info->md5sum) {
 			fprintf(fp, "%%MD5SUM%%\n"
 							"%s\n\n", info->md5sum);
 		}
diff --git a/lib/libalpm/package.c b/lib/libalpm/package.c
index d5eca20..598979e 100644
--- a/lib/libalpm/package.c
+++ b/lib/libalpm/package.c
@@ -95,57 +95,6 @@ int SYMEXPORT alpm_pkg_free(pmpkg_t *pkg)
 	return(0);
 }
 
-/** Check the integrity (with sha1) of a package from the sync cache.
- * @param pkg package pointer
- * @return 0 on success, -1 on error (pm_errno is set accordingly)
- */
-int SYMEXPORT alpm_pkg_checksha1sum(pmpkg_t *pkg)
-{
-	char path[PATH_MAX];
-	struct stat buf;
-	char *sha1sum = NULL;
-	alpm_list_t *i;
-	int retval = 0;
-
-	ALPM_LOG_FUNC;
-
-	ASSERT(pkg != NULL, RET_ERR(PM_ERR_WRONG_ARGS, -1));
-	/* We only inspect packages from sync repositories */
-	ASSERT(pkg->origin == PKG_FROM_CACHE, RET_ERR(PM_ERR_PKG_INVALID, -1));
-	ASSERT(pkg->data != handle->db_local, RET_ERR(PM_ERR_PKG_INVALID, -1));
-
-	/* Loop through the cache dirs until we find a matching file */
-	for(i = alpm_option_get_cachedirs(); i; i = alpm_list_next(i)) {
-		snprintf(path, PATH_MAX, "%s%s-%s" PKGEXT, (char*)alpm_list_getdata(i),
-				alpm_pkg_get_name(pkg), alpm_pkg_get_version(pkg));
-		if(stat(path, &buf) == 0) {
-			break;
-		}
-	}
-
-	sha1sum = alpm_get_sha1sum(path);
-	if(sha1sum == NULL) {
-		_alpm_log(PM_LOG_ERROR, _("could not get sha1sum for package %s-%s"),
-				alpm_pkg_get_name(pkg), alpm_pkg_get_version(pkg));
-		pm_errno = PM_ERR_NOT_A_FILE;
-		retval = -1;
-	} else {
-		if(strcmp(sha1sum, alpm_pkg_get_sha1sum(pkg)) == 0) {
-			_alpm_log(PM_LOG_DEBUG, "sha1sums for package %s-%s match",
-					alpm_pkg_get_name(pkg), alpm_pkg_get_version(pkg));
-		} else {
-			_alpm_log(PM_LOG_ERROR, _("sha1sums do not match for package %s-%s"),
-					alpm_pkg_get_name(pkg), alpm_pkg_get_version(pkg));
-			pm_errno = PM_ERR_PKG_INVALID;
-			retval = -1;
-		}
-	}
-
-	FREE(sha1sum);
-
-	return(retval);
-}
-
 /** Check the integrity (with md5) of a package from the sync cache.
  * @param pkg package pointer
  * @return 0 on success, -1 on error (pm_errno is set accordingly)
@@ -392,20 +341,6 @@ const char SYMEXPORT *alpm_pkg_get_md5sum(pmpkg_t *pkg)
 	return pkg->md5sum;
 }
 
-const char SYMEXPORT *alpm_pkg_get_sha1sum(pmpkg_t *pkg)
-{
-	ALPM_LOG_FUNC;
-
-	/* Sanity checks */
-	ASSERT(handle != NULL, return(NULL));
-	ASSERT(pkg != NULL, return(NULL));
-
-	if(pkg->origin == PKG_FROM_CACHE && !(pkg->infolevel & INFRQ_DESC)) {
-		_alpm_db_read(pkg->data, pkg, INFRQ_DESC);
-	}
-	return pkg->sha1sum;
-}
-
 const char SYMEXPORT *alpm_pkg_get_arch(pmpkg_t *pkg)
 {
 	ALPM_LOG_FUNC;
diff --git a/lib/libalpm/package.h b/lib/libalpm/package.h
index f704ab9..f6cb3d0 100644
--- a/lib/libalpm/package.h
+++ b/lib/libalpm/package.h
@@ -46,7 +46,6 @@ typedef enum _pmpkgfrom_t {
 #define PKG_TYPE_LEN 32
 #define PKG_PACKAGER_LEN 64
 #define PKG_MD5SUM_LEN 33
-#define PKG_SHA1SUM_LEN 41
 #define PKG_ARCH_LEN 32
 
 struct __pmpkg_t {
@@ -59,7 +58,6 @@ struct __pmpkg_t {
 	char installdate[PKG_DATE_LEN];
 	char packager[PKG_PACKAGER_LEN];
 	char md5sum[PKG_MD5SUM_LEN];
-	char sha1sum[PKG_SHA1SUM_LEN];
 	char arch[PKG_ARCH_LEN];
 	unsigned long size;
 	unsigned long isize;
diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c
index 33c122d..22d209a 100644
--- a/lib/libalpm/remove.c
+++ b/lib/libalpm/remove.c
@@ -40,7 +40,6 @@
 #include "util.h"
 #include "error.h"
 #include "md5.h"
-#include "sha1.h"
 #include "log.h"
 #include "backup.h"
 #include "package.h"
diff --git a/lib/libalpm/sha1.c b/lib/libalpm/sha1.c
deleted file mode 100644
index a164a89..0000000
--- a/lib/libalpm/sha1.c
+++ /dev/null
@@ -1,431 +0,0 @@
-/* sha.c - Functions to compute SHA1 message digest of files or
-   memory blocks according to the NIST specification FIPS-180-1.
-
-   Copyright (C) 2000, 2001, 2003 Free Software Foundation, Inc.
-
-   This program is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published by the
-   Free Software Foundation; either version 2, or (at your option) any
-   later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software Foundation,
-   Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.  */
-
-/* Written by Scott G. Miller
-   Credits:
-      Robert Klep <robert@ilse.nl>  -- Expansion function fix
-*/
-
-#include "config.h"
-
-#include <sys/types.h>
-#include <stdlib.h>
-#include <string.h>
-
-/* libalpm */
-#include "sha1.h"
-#include "alpm.h"
-#include "log.h"
-#include "util.h"
-
-/*
-  Not-swap is a macro that does an endian swap on architectures that are
-  big-endian, as SHA needs some data in a little-endian format
-*/
-
-#ifdef WORDS_BIGENDIAN
-# define NOTSWAP(n) (n)
-# define SWAP(n) \
-    (((n) << 24) | (((n) & 0xff00) << 8) | (((n) >> 8) & 0xff00) | ((n) >> 24))
-#else
-# define NOTSWAP(n) \
-    (((n) << 24) | (((n) & 0xff00) << 8) | (((n) >> 8) & 0xff00) | ((n) >> 24))
-# define SWAP(n) (n)
-#endif
-
-#define BLOCKSIZE 4096
-/* Ensure that BLOCKSIZE is a multiple of 64.  */
-#if BLOCKSIZE % 64 != 0
-/* FIXME-someday (soon?): use #error instead of this kludge.  */
-"invalid BLOCKSIZE"
-#endif
-
-/* This array contains the bytes used to pad the buffer to the next
-   64-byte boundary.  (RFC 1321, 3.1: Step 1)  */
-static const unsigned char fillbuf[64] = { 0x80, 0 /* , 0, 0, ...  */ };
-
-
-/* Starting with the result of former calls of this function (or the
-   initialization function update the context for the next LEN bytes
-   starting at BUFFER.
-   It is necessary that LEN is a multiple of 64!!! */
-static void sha_process_block (const void *buffer, size_t len,
-                               struct sha_ctx *ctx);
-
-/* Starting with the result of former calls of this function (or the
-   initialization function update the context for the next LEN bytes
-   starting at BUFFER.
-   It is NOT required that LEN is a multiple of 64.  */
-static void sha_process_bytes (const void *buffer, size_t len,
-                               struct sha_ctx *ctx);
-
-/* Put result from CTX in first 20 bytes following RESBUF.  The result is
-   always in little endian byte order, so that a byte-wise output yields
-   to the wanted ASCII representation of the message digest.
-
-   IMPORTANT: On some systems it is required that RESBUF is correctly
-   aligned for a 32 bits value.  */
-static void *sha_read_ctx (const struct sha_ctx *ctx, void *resbuf);
-
-/*
-  Takes a pointer to a 160 bit block of data (five 32 bit ints) and
-  intializes it to the start constants of the SHA1 algorithm.  This
-  must be called before using hash in the call to sha_hash
-*/
-static void
-sha_init_ctx (struct sha_ctx *ctx)
-{
-  ctx->A = 0x67452301;
-  ctx->B = 0xefcdab89;
-  ctx->C = 0x98badcfe;
-  ctx->D = 0x10325476;
-  ctx->E = 0xc3d2e1f0;
-
-  ctx->total[0] = ctx->total[1] = 0;
-  ctx->buflen = 0;
-}
-
-/* Put result from CTX in first 20 bytes following RESBUF.  The result
-   must be in little endian byte order.
-
-   IMPORTANT: On some systems it is required that RESBUF is correctly
-   aligned for a 32 bits value.  */
-static void *
-sha_read_ctx (const struct sha_ctx *ctx, void *resbuf)
-{
-  ((sha_uint32 *) resbuf)[0] = NOTSWAP (ctx->A);
-  ((sha_uint32 *) resbuf)[1] = NOTSWAP (ctx->B);
-  ((sha_uint32 *) resbuf)[2] = NOTSWAP (ctx->C);
-  ((sha_uint32 *) resbuf)[3] = NOTSWAP (ctx->D);
-  ((sha_uint32 *) resbuf)[4] = NOTSWAP (ctx->E);
-
-  return resbuf;
-}
-
-/* Process the remaining bytes in the internal buffer and the usual
-   prolog according to the standard and write the result to RESBUF.
-
-   IMPORTANT: On some systems it is required that RESBUF is correctly
-   aligned for a 32 bits value.  */
-static void *
-sha_finish_ctx (struct sha_ctx *ctx, void *resbuf)
-{
-  /* Take yet unprocessed bytes into account.  */
-  sha_uint32 bytes = ctx->buflen;
-  size_t pad;
-
-  /* Now count remaining bytes.  */
-  ctx->total[0] += bytes;
-  if (ctx->total[0] < bytes)
-    ++ctx->total[1];
-
-  pad = bytes >= 56 ? 64 + 56 - bytes : 56 - bytes;
-  memcpy (&ctx->buffer[bytes], fillbuf, pad);
-
-  /* Put the 64-bit file length in *bits* at the end of the buffer.  */
-  *(sha_uint32 *) &ctx->buffer[bytes + pad + 4] = NOTSWAP (ctx->total[0] << 3);
-  *(sha_uint32 *) &ctx->buffer[bytes + pad] = NOTSWAP ((ctx->total[1] << 3) |
-                                                    (ctx->total[0] >> 29));
-
-  /* Process last bytes.  */
-  sha_process_block (ctx->buffer, bytes + pad + 8, ctx);
-
-  return sha_read_ctx (ctx, resbuf);
-}
-
-static void
-sha_process_bytes (const void *buffer, size_t len, struct sha_ctx *ctx)
-{
-  /* When we already have some bits in our internal buffer concatenate
-     both inputs first.  */
-  if (ctx->buflen != 0)
-    {
-      size_t left_over = ctx->buflen;
-      size_t add = 128 - left_over > len ? len : 128 - left_over;
-
-      memcpy (&ctx->buffer[left_over], buffer, add);
-      ctx->buflen += add;
-
-      if (ctx->buflen > 64)
-        {
-          sha_process_block (ctx->buffer, ctx->buflen & ~63, ctx);
-
-          ctx->buflen &= 63;
-          /* The regions in the following copy operation cannot overlap.
*/ - memcpy (ctx->buffer, &ctx->buffer[(left_over + add) & ~63], - ctx->buflen); - } - - buffer = (const char *) buffer + add; - len -= add; - } - - /* Process available complete blocks. */ - if (len >= 64) - { -#if !_STRING_ARCH_unaligned -/* To check alignment gcc has an appropriate operator. Other - compilers don't. */ -# if __GNUC__ >= 2 -# define UNALIGNED_P(p) (((sha_uintptr) p) % __alignof__ (sha_uint32) != 0) -# else -# define UNALIGNED_P(p) (((sha_uintptr) p) % sizeof (sha_uint32) != 0) -# endif - if (UNALIGNED_P (buffer)) - while (len > 64) - { - sha_process_block (memcpy (ctx->buffer, buffer, 64), 64, ctx); - buffer = (const char *) buffer + 64; - len -= 64; - } - else -#endif - { - sha_process_block (buffer, len & ~63, ctx); - buffer = (const char *) buffer + (len & ~63); - len &= 63; - } - } - - /* Move remaining bytes in internal buffer. */ - if (len > 0) - { - size_t left_over = ctx->buflen; - - memcpy (&ctx->buffer[left_over], buffer, len); - left_over += len; - if (left_over >= 64) - { - sha_process_block (ctx->buffer, 64, ctx); - left_over -= 64; - memcpy (ctx->buffer, &ctx->buffer[64], left_over); - } - ctx->buflen = left_over; - } -} - -/* --- Code below is the primary difference between md5.c and sha.c --- */ - -/* SHA1 round constants */ -#define K1 0x5a827999L -#define K2 0x6ed9eba1L -#define K3 0x8f1bbcdcL -#define K4 0xca62c1d6L - -/* Round functions. Note that F2 is the same as F4. */ -#define F1(B,C,D) ( D ^ ( B & ( C ^ D ) ) ) -#define F2(B,C,D) (B ^ C ^ D) -#define F3(B,C,D) ( ( B & C ) | ( D & ( B | C ) ) ) -#define F4(B,C,D) (B ^ C ^ D) - -/* Process LEN bytes of BUFFER, accumulating context into CTX. - It is assumed that LEN % 64 == 0. - Most of this code comes from GnuPG's cipher/sha1.c. 
*/ - -static void -sha_process_block (const void *buffer, size_t len, struct sha_ctx *ctx) -{ - const sha_uint32 *words = buffer; - size_t nwords = len / sizeof (sha_uint32); - const sha_uint32 *endp = words + nwords; - sha_uint32 x[16]; - sha_uint32 a = ctx->A; - sha_uint32 b = ctx->B; - sha_uint32 c = ctx->C; - sha_uint32 d = ctx->D; - sha_uint32 e = ctx->E; - - /* First increment the byte count. RFC 1321 specifies the possible - length of the file up to 2^64 bits. Here we only compute the - number of bytes. Do a double word increment. */ - ctx->total[0] += len; - if (ctx->total[0] < len) - ++ctx->total[1]; - -#define M(I) ( tm = x[I&0x0f] ^ x[(I-14)&0x0f] \ - ^ x[(I-8)&0x0f] ^ x[(I-3)&0x0f] \ - , (x[I&0x0f] = rol(tm, 1)) ) - -#define R(A,B,C,D,E,F,K,M) do { E += rol( A, 5 ) \ - + F( B, C, D ) \ - + K \ - + M; \ - B = rol( B, 30 ); \ - } while(0) - - while (words < endp) - { - sha_uint32 tm; - int t; - /* FIXME: see sha1.c for a better implementation. */ - for (t = 0; t < 16; t++) - { - x[t] = NOTSWAP (*words); - words++; - } - - R( a, b, c, d, e, F1, K1, x[ 0] ); - R( e, a, b, c, d, F1, K1, x[ 1] ); - R( d, e, a, b, c, F1, K1, x[ 2] ); - R( c, d, e, a, b, F1, K1, x[ 3] ); - R( b, c, d, e, a, F1, K1, x[ 4] ); - R( a, b, c, d, e, F1, K1, x[ 5] ); - R( e, a, b, c, d, F1, K1, x[ 6] ); - R( d, e, a, b, c, F1, K1, x[ 7] ); - R( c, d, e, a, b, F1, K1, x[ 8] ); - R( b, c, d, e, a, F1, K1, x[ 9] ); - R( a, b, c, d, e, F1, K1, x[10] ); - R( e, a, b, c, d, F1, K1, x[11] ); - R( d, e, a, b, c, F1, K1, x[12] ); - R( c, d, e, a, b, F1, K1, x[13] ); - R( b, c, d, e, a, F1, K1, x[14] ); - R( a, b, c, d, e, F1, K1, x[15] ); - R( e, a, b, c, d, F1, K1, M(16) ); - R( d, e, a, b, c, F1, K1, M(17) ); - R( c, d, e, a, b, F1, K1, M(18) ); - R( b, c, d, e, a, F1, K1, M(19) ); - R( a, b, c, d, e, F2, K2, M(20) ); - R( e, a, b, c, d, F2, K2, M(21) ); - R( d, e, a, b, c, F2, K2, M(22) ); - R( c, d, e, a, b, F2, K2, M(23) ); - R( b, c, d, e, a, F2, K2, M(24) ); - R( a, b, c, d, e, F2, K2, 
M(25) ); - R( e, a, b, c, d, F2, K2, M(26) ); - R( d, e, a, b, c, F2, K2, M(27) ); - R( c, d, e, a, b, F2, K2, M(28) ); - R( b, c, d, e, a, F2, K2, M(29) ); - R( a, b, c, d, e, F2, K2, M(30) ); - R( e, a, b, c, d, F2, K2, M(31) ); - R( d, e, a, b, c, F2, K2, M(32) ); - R( c, d, e, a, b, F2, K2, M(33) ); - R( b, c, d, e, a, F2, K2, M(34) ); - R( a, b, c, d, e, F2, K2, M(35) ); - R( e, a, b, c, d, F2, K2, M(36) ); - R( d, e, a, b, c, F2, K2, M(37) ); - R( c, d, e, a, b, F2, K2, M(38) ); - R( b, c, d, e, a, F2, K2, M(39) ); - R( a, b, c, d, e, F3, K3, M(40) ); - R( e, a, b, c, d, F3, K3, M(41) ); - R( d, e, a, b, c, F3, K3, M(42) ); - R( c, d, e, a, b, F3, K3, M(43) ); - R( b, c, d, e, a, F3, K3, M(44) ); - R( a, b, c, d, e, F3, K3, M(45) ); - R( e, a, b, c, d, F3, K3, M(46) ); - R( d, e, a, b, c, F3, K3, M(47) ); - R( c, d, e, a, b, F3, K3, M(48) ); - R( b, c, d, e, a, F3, K3, M(49) ); - R( a, b, c, d, e, F3, K3, M(50) ); - R( e, a, b, c, d, F3, K3, M(51) ); - R( d, e, a, b, c, F3, K3, M(52) ); - R( c, d, e, a, b, F3, K3, M(53) ); - R( b, c, d, e, a, F3, K3, M(54) ); - R( a, b, c, d, e, F3, K3, M(55) ); - R( e, a, b, c, d, F3, K3, M(56) ); - R( d, e, a, b, c, F3, K3, M(57) ); - R( c, d, e, a, b, F3, K3, M(58) ); - R( b, c, d, e, a, F3, K3, M(59) ); - R( a, b, c, d, e, F4, K4, M(60) ); - R( e, a, b, c, d, F4, K4, M(61) ); - R( d, e, a, b, c, F4, K4, M(62) ); - R( c, d, e, a, b, F4, K4, M(63) ); - R( b, c, d, e, a, F4, K4, M(64) ); - R( a, b, c, d, e, F4, K4, M(65) ); - R( e, a, b, c, d, F4, K4, M(66) ); - R( d, e, a, b, c, F4, K4, M(67) ); - R( c, d, e, a, b, F4, K4, M(68) ); - R( b, c, d, e, a, F4, K4, M(69) ); - R( a, b, c, d, e, F4, K4, M(70) ); - R( e, a, b, c, d, F4, K4, M(71) ); - R( d, e, a, b, c, F4, K4, M(72) ); - R( c, d, e, a, b, F4, K4, M(73) ); - R( b, c, d, e, a, F4, K4, M(74) ); - R( a, b, c, d, e, F4, K4, M(75) ); - R( e, a, b, c, d, F4, K4, M(76) ); - R( d, e, a, b, c, F4, K4, M(77) ); - R( c, d, e, a, b, F4, K4, M(78) ); - R( b, c, d, e, a, F4, K4, 
M(79) ); - - a = ctx->A += a; - b = ctx->B += b; - c = ctx->C += c; - d = ctx->D += d; - e = ctx->E += e; - } -} - -/* Copyright (C) 1990-2, RSA Data Security, Inc. Created 1990. All -rights reserved. - -RSA Data Security, Inc. makes no representations concerning either -the merchantability of this software or the suitability of this -software for any particular purpose. It is provided "as is" -without express or implied warranty of any kind. - -These notices must be retained in any copies of any part of this -documentation and/or software. - */ - -/** Get the sha1 sum of file. - * @param name name of the file - * @return the checksum on success, NULL on error - * @addtogroup alpm_misc - */ -char SYMEXPORT *alpm_get_sha1sum(char *name) -{ - ALPM_LOG_FUNC; - - ASSERT(name != NULL, return(NULL)); - - return(_alpm_SHAFile(name)); -} - -char* _alpm_SHAFile(char *filename) { - FILE *file; - struct sha_ctx context; - int len, i; - char hex[3]; - unsigned char buffer[1024], digest[20]; - char *ret; - - ALPM_LOG_FUNC; - - if((file = fopen(filename, "rb")) == NULL) { - _alpm_log(PM_LOG_ERROR, _("sha1: %s can't be opened\n"), filename); - } else { - sha_init_ctx(&context); - while((len = fread(buffer, 1, 1024, file))) { - sha_process_bytes(buffer, len, &context); - } - sha_finish_ctx(&context, digest); - fclose(file); - - ret = (char*)malloc(41); - ret[0] = '\0'; - for(i = 0; i < 20; i++) { - snprintf(hex, 3, "%02x", digest[i]); - strncat(ret, hex, 2); - } - _alpm_log(PM_LOG_DEBUG, "sha1(%s) = %s", filename, ret); - return(ret); - } - - return(NULL); -} - -/* vim: set ts=2 sw=2 noet: */ diff --git a/lib/libalpm/sha1.h b/lib/libalpm/sha1.h deleted file mode 100644 index fc0aa23..0000000 --- a/lib/libalpm/sha1.h +++ /dev/null @@ -1,72 +0,0 @@ -/* Declarations of functions and data types used for SHA1 sum - library functions. - Copyright (C) 2000, 2001, 2003 Free Software Foundation, Inc. 
- - This program is free software; you can redistribute it and/or modify it - under the terms of the GNU General Public License as published by the - Free Software Foundation; either version 2, or (at your option) any - later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software Foundation, - Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */ -#ifndef _ALPM_SHA1_H -#define _ALPM_SHA1_H - -#include <stdio.h> -#include <limits.h> - -#define rol(x,n) ( ((x) << (n)) | ((x) >> (32 -(n))) ) -/* TODO check this comment */ -/* The code below is from md5.h (from coreutils), little modifications */ -#define UINT_MAX_32_BITS 4294967295U - -/* This new ifdef allows splint to not fail on its static code check */ -#ifdef S_SPLINT_S - typedef unsigned int sha_uint32; -#else -#if UINT_MAX == UINT_MAX_32_BITS - typedef unsigned int sha_uint32; -#else -#if USHRT_MAX == UINT_MAX_32_BITS - typedef unsigned short sha_uint32; -#else -#if ULONG_MAX == UINT_MAX_32_BITS - typedef unsigned long sha_uint32; -#else - /* The following line is intended to evoke an error. Using #error is not portable enough. */ -#error "Cannot determine unsigned 32-bit data type" -#endif /* ULONG_MAX */ -#endif /* USHRT_MAX */ -#endif /* UINT_MAX */ -#endif /* S_SPLINT_S */ -/* We have to make a guess about the integer type equivalent in size - to pointers which should always be correct. */ -typedef unsigned long int sha_uintptr; - -/* Structure to save state of computation between the single steps. 
*/
-struct sha_ctx
-{
-  sha_uint32 A;
-  sha_uint32 B;
-  sha_uint32 C;
-  sha_uint32 D;
-  sha_uint32 E;
-
-  sha_uint32 total[2];
-  sha_uint32 buflen;
-  char buffer[128];
-};
-
-
-/* Needed for pacman */
-char *_alpm_SHAFile (char *);
-
-#endif /* _ALPM_SHA1_H */
-
-/* vim: set ts=2 sw=2 noet: */
diff --git a/lib/libalpm/sync.c b/lib/libalpm/sync.c
index 005123d..a456aae 100644
--- a/lib/libalpm/sync.c
+++ b/lib/libalpm/sync.c
@@ -48,7 +48,6 @@
 #include "handle.h"
 #include "alpm.h"
 #include "md5.h"
-#include "sha1.h"
 #include "server.h"
 
 pmsyncpkg_t *_alpm_sync_new(int type, pmpkg_t *spkg, void *data)
@@ -808,19 +807,18 @@ int _alpm_sync_commit(pmtrans_t *trans, pmdb_t *db_local, alpm_list_t **data)
 		char str[PATH_MAX];
 		struct stat buf;
 		const char *pkgname;
-		char *md5sum1, *md5sum2, *sha1sum1, *sha1sum2;
+		char *md5sum1, *md5sum2;
 		char *ptr=NULL;
 
 		pkgname = alpm_pkg_get_filename(spkg);
 		md5sum1 = spkg->md5sum;
-		sha1sum1 = spkg->sha1sum;
 
-		if((md5sum1 == NULL) && (sha1sum1 == NULL)) {
+		if(md5sum1 == NULL) {
 			/* TODO wtf is this? malloc'd strings for error messages?
*/
 			if((ptr = calloc(512, sizeof(char))) == NULL) {
 				RET_ERR(PM_ERR_MEMORY, -1);
 			}
-			snprintf(ptr, 512, _("can't get md5 or sha1 checksum for package %s\n"), pkgname);
+			snprintf(ptr, 512, _("can't get md5 checksum for package %s\n"), pkgname);
 			*data = alpm_list_add(*data, ptr);
 			retval = 1;
 			continue;
@@ -837,17 +835,16 @@ int _alpm_sync_commit(pmtrans_t *trans, pmdb_t *db_local, alpm_list_t **data)
 		}
 
 		md5sum2 = alpm_get_md5sum(str);
-		sha1sum2 = alpm_get_sha1sum(str);
-		if(md5sum2 == NULL && sha1sum2 == NULL) {
+		if(md5sum2 == NULL) {
 			if((ptr = calloc(512, sizeof(char))) == NULL) {
 				RET_ERR(PM_ERR_MEMORY, -1);
 			}
-			snprintf(ptr, 512, _("can't get md5 or sha1 checksum for package %s\n"), pkgname);
+			snprintf(ptr, 512, _("can't get md5 checksum for package %s\n"), pkgname);
 			*data = alpm_list_add(*data, ptr);
 			retval = 1;
 			continue;
 		}
-		if((strcmp(md5sum1, md5sum2) != 0) && (strcmp(sha1sum1, sha1sum2) != 0)) {
+		if(strcmp(md5sum1, md5sum2) != 0) {
 			int doremove=0;
 			if((ptr = calloc(512, sizeof(char))) == NULL) {
 				RET_ERR(PM_ERR_MEMORY, -1);
@@ -855,15 +852,14 @@ int _alpm_sync_commit(pmtrans_t *trans, pmdb_t *db_local, alpm_list_t **data)
 			QUESTION(trans, PM_TRANS_CONV_CORRUPTED_PKG, (char *)pkgname, NULL, NULL, &doremove);
 			if(doremove) {
 				unlink(str);
-				snprintf(ptr, 512, _("archive %s was corrupted (bad MD5 or SHA1 checksum)\n"), pkgname);
+				snprintf(ptr, 512, _("archive %s was corrupted (bad MD5 checksum)\n"), pkgname);
 			} else {
-				snprintf(ptr, 512, _("archive %s is corrupted (bad MD5 or SHA1 checksum)\n"), pkgname);
+				snprintf(ptr, 512, _("archive %s is corrupted (bad MD5 checksum)\n"), pkgname);
 			}
 			*data = alpm_list_add(*data, ptr);
 			retval = 1;
 		}
 		FREE(md5sum2);
-		FREE(sha1sum2);
 	}
 	if(retval) {
 		pm_errno = PM_ERR_PKG_CORRUPTED;
diff --git a/src/pacman/package.c b/src/pacman/package.c
index 3a3381f..86d91ec 100644
--- a/src/pacman/package.c
+++ b/src/pacman/package.c
@@ -110,7 +110,7 @@ void dump_pkg_full(pmpkg_t *pkg, int level)
  */
 void dump_pkg_sync(pmpkg_t *pkg, const char
*treename)
 {
-	const char *descheader, *md5sum, *sha1sum;
+	const char *descheader, *md5sum;
 	if(pkg == NULL) {
 		return;
 	}
@@ -118,7 +118,6 @@ void dump_pkg_sync(pmpkg_t *pkg, const char *treename)
 	descheader = _("Description : ");
 
 	md5sum = alpm_pkg_get_md5sum(pkg);
-	sha1sum = alpm_pkg_get_sha1sum(pkg);
 
 	printf(_("Repository : %s\n"), treename);
 	printf(_("Name : %s\n"), (char *)alpm_pkg_get_name(pkg));
@@ -139,9 +138,6 @@ void dump_pkg_sync(pmpkg_t *pkg, const char *treename)
 	if (md5sum != NULL && md5sum[0] != '\0') {
 		printf(_("MD5 Sum : %s"), md5sum);
 	}
-	if (sha1sum != NULL && sha1sum[0] != '\0') {
-		printf(_("SHA1 Sum : %s"), sha1sum);
-	}
 	printf("\n");
 }
 
@@ -168,31 +164,22 @@ void dump_pkg_backups(pmpkg_t *pkg)
 			snprintf(path, PATH_MAX-1, "%s%s", root, str);
 			/* if we find the file, calculate checksums, otherwise it is missing */
 			if(!stat(path, &buf)) {
-				char *sum;
 				char *md5sum = alpm_get_md5sum(path);
-				char *sha1sum = alpm_get_sha1sum(path);
 
-				if(md5sum == NULL || sha1sum == NULL) {
+				if(md5sum == NULL) {
 					fprintf(stderr, _("error: could not calculate checksums for %s\n"), path);
 					free(str);
 					continue;
 				}
 
-				/* TODO Is this a good way to check type of backup stored?
-				 * We aren't storing it anywhere in the database. */
-				if (strlen(ptr) == 32) {
-					sum = md5sum;
-				} else { /*if (strlen(ptr) == 40) */
-					sum = sha1sum;
-				}
+
 				/* if checksums don't match, file has been modified */
-				if (strcmp(sum, ptr)) {
+				if (strcmp(md5sum, ptr)) {
 					printf(_("MODIFIED\t%s\n"), path);
 				} else {
 					printf(_("Not Modified\t%s\n"), path);
 				}
 				free(md5sum);
-				free(sha1sum);
 			} else {
 				printf(_("MISSING\t\t%s\n"), path);
 			}
-- 
1.5.2.4
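For anyone reading the patch above without the tree at hand: once SHA1 is gone, the backup check in dump_pkg_backups() no longer has to guess the checksum type from the length of the stored string (32 hex characters for MD5, 40 for SHA1); it becomes a single strcmp() against the MD5 sum. Below is a minimal, self-contained sketch of that comparison. The helper names digest_to_hex() and backup_unmodified() are made up for illustration and are not part of libalpm; the 33-byte buffer mirrors PKG_MD5SUM_LEN from package.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MD5_DIGEST_LEN 16                     /* raw MD5 digest size in bytes */
#define MD5_HEX_LEN (2 * MD5_DIGEST_LEN + 1)  /* 33, same value as PKG_MD5SUM_LEN */

/* Hypothetical helper (not in libalpm): write the lowercase hex form of a
 * raw 16-byte digest into out, which must hold MD5_HEX_LEN bytes. */
void digest_to_hex(const unsigned char digest[MD5_DIGEST_LEN],
                   char out[MD5_HEX_LEN])
{
    size_t i;

    assert(digest != NULL && out != NULL);
    for(i = 0; i < MD5_DIGEST_LEN; i++) {
        /* each call writes two hex characters plus a terminating NUL */
        snprintf(out + 2 * i, 3, "%02x", (unsigned int)digest[i]);
    }
}

/* Hypothetical helper (not in libalpm): return 1 if the stored hex sum
 * equals the hex form of digest, 0 otherwise. A stored string of any other
 * length (for example a 40-character SHA1 sum) never matches. */
int backup_unmodified(const char *stored,
                      const unsigned char digest[MD5_DIGEST_LEN])
{
    char hex[MD5_HEX_LEN];

    if(stored == NULL || strlen(stored) != MD5_HEX_LEN - 1) {
        return 0;
    }
    digest_to_hex(digest, hex);
    return strcmp(stored, hex) == 0;
}
```

In the patched code the hex string comes straight from alpm_get_md5sum(path) and is compared with the stored entry; the sketch only separates out the encoding step so it can be tried without libalpm.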
* Move alpm md5 functions to lib/libalpm/util.c
* Remove unneeded includes for md5.h
* Replace md5 implementation with one from http://www.xyssl.org

Signed-off-by: Andrew Fyfe <andrew@neptune-one.net>
---
 lib/libalpm/Makefile.am |    1 -
 lib/libalpm/add.c       |    7 +-
 lib/libalpm/md5.c       |  661 +++++++++++++++++++++++++++++------------------
 lib/libalpm/md5.h       |  144 ++++++++---
 lib/libalpm/md5driver.c |   93 -------
 lib/libalpm/remove.c    |    1 -
 lib/libalpm/sync.c      |    1 -
 lib/libalpm/util.c      |   31 +++
 8 files changed, 547 insertions(+), 392 deletions(-)

diff --git a/lib/libalpm/Makefile.am b/lib/libalpm/Makefile.am
index fd35426..3705370 100644
--- a/lib/libalpm/Makefile.am
+++ b/lib/libalpm/Makefile.am
@@ -24,7 +24,6 @@ libalpm_la_SOURCES = \
 	handle.h handle.c \
 	log.h log.c \
 	md5.h md5.c \
-	md5driver.c \
 	package.h package.c \
 	provide.h provide.c \
 	remove.h remove.c \
diff --git a/lib/libalpm/add.c b/lib/libalpm/add.c
index 7a6446c..73c8b2d 100644
--- a/lib/libalpm/add.c
+++ b/lib/libalpm/add.c
@@ -41,7 +41,6 @@
 #include "util.h"
 #include "error.h"
 #include "cache.h"
-#include "md5.h"
 #include "log.h"
 #include "backup.h"
 #include "package.h"
@@ -510,8 +509,8 @@ static int extract_single_file(struct archive *archive,
 				return(1);
 			}
 
-			hash_local = _alpm_MDFile(filename);
-			hash_pkg = _alpm_MDFile(tempfile);
+			hash_local = alpm_get_md5sum(filename);
+			hash_pkg = alpm_get_md5sum(tempfile);
 
 			/* append the new md5 hash to it's respective entry
 			 * in newpkg's backup (it will be the new orginal) */
@@ -662,7 +661,7 @@ static int extract_single_file(struct archive *archive,
 			}
 
 			_alpm_log(PM_LOG_DEBUG, "appending backup entry for %s", filename);
-			hash = _alpm_MDFile(filename);
+			hash = alpm_get_md5sum(filename);
 			backup = malloc(backup_len);
 			if(!backup) {
 				RET_ERR(PM_ERR_MEMORY, -1);
diff --git a/lib/libalpm/md5.c b/lib/libalpm/md5.c
index 6d5aa6a..f307260 100644
--- a/lib/libalpm/md5.c
+++ b/lib/libalpm/md5.c
@@ -1,307 +1,460 @@
-/* MD5C.C - RSA Data Security, Inc., MD5 message-digest algorithm
+/*
+
* RFC 1321 compliant MD5 implementation + * + * Copyright (C) 2006-2007 Christophe Devine + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License, version 2.1 as published by the Free Software Foundation. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, + * MA 02110-1301 USA + */ +/* + * The MD5 algorithm was designed by Ron Rivest in 1991. + * + * http://www.ietf.org/rfc/rfc1321.txt */ -/* Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All -rights reserved. +#ifndef _CRT_SECURE_NO_DEPRECATE +#define _CRT_SECURE_NO_DEPRECATE 1 +#endif -License to copy and use this software is granted provided that it -is identified as the "RSA Data Security, Inc. MD5 Message-Digest -Algorithm" in all material mentioning or referencing this software -or this function. +#include <string.h> +#include <stdio.h> -License is also granted to make and use derivative works provided -that such works are identified as "derived from the RSA Data -Security, Inc. MD5 Message-Digest Algorithm" in all material -mentioning or referencing the derived work. +#include "md5.h" -RSA Data Security, Inc. makes no representations concerning either -the merchantability of this software or the suitability of this -software for any particular purpose. It is provided "as is" -without express or implied warranty of any kind. 
+/* + * 32-bit integer manipulation macros (little endian) + */ +#ifndef GET_UINT32_LE +#define GET_UINT32_LE(n,b,i) \ +{ \ + (n) = ( (unsigned long) (b)[(i) ] ) \ + | ( (unsigned long) (b)[(i) + 1] << 8 ) \ + | ( (unsigned long) (b)[(i) + 2] << 16 ) \ + | ( (unsigned long) (b)[(i) + 3] << 24 ); \ +} +#endif + +#ifndef PUT_UINT32_LE +#define PUT_UINT32_LE(n,b,i) \ +{ \ + (b)[(i) ] = (unsigned char) ( (n) ); \ + (b)[(i) + 1] = (unsigned char) ( (n) >> 8 ); \ + (b)[(i) + 2] = (unsigned char) ( (n) >> 16 ); \ + (b)[(i) + 3] = (unsigned char) ( (n) >> 24 ); \ +} +#endif -These notices must be retained in any copies of any part of this -documentation and/or software. +/* + * MD5 context setup */ +void md5_starts( md5_context *ctx ) +{ + ctx->total[0] = 0; + ctx->total[1] = 0; -#include <string.h> + ctx->state[0] = 0x67452301; + ctx->state[1] = 0xEFCDAB89; + ctx->state[2] = 0x98BADCFE; + ctx->state[3] = 0x10325476; +} -#include "md5.h" +static void md5_process( md5_context *ctx, unsigned char data[64] ) +{ + unsigned long X[16], A, B, C, D; + + GET_UINT32_LE( X[ 0], data, 0 ); + GET_UINT32_LE( X[ 1], data, 4 ); + GET_UINT32_LE( X[ 2], data, 8 ); + GET_UINT32_LE( X[ 3], data, 12 ); + GET_UINT32_LE( X[ 4], data, 16 ); + GET_UINT32_LE( X[ 5], data, 20 ); + GET_UINT32_LE( X[ 6], data, 24 ); + GET_UINT32_LE( X[ 7], data, 28 ); + GET_UINT32_LE( X[ 8], data, 32 ); + GET_UINT32_LE( X[ 9], data, 36 ); + GET_UINT32_LE( X[10], data, 40 ); + GET_UINT32_LE( X[11], data, 44 ); + GET_UINT32_LE( X[12], data, 48 ); + GET_UINT32_LE( X[13], data, 52 ); + GET_UINT32_LE( X[14], data, 56 ); + GET_UINT32_LE( X[15], data, 60 ); + +#define S(x,n) ((x << n) | ((x & 0xFFFFFFFF) >> (32 - n))) + +#define P(a,b,c,d,k,s,t) \ +{ \ + a += F(b,c,d) + X[k] + t; a = S(a,s) + b; \ +} + + A = ctx->state[0]; + B = ctx->state[1]; + C = ctx->state[2]; + D = ctx->state[3]; + +#define F(x,y,z) (z ^ (x & (y ^ z))) + + P( A, B, C, D, 0, 7, 0xD76AA478 ); + P( D, A, B, C, 1, 12, 0xE8C7B756 ); + P( C, D, A, B, 2, 17, 
0x242070DB ); + P( B, C, D, A, 3, 22, 0xC1BDCEEE ); + P( A, B, C, D, 4, 7, 0xF57C0FAF ); + P( D, A, B, C, 5, 12, 0x4787C62A ); + P( C, D, A, B, 6, 17, 0xA8304613 ); + P( B, C, D, A, 7, 22, 0xFD469501 ); + P( A, B, C, D, 8, 7, 0x698098D8 ); + P( D, A, B, C, 9, 12, 0x8B44F7AF ); + P( C, D, A, B, 10, 17, 0xFFFF5BB1 ); + P( B, C, D, A, 11, 22, 0x895CD7BE ); + P( A, B, C, D, 12, 7, 0x6B901122 ); + P( D, A, B, C, 13, 12, 0xFD987193 ); + P( C, D, A, B, 14, 17, 0xA679438E ); + P( B, C, D, A, 15, 22, 0x49B40821 ); + +#undef F + +#define F(x,y,z) (y ^ (z & (x ^ y))) + + P( A, B, C, D, 1, 5, 0xF61E2562 ); + P( D, A, B, C, 6, 9, 0xC040B340 ); + P( C, D, A, B, 11, 14, 0x265E5A51 ); + P( B, C, D, A, 0, 20, 0xE9B6C7AA ); + P( A, B, C, D, 5, 5, 0xD62F105D ); + P( D, A, B, C, 10, 9, 0x02441453 ); + P( C, D, A, B, 15, 14, 0xD8A1E681 ); + P( B, C, D, A, 4, 20, 0xE7D3FBC8 ); + P( A, B, C, D, 9, 5, 0x21E1CDE6 ); + P( D, A, B, C, 14, 9, 0xC33707D6 ); + P( C, D, A, B, 3, 14, 0xF4D50D87 ); + P( B, C, D, A, 8, 20, 0x455A14ED ); + P( A, B, C, D, 13, 5, 0xA9E3E905 ); + P( D, A, B, C, 2, 9, 0xFCEFA3F8 ); + P( C, D, A, B, 7, 14, 0x676F02D9 ); + P( B, C, D, A, 12, 20, 0x8D2A4C8A ); + +#undef F + +#define F(x,y,z) (x ^ y ^ z) + + P( A, B, C, D, 5, 4, 0xFFFA3942 ); + P( D, A, B, C, 8, 11, 0x8771F681 ); + P( C, D, A, B, 11, 16, 0x6D9D6122 ); + P( B, C, D, A, 14, 23, 0xFDE5380C ); + P( A, B, C, D, 1, 4, 0xA4BEEA44 ); + P( D, A, B, C, 4, 11, 0x4BDECFA9 ); + P( C, D, A, B, 7, 16, 0xF6BB4B60 ); + P( B, C, D, A, 10, 23, 0xBEBFBC70 ); + P( A, B, C, D, 13, 4, 0x289B7EC6 ); + P( D, A, B, C, 0, 11, 0xEAA127FA ); + P( C, D, A, B, 3, 16, 0xD4EF3085 ); + P( B, C, D, A, 6, 23, 0x04881D05 ); + P( A, B, C, D, 9, 4, 0xD9D4D039 ); + P( D, A, B, C, 12, 11, 0xE6DB99E5 ); + P( C, D, A, B, 15, 16, 0x1FA27CF8 ); + P( B, C, D, A, 2, 23, 0xC4AC5665 ); + +#undef F + +#define F(x,y,z) (y ^ (x | ~z)) + + P( A, B, C, D, 0, 6, 0xF4292244 ); + P( D, A, B, C, 7, 10, 0x432AFF97 ); + P( C, D, A, B, 14, 15, 0xAB9423A7 ); + P( B, 
C, D, A, 5, 21, 0xFC93A039 ); + P( A, B, C, D, 12, 6, 0x655B59C3 ); + P( D, A, B, C, 3, 10, 0x8F0CCC92 ); + P( C, D, A, B, 10, 15, 0xFFEFF47D ); + P( B, C, D, A, 1, 21, 0x85845DD1 ); + P( A, B, C, D, 8, 6, 0x6FA87E4F ); + P( D, A, B, C, 15, 10, 0xFE2CE6E0 ); + P( C, D, A, B, 6, 15, 0xA3014314 ); + P( B, C, D, A, 13, 21, 0x4E0811A1 ); + P( A, B, C, D, 4, 6, 0xF7537E82 ); + P( D, A, B, C, 11, 10, 0xBD3AF235 ); + P( C, D, A, B, 2, 15, 0x2AD7D2BB ); + P( B, C, D, A, 9, 21, 0xEB86D391 ); + +#undef F + + ctx->state[0] += A; + ctx->state[1] += B; + ctx->state[2] += C; + ctx->state[3] += D; +} -/* Constants for MD5Transform routine. +/* + * MD5 process buffer */ +void md5_update( md5_context *ctx, unsigned char *input, int ilen ) +{ + int fill; + unsigned long left; + + if( ilen <= 0 ) + return; + + left = ctx->total[0] & 0x3F; + fill = 64 - left; + + ctx->total[0] += ilen; + ctx->total[0] &= 0xFFFFFFFF; + + if( ctx->total[0] < (unsigned long) ilen ) + ctx->total[1]++; + + if( left && ilen >= fill ) + { + memcpy( (void *) (ctx->buffer + left), + (void *) input, fill ); + md5_process( ctx, ctx->buffer ); + input += fill; + ilen -= fill; + left = 0; + } + + while( ilen >= 64 ) + { + md5_process( ctx, input ); + input += 64; + ilen -= 64; + } + + if( ilen > 0 ) + { + memcpy( (void *) (ctx->buffer + left), + (void *) input, ilen ); + } +} -#define S11 7 -#define S12 12 -#define S13 17 -#define S14 22 -#define S21 5 -#define S22 9 -#define S23 14 -#define S24 20 -#define S31 4 -#define S32 11 -#define S33 16 -#define S34 23 -#define S41 6 -#define S42 10 -#define S43 15 -#define S44 21 - -static void MD5Transform(UINT4 [4], unsigned char [64]); -static void Encode(unsigned char *, UINT4 *, unsigned int); -static void Decode(UINT4 *, unsigned char *, unsigned int); - -static unsigned char PADDING[64] = { - 0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0 +static const unsigned char md5_padding[64] = +{ + 0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; -/* F, G, H and I are basic MD5 functions. +/* + * MD5 final digest */ -#define F(x, y, z) (((x) & (y)) | ((~x) & (z))) -#define G(x, y, z) (((x) & (z)) | ((y) & (~z))) -#define H(x, y, z) ((x) ^ (y) ^ (z)) -#define I(x, y, z) ((y) ^ ((x) | (~z))) +void md5_finish( md5_context *ctx, unsigned char *output ) +{ + unsigned long last, padn; + unsigned long high, low; + unsigned char msglen[8]; -/* ROTATE_LEFT rotates x left n bits. - */ -#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32-(n)))) + high = ( ctx->total[0] >> 29 ) + | ( ctx->total[1] << 3 ); + low = ( ctx->total[0] << 3 ); -/* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4. -Rotation is separate from addition to prevent recomputation. - */ -#define FF(a, b, c, d, x, s, ac) { \ - (a) += F ((b), (c), (d)) + (x) + (UINT4)(ac); \ - (a) = ROTATE_LEFT ((a), (s)); \ - (a) += (b); \ - } -#define GG(a, b, c, d, x, s, ac) { \ - (a) += G ((b), (c), (d)) + (x) + (UINT4)(ac); \ - (a) = ROTATE_LEFT ((a), (s)); \ - (a) += (b); \ - } -#define HH(a, b, c, d, x, s, ac) { \ - (a) += H ((b), (c), (d)) + (x) + (UINT4)(ac); \ - (a) = ROTATE_LEFT ((a), (s)); \ - (a) += (b); \ - } -#define II(a, b, c, d, x, s, ac) { \ - (a) += I ((b), (c), (d)) + (x) + (UINT4)(ac); \ - (a) = ROTATE_LEFT ((a), (s)); \ - (a) += (b); \ - } - -/* MD5 initialization. Begins an MD5 operation, writing a new context. + PUT_UINT32_LE( low, msglen, 0 ); + PUT_UINT32_LE( high, msglen, 4 ); + + last = ctx->total[0] & 0x3F; + padn = ( last < 56 ) ? 
( 56 - last ) : ( 120 - last ); + + md5_update( ctx, (unsigned char *) md5_padding, padn ); + md5_update( ctx, msglen, 8 ); + + PUT_UINT32_LE( ctx->state[0], output, 0 ); + PUT_UINT32_LE( ctx->state[1], output, 4 ); + PUT_UINT32_LE( ctx->state[2], output, 8 ); + PUT_UINT32_LE( ctx->state[3], output, 12 ); +} + +/* + * Output = MD5( input buffer ) */ -void _alpm_MD5Init (context) -MD5_CTX *context; /* context */ +void md5( unsigned char *input, int ilen, + unsigned char *output ) { - context->count[0] = context->count[1] = 0; - /* Load magic initialization constants. -*/ - context->state[0] = 0x67452301; - context->state[1] = 0xefcdab89; - context->state[2] = 0x98badcfe; - context->state[3] = 0x10325476; + md5_context ctx; + + md5_starts( &ctx ); + md5_update( &ctx, input, ilen ); + md5_finish( &ctx, output ); + + memset( &ctx, 0, sizeof( md5_context ) ); } -/* MD5 block update operation. Continues an MD5 message-digest - operation, processing another message block, and updating the - context. +/* + * Output = MD5( file contents ) */ -void _alpm_MD5Update (context, input, inputLen) -MD5_CTX *context; /* context */ -unsigned char *input; /* input block */ -unsigned int inputLen; /* length of input block */ +int md5_file( char *path, unsigned char *output ) { - unsigned int i, index, partLen; - - /* Compute number of bytes mod 64 */ - index = (unsigned int)((context->count[0] >> 3) & 0x3F); + FILE *f; + size_t n; + md5_context ctx; + unsigned char buf[1024]; - /* Update number of bits */ - if ((context->count[0] += ((UINT4)inputLen << 3)) + if( ( f = fopen( path, "rb" ) ) == NULL ) + return( 1 ); - < ((UINT4)inputLen << 3)) - context->count[1]++; - context->count[1] += ((UINT4)inputLen >> 29); + md5_starts( &ctx ); - partLen = 64 - index; + while( ( n = fread( buf, 1, sizeof( buf ), f ) ) > 0 ) + md5_update( &ctx, buf, (int) n ); - /* Transform as many times as possible. 
-*/ - if (inputLen >= partLen) { - memcpy ((POINTER)&context->buffer[index], (POINTER)input, partLen); - MD5Transform (context->state, context->buffer); + md5_finish( &ctx, output ); - for (i = partLen; i + 63 < inputLen; i += 64) - MD5Transform (context->state, &input[i]); + memset( &ctx, 0, sizeof( md5_context ) ); - index = 0; - } - else - i = 0; + if( ferror( f ) != 0 ) + { + fclose( f ); + return( 2 ); + } - /* Buffer remaining input */ - memcpy ((POINTER)&context->buffer[index], (POINTER)&input[i], inputLen-i); + fclose( f ); + return( 0 ); } -/* MD5 finalization. Ends an MD5 message-digest operation, writing the - the message digest and zeroizing the context. +/* + * MD5 HMAC context setup */ -void _alpm_MD5Final (digest, context) -unsigned char digest[16]; /* message digest */ -MD5_CTX *context; /* context */ +void md5_hmac_starts( md5_context *ctx, + unsigned char *key, int keylen ) { - unsigned char bits[8]; - unsigned int index, padLen; + int i; - /* Save number of bits */ - Encode (bits, context->count, 8); + memset( ctx->ipad, 0x36, 64 ); + memset( ctx->opad, 0x5C, 64 ); - /* Pad out to 56 mod 64. -*/ - index = (unsigned int)((context->count[0] >> 3) & 0x3f); - padLen = (index < 56) ? (56 - index) : (120 - index); - _alpm_MD5Update (context, PADDING, padLen); + for( i = 0; i < keylen; i++ ) + { + if( i >= 64 ) break; - /* Append length (before padding) */ - _alpm_MD5Update (context, bits, 8); + ctx->ipad[i] ^= key[i]; + ctx->opad[i] ^= key[i]; + } - /* Store state in digest */ - Encode (digest, context->state, 16); - - /* Zeroize sensitive information. -*/ - memset ((POINTER)context, 0, sizeof (*context)); + md5_starts( ctx ); + md5_update( ctx, ctx->ipad, 64 ); } -/* MD5 basic transformation. Transforms state based on block. 
+/* + * MD5 HMAC process buffer */ -static void MD5Transform (state, block) -UINT4 state[4]; -unsigned char block[64]; +void md5_hmac_update( md5_context *ctx, + unsigned char *input, int ilen ) { - UINT4 a = state[0], b = state[1], c = state[2], d = state[3], x[16]; - - Decode (x, block, 64); - - /* Round 1 */ - FF (a, b, c, d, x[ 0], S11, 0xd76aa478); /* 1 */ - FF (d, a, b, c, x[ 1], S12, 0xe8c7b756); /* 2 */ - FF (c, d, a, b, x[ 2], S13, 0x242070db); /* 3 */ - FF (b, c, d, a, x[ 3], S14, 0xc1bdceee); /* 4 */ - FF (a, b, c, d, x[ 4], S11, 0xf57c0faf); /* 5 */ - FF (d, a, b, c, x[ 5], S12, 0x4787c62a); /* 6 */ - FF (c, d, a, b, x[ 6], S13, 0xa8304613); /* 7 */ - FF (b, c, d, a, x[ 7], S14, 0xfd469501); /* 8 */ - FF (a, b, c, d, x[ 8], S11, 0x698098d8); /* 9 */ - FF (d, a, b, c, x[ 9], S12, 0x8b44f7af); /* 10 */ - FF (c, d, a, b, x[10], S13, 0xffff5bb1); /* 11 */ - FF (b, c, d, a, x[11], S14, 0x895cd7be); /* 12 */ - FF (a, b, c, d, x[12], S11, 0x6b901122); /* 13 */ - FF (d, a, b, c, x[13], S12, 0xfd987193); /* 14 */ - FF (c, d, a, b, x[14], S13, 0xa679438e); /* 15 */ - FF (b, c, d, a, x[15], S14, 0x49b40821); /* 16 */ - - /* Round 2 */ - GG (a, b, c, d, x[ 1], S21, 0xf61e2562); /* 17 */ - GG (d, a, b, c, x[ 6], S22, 0xc040b340); /* 18 */ - GG (c, d, a, b, x[11], S23, 0x265e5a51); /* 19 */ - GG (b, c, d, a, x[ 0], S24, 0xe9b6c7aa); /* 20 */ - GG (a, b, c, d, x[ 5], S21, 0xd62f105d); /* 21 */ - GG (d, a, b, c, x[10], S22, 0x2441453); /* 22 */ - GG (c, d, a, b, x[15], S23, 0xd8a1e681); /* 23 */ - GG (b, c, d, a, x[ 4], S24, 0xe7d3fbc8); /* 24 */ - GG (a, b, c, d, x[ 9], S21, 0x21e1cde6); /* 25 */ - GG (d, a, b, c, x[14], S22, 0xc33707d6); /* 26 */ - GG (c, d, a, b, x[ 3], S23, 0xf4d50d87); /* 27 */ - - GG (b, c, d, a, x[ 8], S24, 0x455a14ed); /* 28 */ - GG (a, b, c, d, x[13], S21, 0xa9e3e905); /* 29 */ - GG (d, a, b, c, x[ 2], S22, 0xfcefa3f8); /* 30 */ - GG (c, d, a, b, x[ 7], S23, 0x676f02d9); /* 31 */ - GG (b, c, d, a, x[12], S24, 0x8d2a4c8a); /* 32 */ - - /* Round 
3 */ - HH (a, b, c, d, x[ 5], S31, 0xfffa3942); /* 33 */ - HH (d, a, b, c, x[ 8], S32, 0x8771f681); /* 34 */ - HH (c, d, a, b, x[11], S33, 0x6d9d6122); /* 35 */ - HH (b, c, d, a, x[14], S34, 0xfde5380c); /* 36 */ - HH (a, b, c, d, x[ 1], S31, 0xa4beea44); /* 37 */ - HH (d, a, b, c, x[ 4], S32, 0x4bdecfa9); /* 38 */ - HH (c, d, a, b, x[ 7], S33, 0xf6bb4b60); /* 39 */ - HH (b, c, d, a, x[10], S34, 0xbebfbc70); /* 40 */ - HH (a, b, c, d, x[13], S31, 0x289b7ec6); /* 41 */ - HH (d, a, b, c, x[ 0], S32, 0xeaa127fa); /* 42 */ - HH (c, d, a, b, x[ 3], S33, 0xd4ef3085); /* 43 */ - HH (b, c, d, a, x[ 6], S34, 0x4881d05); /* 44 */ - HH (a, b, c, d, x[ 9], S31, 0xd9d4d039); /* 45 */ - HH (d, a, b, c, x[12], S32, 0xe6db99e5); /* 46 */ - HH (c, d, a, b, x[15], S33, 0x1fa27cf8); /* 47 */ - HH (b, c, d, a, x[ 2], S34, 0xc4ac5665); /* 48 */ - - /* Round 4 */ - II (a, b, c, d, x[ 0], S41, 0xf4292244); /* 49 */ - II (d, a, b, c, x[ 7], S42, 0x432aff97); /* 50 */ - II (c, d, a, b, x[14], S43, 0xab9423a7); /* 51 */ - II (b, c, d, a, x[ 5], S44, 0xfc93a039); /* 52 */ - II (a, b, c, d, x[12], S41, 0x655b59c3); /* 53 */ - II (d, a, b, c, x[ 3], S42, 0x8f0ccc92); /* 54 */ - II (c, d, a, b, x[10], S43, 0xffeff47d); /* 55 */ - II (b, c, d, a, x[ 1], S44, 0x85845dd1); /* 56 */ - II (a, b, c, d, x[ 8], S41, 0x6fa87e4f); /* 57 */ - II (d, a, b, c, x[15], S42, 0xfe2ce6e0); /* 58 */ - II (c, d, a, b, x[ 6], S43, 0xa3014314); /* 59 */ - II (b, c, d, a, x[13], S44, 0x4e0811a1); /* 60 */ - II (a, b, c, d, x[ 4], S41, 0xf7537e82); /* 61 */ - II (d, a, b, c, x[11], S42, 0xbd3af235); /* 62 */ - II (c, d, a, b, x[ 2], S43, 0x2ad7d2bb); /* 63 */ - II (b, c, d, a, x[ 9], S44, 0xeb86d391); /* 64 */ - - state[0] += a; - state[1] += b; - state[2] += c; - state[3] += d; - - /* Zeroize sensitive information. - -*/ - memset ((POINTER)x, 0, sizeof (x)); + md5_update( ctx, input, ilen ); } -/* Encodes input (UINT4) into output (unsigned char). Assumes len is - a multiple of 4. 
+/* + * MD5 HMAC final digest */ -static void Encode (output, input, len) -unsigned char *output; -UINT4 *input; -unsigned int len; +void md5_hmac_finish( md5_context *ctx, unsigned char *output ) { - unsigned int i, j; - - for (i = 0, j = 0; j < len; i++, j += 4) { - output[j] = (unsigned char)(input[i] & 0xff); - output[j+1] = (unsigned char)((input[i] >> 8) & 0xff); - output[j+2] = (unsigned char)((input[i] >> 16) & 0xff); - output[j+3] = (unsigned char)((input[i] >> 24) & 0xff); - } + unsigned char tmpbuf[16]; + + md5_finish( ctx, tmpbuf ); + md5_starts( ctx ); + md5_update( ctx, ctx->opad, 64 ); + md5_update( ctx, tmpbuf, 16 ); + md5_finish( ctx, output ); + + memset( tmpbuf, 0, sizeof( tmpbuf ) ); } -/* Decodes input (unsigned char) into output (UINT4). Assumes len is - a multiple of 4. +/* + * Output = HMAC-MD5( hmac key, input buffer ) */ -static void Decode (output, input, len) -UINT4 *output; -unsigned char *input; -unsigned int len; +void md5_hmac( unsigned char *key, int keylen, + unsigned char *input, int ilen, + unsigned char *output ) { - unsigned int i, j; + md5_context ctx; + + md5_hmac_starts( &ctx, key, keylen ); + md5_hmac_update( &ctx, input, ilen ); + md5_hmac_finish( &ctx, output ); - for (i = 0, j = 0; j < len; i++, j += 4) - output[i] = ((UINT4)input[j]) | (((UINT4)input[j+1]) << 8) | - (((UINT4)input[j+2]) << 16) | (((UINT4)input[j+3]) << 24); + memset( &ctx, 0, sizeof( md5_context ) ); } -/* vim: set ts=2 sw=2 noet: */ +static const char _md5_src[] = "_md5_src"; + +#if defined(SELF_TEST) +/* + * RFC 1321 test vectors + */ +static const char md5_test_str[7][81] = +{ + { "" }, + { "a" }, + { "abc" }, + { "message digest" }, + { "abcdefghijklmnopqrstuvwxyz" }, + { "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789" }, + { "12345678901234567890123456789012345678901234567890123456789012" \ + "345678901234567890" } +}; + +static const unsigned char md5_test_sum[7][16] = +{ + { 0xD4, 0x1D, 0x8C, 0xD9, 0x8F, 0x00, 0xB2, 0x04, + 0xE9, 
0x80, 0x09, 0x98, 0xEC, 0xF8, 0x42, 0x7E }, + { 0x0C, 0xC1, 0x75, 0xB9, 0xC0, 0xF1, 0xB6, 0xA8, + 0x31, 0xC3, 0x99, 0xE2, 0x69, 0x77, 0x26, 0x61 }, + { 0x90, 0x01, 0x50, 0x98, 0x3C, 0xD2, 0x4F, 0xB0, + 0xD6, 0x96, 0x3F, 0x7D, 0x28, 0xE1, 0x7F, 0x72 }, + { 0xF9, 0x6B, 0x69, 0x7D, 0x7C, 0xB7, 0x93, 0x8D, + 0x52, 0x5A, 0x2F, 0x31, 0xAA, 0xF1, 0x61, 0xD0 }, + { 0xC3, 0xFC, 0xD3, 0xD7, 0x61, 0x92, 0xE4, 0x00, + 0x7D, 0xFB, 0x49, 0x6C, 0xCA, 0x67, 0xE1, 0x3B }, + { 0xD1, 0x74, 0xAB, 0x98, 0xD2, 0x77, 0xD9, 0xF5, + 0xA5, 0x61, 0x1C, 0x2C, 0x9F, 0x41, 0x9D, 0x9F }, + { 0x57, 0xED, 0xF4, 0xA2, 0x2B, 0xE3, 0xC9, 0x55, + 0xAC, 0x49, 0xDA, 0x2E, 0x21, 0x07, 0xB6, 0x7A } +}; + +/* + * Checkup routine + */ +int md5_self_test( int verbose ) +{ + int i; + unsigned char md5sum[16]; + + for( i = 0; i < 7; i++ ) + { + if( verbose != 0 ) + printf( " MD5 test #%d: ", i + 1 ); + + md5( (unsigned char *) md5_test_str[i], + strlen( md5_test_str[i] ), md5sum ); + + if( memcmp( md5sum, md5_test_sum[i], 16 ) != 0 ) + { + if( verbose != 0 ) + printf( "failed\n" ); + + return( 1 ); + } + + if( verbose != 0 ) + printf( "passed\n" ); + } + + if( verbose != 0 ) + printf( "\n" ); + + return( 0 ); +} +#else +int md5_self_test( int verbose ) +{ + return( 0 ); +} +#endif diff --git a/lib/libalpm/md5.h b/lib/libalpm/md5.h index 8ae324e..8699e0d 100644 --- a/lib/libalpm/md5.h +++ b/lib/libalpm/md5.h @@ -1,53 +1,121 @@ -/* MD5.H - header file for MD5C.C +/** + * \file md5.h */ +#ifndef _MD5_H +#define _MD5_H -/* Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All -rights reserved. +#ifdef __cplusplus +extern "C" { +#endif -License to copy and use this software is granted provided that it -is identified as the "RSA Data Security, Inc. MD5 Message-Digest -Algorithm" in all material mentioning or referencing this software -or this function. 
+/** + * \brief MD5 context structure + */ +typedef struct +{ + unsigned long total[2]; /*!< number of bytes processed */ + unsigned long state[4]; /*!< intermediate digest state */ + unsigned char buffer[64]; /*!< data block being processed */ + unsigned char ipad[64]; /*!< HMAC: inner padding */ + unsigned char opad[64]; /*!< HMAC: outer padding */ +} +md5_context; -License is also granted to make and use derivative works provided -that such works are identified as "derived from the RSA Data -Security, Inc. MD5 Message-Digest Algorithm" in all material -mentioning or referencing the derived work. +/** + * \brief MD5 context setup + * + * \param ctx context to be initialized + */ +void md5_starts( md5_context *ctx ); -RSA Data Security, Inc. makes no representations concerning either -the merchantability of this software or the suitability of this -software for any particular purpose. It is provided "as is" -without express or implied warranty of any kind. +/** + * \brief MD5 process buffer + * + * \param ctx MD5 context + * \param input buffer holding the data + * \param ilen length of the input data + */ +void md5_update( md5_context *ctx, unsigned char *input, int ilen ); -These notices must be retained in any copies of any part of this -documentation and/or software. 
*/ -#ifndef _ALPM_MD5_H -#define _ALPM_MD5_H +/** + * \brief MD5 final digest + * + * \param ctx MD5 context + * \param output MD5 checksum result + */ +void md5_finish( md5_context *ctx, unsigned char *output ); -/* POINTER defines a generic pointer type */ -typedef unsigned char *POINTER; +/** + * \brief Output = MD5( input buffer ) + * + * \param input buffer holding the data + * \param ilen length of the input data + * \param output MD5 checksum result + */ +void md5( unsigned char *input, int ilen, + unsigned char *output ); -/* UINT2 defines a two byte word */ -typedef unsigned short int UINT2; +/** + * \brief Output = MD5( file contents ) + * + * \param path input file name + * \param output MD5 checksum result + * + * \return 0 if successful, 1 if fopen failed, + * or 2 if fread failed + */ +int md5_file( char *path, unsigned char *output ); -/* UINT4 defines a four byte word */ -typedef unsigned int UINT4; +/** + * \brief MD5 HMAC context setup + * + * \param ctx HMAC context to be initialized + * \param key HMAC secret key + * \param keylen length of the HMAC key + */ +void md5_hmac_starts( md5_context *ctx, + unsigned char *key, int keylen ); +/** + * \brief MD5 HMAC process buffer + * + * \param ctx HMAC context + * \param input buffer holding the data + * \param ilen length of the input data + */ +void md5_hmac_update( md5_context *ctx, + unsigned char *input, int ilen ); -/* MD5 context. 
*/ -typedef struct { - UINT4 state[4]; /* state (ABCD) */ - UINT4 count[2]; /* number of bits, modulo 2^64 (lsb first) */ - unsigned char buffer[64]; /* input buffer */ -} MD5_CTX; +/** + * \brief MD5 HMAC final digest + * + * \param ctx HMAC context + * \param output MD5 HMAC checksum result + */ +void md5_hmac_finish( md5_context *ctx, unsigned char *output ); -void _alpm_MD5Init(MD5_CTX *); -void _alpm_MD5Update(MD5_CTX *, unsigned char *, unsigned int); -void _alpm_MD5Final(unsigned char [16], MD5_CTX *); +/** + * \brief Output = HMAC-MD5( hmac key, input buffer ) + * + * \param key HMAC secret key + * \param keylen length of the HMAC key + * \param input buffer holding the data + * \param ilen length of the input data + * \param output HMAC-MD5 result + */ +void md5_hmac( unsigned char *key, int keylen, + unsigned char *input, int ilen, + unsigned char *output ); -char* _alpm_MDFile(char *); -void _alpm_MDPrint(unsigned char [16]); +/** + * \brief Checkup routine + * + * \return 0 if successful, or 1 if the test failed + */ +int md5_self_test( int verbose ); -#endif /* _ALPM_MD5_H */ +#ifdef __cplusplus +} +#endif -/* vim: set ts=2 sw=2 noet: */ +#endif /* md5.h */ diff --git a/lib/libalpm/md5driver.c b/lib/libalpm/md5driver.c deleted file mode 100644 index caeddc1..0000000 --- a/lib/libalpm/md5driver.c +++ /dev/null @@ -1,93 +0,0 @@ -/* MD5DRIVER.C - taken and modified from MDDRIVER.C (license below) */ -/* for use in pacman. */ -/*********************************************************************/ - -/* Copyright (C) 1990-2, RSA Data Security, Inc. Created 1990. All -rights reserved. - -RSA Data Security, Inc. makes no representations concerning either -the merchantability of this software or the suitability of this -software for any particular purpose. It is provided "as is" -without express or implied warranty of any kind. - -These notices must be retained in any copies of any part of this -documentation and/or software. 
- */ - -/* The following makes MD default to MD5 if it has not already been - defined with C compiler flags. - */ -#define MD MD5 - -#include "config.h" - -#include <stdlib.h> -#include <stdio.h> -#include <string.h> - -/* libalpm */ -#include "alpm.h" -#include "log.h" -#include "util.h" -#include "md5.h" - -/* Length of test block, number of test blocks. - */ -#define TEST_BLOCK_LEN 1000 -#define TEST_BLOCK_COUNT 1000 - -#define MD_CTX MD5_CTX -#define MDInit _alpm_MD5Init -#define MDUpdate _alpm_MD5Update -#define MDFinal _alpm_MD5Final - -/** Get the md5 sum of file. - * @param name name of the file - * @return the checksum on success, NULL on error - * @addtogroup alpm_misc - */ -char SYMEXPORT *alpm_get_md5sum(char *name) -{ - ALPM_LOG_FUNC; - - ASSERT(name != NULL, return(NULL)); - - return(_alpm_MDFile(name)); -} - -char* _alpm_MDFile(char *filename) -{ - FILE *file; - MD_CTX context; - int len; - char hex[3]; - unsigned char buffer[1024], digest[16]; - - ALPM_LOG_FUNC; - - if((file = fopen(filename, "rb")) == NULL) { - _alpm_log(PM_LOG_ERROR, _("md5: %s can't be opened\n"), filename); - } else { - char *ret; - int i; - - MDInit(&context); - while((len = fread(buffer, 1, 1024, file))) { - MDUpdate(&context, buffer, len); - } - MDFinal(digest, &context); - fclose(file); - - ret = calloc(33, sizeof(char)); - for(i = 0; i < 16; i++) { - snprintf(hex, 3, "%02x", digest[i]); - strncat(ret, hex, 2); - } - - _alpm_log(PM_LOG_DEBUG, "md5(%s) = %s", filename, ret); - return(ret); - } - return(NULL); -} - -/* vim: set ts=2 sw=2 noet: */ diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index 22d209a..cf33d83 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -39,7 +39,6 @@ #include "trans.h" #include "util.h" #include "error.h" -#include "md5.h" #include "log.h" #include "backup.h" #include "package.h" diff --git a/lib/libalpm/sync.c b/lib/libalpm/sync.c index a456aae..c083292 100644 --- a/lib/libalpm/sync.c +++ b/lib/libalpm/sync.c @@ -47,7 +47,6 
@@ #include "util.h" #include "handle.h" #include "alpm.h" -#include "md5.h" #include "server.h" pmsyncpkg_t *_alpm_sync_new(int type, pmpkg_t *spkg, void *data) diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c index 5ead8a2..5580690 100644 --- a/lib/libalpm/util.c +++ b/lib/libalpm/util.c @@ -48,6 +48,7 @@ #include "error.h" #include "package.h" #include "alpm.h" +#include "md5.h" #ifndef HAVE_STRVERSCMP /* GNU's strverscmp() function, taken from glibc 2.3.2 sources @@ -540,5 +541,35 @@ int _alpm_str_cmp(const void *s1, const void *s2) return(strcmp(s1, s2)); } +/** Get the md5 sum of file. + * @param filename name of the file + * @return the checksum on success, NULL on error + * @addtogroup alpm_misc + */ +char SYMEXPORT *alpm_get_md5sum(char *filename) +{ + unsigned char *md5sum = NULL; + + ALPM_LOG_FUNC; + + ASSERT(filename != NULL, return(NULL)); + + md5sum = malloc(32); + int ret = md5_file(filename, md5sum); + + if (ret > 0) { + if (ret == 1) + _alpm_log(PM_LOG_ERROR, _("md5: %s can't be opened\n"), filename); + else if (ret == 2) + _alpm_log(PM_LOG_ERROR, _("md5: %s can't be read\n"), filename); + + return(NULL); + } + + _alpm_log(PM_LOG_DEBUG, "md5(%s) = %s", filename, md5sum); + return((char *)md5sum); +} + + /* vim: set ts=2 sw=2 noet: */ -- 1.5.2.4
On 7/25/07, Andrew Fyfe <andrew@neptune-one.net> wrote:
* Move alpm md5 functions to lib/libalpm/util.c
* Remove unneeded includes for md5.h
* Replace md5 implementation with one from http://www.xyssl.org

Signed-off-by: Andrew Fyfe <andrew@neptune-one.net>
---
 lib/libalpm/Makefile.am |    1 -
 lib/libalpm/add.c       |    7 +-
 lib/libalpm/md5.c       |  661 +++++++++++++++++++++++++++++------------------
 lib/libalpm/md5.h       |  144 ++++++++---
 lib/libalpm/md5driver.c |   93 -------
 lib/libalpm/remove.c    |    1 -
 lib/libalpm/sync.c      |    1 -
 lib/libalpm/util.c      |   31 +++
 8 files changed, 547 insertions(+), 392 deletions(-)
I've now pulled both the SHA1 removal patch and this one into my working branch. However, this one needed a few fixes which should be reflected in the diff I hacked up below. Two major things to point out in the diff:

1. Even on one line if/for/looping statements, use {}. This is pacman coding style and helps keep us consistent, and it cuts out stupid bugs.

2. Watch your mallocs, and use calloc when possible. You didn't allocate space for the null byte, so you were overrunning your buffers when you filled them and the free() failed when using mtrace(). I switched to calloc usage, and now use sprintf because this is a case where we can do that: it is faster and we aren't worried about running out of room. We then need to take care of the null byte ourselves, however.

I'll give you a break on some of this because you are venturing into C code where few have gone before, and you probably weren't aware of the rules. I think this is the most recent version of them: http://www.archlinux.org/~aaron/pacman-coding.html

Finally, I cleaned up the imported md5.c/md5.h from XySSL a bit. I removed the HMAC and SELF_TEST stuff we won't use, as well as threw the LGPL header at the top of md5.h and put instructions for upgrading the md5 routines in md5.c.
-Dan

diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c
index 0f47e90..5f43117 100644
--- a/lib/libalpm/util.c
+++ b/lib/libalpm/util.c
@@ -550,26 +551,32 @@ char SYMEXPORT *alpm_get_md5sum(char *filename)
 {
 	unsigned char output[16];
 	char *md5sum;
+	int ret, i;
 
 	ALPM_LOG_FUNC;
 
 	ASSERT(filename != NULL, return(NULL));
 
-	md5sum = (char*)malloc(32);
-	int ret = md5_file(filename, output);
+	/* allocate 32 chars plus 1 for null */
+	md5sum = calloc(33, sizeof(char));
+	ret = md5_file(filename, output);
 
 	if (ret > 0) {
-		if (ret == 1)
+		if (ret == 1) {
 			_alpm_log(PM_LOG_ERROR, _("md5: %s can't be opened\n"), filename);
-		else if (ret == 2)
+		} else if (ret == 2) {
 			_alpm_log(PM_LOG_ERROR, _("md5: %s can't be read\n"), filename);
+		}
 
 		return(NULL);
 	}
 
 	/* Convert the result to something readable */
-	for (unsigned int i = 0; i < 16; i++)
-		snprintf(md5sum + i * 2, 33, "%02x", output[i]);
+	for (i = 0; i < 16; i++) {
+		/* sprintf is acceptable here because we know our output */
+		sprintf(md5sum +(i * 2), "%02x", output[i]);
+	}
 
+	md5sum[32] = '\0';
 	_alpm_log(PM_LOG_DEBUG, "md5(%s) = %s", filename, md5sum);
 	return(md5sum);
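The buffer sizing discussed in this message can be tried in isolation. The following is a minimal sketch, not code from the pacman tree: digest_to_hex is a hypothetical helper name, and it assumes a 16-byte MD5 digest has already been computed elsewhere (for example by md5_file). It allocates 33 bytes, 32 hex characters plus the terminating null byte, and writes two hex digits per input byte.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libalpm): convert a 16-byte MD5 digest
 * into a newly allocated, null-terminated 32-character hex string.
 * Returns NULL if the allocation fails; the caller must free() the result. */
char *digest_to_hex(const unsigned char digest[16])
{
	int i;
	/* 32 hex characters plus 1 for the terminating null byte */
	char *hex = calloc(33, sizeof(char));

	if (hex == NULL) {
		return NULL;
	}

	for (i = 0; i < 16; i++) {
		/* "%02x" always produces exactly two characters for an
		 * unsigned char, so each write stays inside the 33 bytes */
		sprintf(hex + (i * 2), "%02x", digest[i]);
	}
	/* redundant after calloc, but safe if the allocator is ever changed */
	hex[32] = '\0';

	return hex;
}
```

With malloc(32) instead of calloc(33, ...), the last sprintf call would write its null byte one past the end of the buffer, which is the overrun described above.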
On Thu, Aug 16, 2007 at 01:36:31PM -0400, Dan McGee wrote:
2. Watch your mallocs, and use calloc when possible. You didn't allocate space for the null byte, so you were overrunning your buffers when you filled them and the free() failed when using mtrace(). I switched to calloc usage, and now use sprintf because this is a case where we can do that- it is faster and we aren't worried about running out of room. We then need to take care of the null byte ourselves, however.
Why is it better to use calloc? And when using calloc, is it still needed to set the null byte a second time?
On 8/16/07, Xavier <shiningxc@gmail.com> wrote:
On Thu, Aug 16, 2007 at 01:36:31PM -0400, Dan McGee wrote:
2. Watch your mallocs, and use calloc when possible. You didn't allocate space for the null byte, so you were overrunning your buffers when you filled them and the free() failed when using mtrace(). I switched to calloc usage, and now use sprintf because this is a case where we can do that- it is faster and we aren't worried about running out of room. We then need to take care of the null byte ourselves, however.
Why is it better to use calloc? And when using calloc, is it still needed to set the null byte a second time?
Calloc initializes the memory to zero, which makes problems a lot easier to debug when something goes wrong. In addition, the call makes it very clear how many elements of what size you are asking for.

Regarding the null byte thing: calloc zeros the memory, and while I would assume that is the same as the null byte, I don't want to take it for granted, so I figured I'd set it explicitly anyway.

-Dan
On Thu, 16 Aug 2007 14:45:40 -0400 "Dan McGee" <dpmcgee@gmail.com> wrote:
Regarding the null byte thing- calloc zeros the memory, and while I would assume that is the same as the null byte,
It is. Setting the end of the string to '\0' (which == 0) is unnecessary - but it won't waste much CPU time and it prevents issues in the future (if it's ever changed from calloc to malloc again), so I don't see a problem with it being there. -- Travis
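For anyone who wants to convince themselves of this, here is a small sketch. The function name calloc_is_nul_filled is made up for illustration; it only relies on the standard calloc and free. It checks that every byte of a freshly calloc'ed char buffer compares equal to '\0'.

```c
#include <stdlib.h>

/* Hypothetical check (illustration only): returns 1 if every byte of a
 * fresh calloc() buffer of len chars equals '\0', and 0 otherwise or if
 * the allocation fails. */
int calloc_is_nul_filled(size_t len)
{
	size_t i;
	int ok = 1;
	char *buf = calloc(len, sizeof(char));

	if (buf == NULL) {
		return 0;
	}

	for (i = 0; i < len; i++) {
		/* '\0' is the char with value 0, which is what calloc stores */
		if (buf[i] != '\0') {
			ok = 0;
			break;
		}
	}

	free(buf);
	return ok;
}
```

So a buffer from calloc(33, sizeof(char)) is already a valid empty string, and the explicit md5sum[32] = '\0' only matters if the allocation is later switched back to malloc.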
On Thu, Aug 16, 2007 at 06:09:58PM -0400, Travis Willard wrote:
On Thu, 16 Aug 2007 14:45:40 -0400 "Dan McGee" <dpmcgee@gmail.com> wrote:
Regarding the null byte thing- calloc zeros the memory, and while I would assume that is the same as the null byte,
It is. Setting the end of the string to '\0' (which == 0) is unnecessary - but it won't waste much CPU time and it prevents issues in the future (if it's ever changed from calloc to malloc again), so I don't see a problem with it being there.
Right, fair enough ;)
On 7/25/07, Andrew Fyfe <andrew@neptune-one.net> wrote:
There's no need for a second hashing algorithm. MD5 serves the purpose of verifying that a package file hasn't been corrupted during download.
Signed-off-by: Andrew Fyfe <andrew@neptune-one.net>
So I've been thinking this one over for a while. On one hand, I agree with the thought. For sure, I think we don't need more than one hashing algorithm. The only real question is whether we should switch to sha1 or not. If no, then this sequence of two patches should be applied. What I really want to hear are thoughts on this issue. We are using md5sums for two main reasons- verification of package downloads, and determining whether a backup file has changed. With this in mind, I think md5 is sufficient to serve our needs. Please chime in on this. -Dan
On 8/15/07, Dan McGee <dpmcgee@gmail.com> wrote:
What I really want to hear are thoughts on this issue. We are using md5sums for two main reasons- verification of package downloads, and determining whether a backup file has changed. With this in mind, I think md5 is sufficient to serve our needs.
Please chime in on this.
There is some history on this somewhere in these list archives. I'll summarize my views because I don't want to figure out what thread that was.

a) The "md5 is insecure" argument doesn't hold water with archive formats. Reproducing an md5sum with a malicious file requires that the original file format supports null padding. All of the examples I've seen used ps files as you can embed null padding to fluff the md5sum. In our case, if you add some padding, it suddenly becomes a corrupt archive. Corrupt archives are already checked for before extraction, so if the md5sum matches AND it's corrupt, it's either a packager's error, or malicious.

b) We are not using md5 for security. We are using it for integrity. These are two totally different things. Instead of saying "I don't trust you Mr Mirror", we're saying "I trust the DB file is correct, did this download ok". See now there's a subtle problem with this point. If we want to implicitly trust the DB files, then we need to ensure where they come from. DB files on mirrors might not be "trustable". /me shrugs

But my opinion is thus: md5 is faster than sha1, and we're just ensuring that we downloaded the file exactly as the server told us to. We are not guaranteeing that it is super-duper secure. If we wanted that, we'd sign packages.

I vote md5
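The integrity check described in point b) boils down to comparing two hex strings: the sum recorded in the DB file and the sum computed from the downloaded file. A minimal sketch follows; checksum_matches is a hypothetical name, not a libalpm function, and it assumes both sums are 32-character lowercase hex strings such as the ones alpm_get_md5sum produces.

```c
#include <string.h>

/* Hypothetical helper (not part of libalpm): returns 1 if the checksum
 * computed from the downloaded file equals the one recorded in the DB,
 * and 0 on any mismatch or malformed input. Both arguments are assumed
 * to be 32-character lowercase hex MD5 strings. */
int checksum_matches(const char *expected, const char *computed)
{
	if (expected == NULL || computed == NULL) {
		return 0;
	}
	/* an MD5 sum printed as hex is always 32 characters long */
	if (strlen(expected) != 32 || strlen(computed) != 32) {
		return 0;
	}
	return strcmp(expected, computed) == 0;
}
```

This only answers "did this download ok"; it says nothing about whether the DB file itself can be trusted, which is the subtle problem raised above.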
Aaron Griffin wrote:
On 8/15/07, Dan McGee <dpmcgee@gmail.com> wrote:
What I really want to hear are thoughts on this issue. We are using md5sums for two main reasons- verification of package downloads, and determining whether a backup file has changed. With this in mind, I think md5 is sufficient to serve our needs.
Please chime in on this.
There is some history on this somewhere in these list archives. I'll summarize my views because I don't want to figure out what thread that was.
a) The "md5 is insecure" argument doesn't hold water with archive formats. Reproducing an md5sum with a malicious file requires that the original file format supports null padding. All of the examples I've seen used ps files as you can embed null padding to fluff the md5sum. In our case, if you add some padding, it suddenly becomes a corrupt archive. Corrupt archives are already checked for before extraction, so if the md5sum matches AND it's corrupt, it's either a packager's error, or malicious. b) We are not using md5 for security. We are using it for integrity. These are two totally different things. Instead of saying "I don't trust you Mr Mirror", we're saying "I trust the DB file is correct, did this download ok". See now there's a subtle problem with this point. If we want to implicitly trust the DB files, then we need to ensure where they come from. DB files on mirrors might not be "trustable". /me shrugs
But my opinion is thus: md5 is faster than sha1, and we're just ensuring that we downloaded the file exactly as the server told us to. We are not guaranteeing that it is super-duper secure. If we wanted that, we'd sign packages. I vote md5
+1 here
I've made a few tweaks to the patch... http://neptune-one.homeip.net/git?p=pacman;a=shortlog;h=ready_to_pull Andrew
On 8/16/07, Andrew Fyfe <andrew@neptune-one.net> wrote:
Aaron Griffin wrote:
On 8/15/07, Dan McGee <dpmcgee@gmail.com> wrote:
What I really want to hear are thoughts on this issue. We are using md5sums for two main reasons- verification of package downloads, and determining whether a backup file has changed. With this in mind, I think md5 is sufficient to serve our needs.
Please chime in on this.
There is some history on this somewhere in these list archives. I'll summarize my views because I don't want to figure out what thread that was.
a) The "md5 is insecure" argument doesn't hold water with archive formats. Reproducing an md5sum with a malicious file requires that the original file format supports null padding. All of the examples I've seen used ps files as you can embed null padding to fluff the md5sum. In our case, if you add some padding, it suddenly becomes a corrupt archive. Corrupt archives are already checked for before extraction, so if the md5sum matches AND it's corrupt, it's either a packager's error, or malicious. b) We are not using md5 for security. We are using it for integrity. These are two totally different things. Instead of saying "I don't trust you Mr Mirror", we're saying "I trust the DB file is correct, did this download ok". See now there's a subtle problem with this point. If we want to implicitly trust the DB files, then we need to ensure where they come from. DB files on mirrors might not be "trustable". /me shrugs
But my opinion is thus: md5 is faster than sha1, and we're just ensuring that we downloaded the file exactly as the server told us to. We are not guaranteeing that it is super-duper secure. If we wanted that, we'd sign packages. I vote md5
+1 here
I've made a few tweaks to the patch... http://neptune-one.homeip.net/git?p=pacman;a=shortlog;h=ready_to_pull
The diffstat on that patch is exactly the same as the one that was in this email. Has it really changed? I'm just referring to the "Remove SHA1" patch, not the "cleanup MD5sum" one. -Dan
Hello, On Wed, Aug 15, 2007 at 10:55:20PM -0400, Dan McGee <dpmcgee@gmail.com> wrote:
with the thought. For sure, I think we don't need more than one hashing algorithm.
FreeBSD uses 3.. :) - VMiklos
On 7/3/07, Andrew Fyfe <andrew@neptune-one.net> wrote:
I asked this question a while ago about makepkg now I'm asking about pacman... why do we need support for multiple checksum types? What's wrong with md5?
My problem is more with the fact that we have 5 functions and 1 field in pmpkg_t for each checksum and we have to do
Hey there. Maybe you are using an "outdated" piece of code in the Arch version of pacman3. When the Frugalware team contributed back to pacman, there was some discussion about the md5->sha1 change. We used multiple checksums (md5 and sha1) for backward compatibility, so when we switched from md5 to sha1, users did not notice anything, because pacman supported both md5 and sha1. Now in pacman-g2 this md5 code isn't necessary anymore. And I think you (pacman3) don't need this one either if you don't want to switch to sha1.

So, after all: multiple checksum routines (md5 and sha1 at the same time) were in the code for backward compatibility. (We did it this way too when we switched from gzipped packages to bzip2ed packages.)

Regards -krix-
On 7/3/07, Andrew Fyfe <andrew@neptune-one.net> wrote:
I asked this question a while ago about makepkg now I'm asking about pacman... why do we need support for multiple checksum types? What's wrong with md5?
Frugalware switched to using sha1sums, so that is a big reason. If you can find a good reason to pull support for it, let us know, but as of right now I don't see a reason to remove it, even if we don't use it. Any other distribution using pacman may want to use it, so the option is there. -Dan
On Tue, Jul 03, 2007 at 03:55:06PM -0400, Dan McGee <dpmcgee@gmail.com> wrote:
Frugalware switched to using sha1sums, so that is a big reason. If you can find a good reason to pull support for it, let us know, but as of right now I don't see a reason to remove it, even if we don't use it. Any other distribution using pacman may want to use it, so the option is there.
I think if you generate those arrays using makepkg -g then it doesn't make sense. It's useful when upstream provides sha1sums on its website; then you can copy & paste them into the buildscript.

thanks, VMiklos -- developer of Frugalware Linux - http://frugalware.org
participants (10)
- Aaron Griffin
- Andrew Fyfe
- Christian Hamar [krix]
- Dan McGee
- Jason Chu
- Jeff Mickey
- Mateusz Jedrasik
- Travis Willard
- VMiklos
- Xavier