[pacman-dev] Crypto algorithms
We have a rather ragtag set of cryptography functions in use right now, and weird files being used (md5driver? wtf). Looking around I came across these LGPLed hash libraries that we can probably use in pacman.

BeeCrypt- http://directory.fsf.org/security/crypt/BeeCrypt.html
Documentation- http://beecrypt.sourceforge.net/doxygen/c/index.html

Any reason not to go for this? I haven't looked super closely but I think it would make sense to clean this up a bit in order to allow for any future extensibility we may want. These seem like cleanly written algorithms that we can drop in to our code base as they are LGPL.

Thoughts appreciated. Have to start cleaning somewhere.

-Dan
2007/4/9, Dan McGee <dpmcgee@gmail.com>:
We have a rather ragtag set of cryptography functions in use right now, and weird files being used (md5driver? wtf). Looking around I came across these LGPLed hash libraries that we can probably use in pacman.
BeeCrypt- http://directory.fsf.org/security/crypt/BeeCrypt.html Documentation- http://beecrypt.sourceforge.net/doxygen/c/index.html
Any reason not to go for this? I haven't looked super closely but I think it would make sense to clean this up a bit in order to allow for any future extensibility we may want. These seem like cleanly written algorithms that we can drop in to our code base as they are LGPL.
Thoughts appreciated. Have to start cleaning somewhere.
Wow, this lib is exactly what I was looking for a long time ago. Only useful algorithms, portable, and a nice API. Looks like a very good candidate.
-- Roman Kyrylych (Роман Кирилич)
On Sun, Apr 08, 2007 at 08:21:01PM -0400, Dan McGee <dpmcgee@gmail.com> wrote:
We have a rather ragtag set of cryptography functions in use right now, and weird files being used (md5driver? wtf).
md5driver is _the_ standard md5 hashing algorithm, just like the sha1 one.

and of course it's portable after krix reverted some (probably accident) changes Judd made to the original one. grep for md5.*x86_64 in /usr/share/doc/pacman-*/NEWS. oh, you remove documentation sorry :(

i hope you'll switch to libcurl instead of "weird files being used (libdownload? wtf)" soon

or maybe using kioslave would be even better ;)

VMiklos
-- developer of Frugalware Linux - http://frugalware.org
2007/4/9, VMiklos <vmiklos@frugalware.org>:
On Sun, Apr 08, 2007 at 08:21:01PM -0400, Dan McGee <dpmcgee@gmail.com> wrote:
We have a rather ragtag set of cryptography functions in use right now, and weird files being used (md5driver? wtf).
md5driver is _the_ standard md5 hashing algorithm, just like the sha1 one. and of course it's portable
So what? We may use beecrypt. It would even allow us to do package signing.
i hope you'll switch to libcurl instead of "weird files being used (libdownload? wtf)" soon
Why??? curl is _much_ heavier. libdownload is a Linux port of BSD's libfetch, which is a nice library.
or maybe using kioslave would be even better ;)
use it yourself ;) -- Roman Kyrylych (Роман Кирилич)
On 4/9/07, VMiklos <vmiklos@frugalware.org> wrote:
On Sun, Apr 08, 2007 at 08:21:01PM -0400, Dan McGee <dpmcgee@gmail.com> wrote:
We have a rather ragtag set of cryptography functions in use right now, and weird files being used (md5driver? wtf).
md5driver is _the_ standard md5 hashing algorithm, just like the sha1 one. and of course it's portable after krix reverted some (probably accident) changes Judd made to the original one. grep for md5.*x86_64 in /usr/share/doc/pacman-*/NEWS. oh, you remove documentation sorry :(
One of the biggest issues I had was this code in sha1.h:

    /* TODO check this comment */
    /* The code below is from md5.h (from coreutils), little modifications */
    #define UINT_MAX_32_BITS 4294967295U

    /* This new ifdef allows splint to not fail on its static code check */
    #ifdef S_SPLINT_S
    typedef unsigned int sha_uint32;
    #else
    #if UINT_MAX == UINT_MAX_32_BITS
    typedef unsigned int sha_uint32;
    #else
    #if USHRT_MAX == UINT_MAX_32_BITS
    typedef unsigned short sha_uint32;
    #else
    #if ULONG_MAX == UINT_MAX_32_BITS
    typedef unsigned long sha_uint32;
    #else
    /* The following line is intended to evoke an error. Using #error is not portable enough. */
    #error "Cannot determine unsigned 32-bit data type"
    #endif /* ULONG_MAX */
    #endif /* USHRT_MAX */
    #endif /* UINT_MAX */
    #endif /* S_SPLINT_S */

    /* We have to make a guess about the integer type equivalent in size to pointers which should always be correct. */
    typedef unsigned long int sha_uintptr;

Every semantic parser except GCC chokes on this code because it is doing so many screwy things. I'd rather use a set of tools that uses similar calls across all hashing methods in order to allow for future extensibility. In addition, BeeCrypt provides some ASM optimized implementations of algorithms, and knowing what pacman's current choke points are this is a good thing.

-Dan
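As a rough illustration of the two points in this message, here is a minimal C sketch, assuming a C99 toolchain. It takes the 32-bit type from `<stdint.h>` (`uint32_t`) instead of probing `UINT_MAX`/`USHRT_MAX`/`ULONG_MAX`, and it routes two algorithms through the same init/update/final calls. Everything named here (`hash_word`, `hash_ctx`, `hash_algo`, `hash_buffer`, `HASH_FNV1A`, `HASH_ADLER32`) is hypothetical and is not BeeCrypt's or pacman's API. The two algorithms are FNV-1a and Adler-32, small non-cryptographic stand-ins chosen only to keep the sketch short; they are not substitutes for MD5 or SHA-1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C99 exact-width type; stands in for the preprocessor cascade in sha1.h. */
typedef uint32_t hash_word;

/* Hypothetical shared context, large enough for both toy algorithms. */
typedef struct hash_ctx {
    hash_word s[2];
} hash_ctx;

/* Hypothetical uniform interface: every algorithm exposes the same calls. */
typedef struct hash_algo {
    const char *name;
    void (*init)(hash_ctx *ctx);
    void (*update)(hash_ctx *ctx, const unsigned char *data, size_t len);
    hash_word (*final)(const hash_ctx *ctx);
} hash_algo;

/* Toy algorithm 1: 32-bit FNV-1a (non-cryptographic). */
void fnv1a_init(hash_ctx *ctx) { ctx->s[0] = 2166136261u; ctx->s[1] = 0; }

void fnv1a_update(hash_ctx *ctx, const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        ctx->s[0] ^= data[i];
        ctx->s[0] *= 16777619u;
    }
}

hash_word fnv1a_final(const hash_ctx *ctx) { return ctx->s[0]; }

/* Toy algorithm 2: Adler-32 checksum (also non-cryptographic). */
void adler32_init(hash_ctx *ctx) { ctx->s[0] = 1; ctx->s[1] = 0; }

void adler32_update(hash_ctx *ctx, const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        ctx->s[0] = (ctx->s[0] + data[i]) % 65521u;
        ctx->s[1] = (ctx->s[1] + ctx->s[0]) % 65521u;
    }
}

hash_word adler32_final(const hash_ctx *ctx)
{
    return (ctx->s[1] << 16) | ctx->s[0];
}

const hash_algo HASH_FNV1A = { "fnv1a", fnv1a_init, fnv1a_update, fnv1a_final };
const hash_algo HASH_ADLER32 = { "adler32", adler32_init, adler32_update, adler32_final };

/* One-shot helper: the caller picks the algorithm, the call sequence stays the same. */
hash_word hash_buffer(const hash_algo *algo, const void *data, size_t len)
{
    hash_ctx ctx;
    assert(algo != NULL && (data != NULL || len == 0));
    algo->init(&ctx);
    algo->update(&ctx, (const unsigned char *)data, len);
    return algo->final(&ctx);
}
```

With this shape, `hash_buffer(&HASH_ADLER32, "Wikipedia", 9)` returns `0x11E60398`, and adding another algorithm means adding one more `hash_algo` entry rather than another set of differently named functions.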
Every semantic parser except GCC chokes on this code because it is doing so many screwy things. I'd rather use a set of tools that uses similar calls across all hashing methods in order to allow for future extensibility. In addition, BeeCrypt provides some ASM optimized implementations of algorithms, and knowing what pacman's current choke points are this is a good thing.
Hi

Just a notice about ASM stuff* :)

This is sha1sum from coreutils:

krics@frugal64:~/asmutils-0.18/src$ time sha1sum ./NVIDIA-Linux-x86_64-1.0-9755-pkg2.run
28fb1ac0948a583e8c760b541e19614f3f6ba0d4 ./NVIDIA-Linux-x86_64-1.0-9755-pkg2.run

real 0m0.072s
user 0m0.062s
sys 0m0.010s

And this is ./sha1sum from asmutils-0.18 <- http://asm.sourceforge.net/asmutils.html

krics@frugal64:~/asmutils-0.18/src$ time ./sha1sum ./NVIDIA-Linux-x86_64-1.0-9755-pkg2.run
28fb1ac0948a583e8c760b541e19614f3f6ba0d4 ./NVIDIA-Linux-x86_64-1.0-9755-pkg2.run

real 0m0.774s
user 0m0.554s
sys 0m0.216s

:S maybe i did something wrong at asmutils compile, but seems it is not faster :S its slower. Maybe not a big and good "benchmark"

Not posted this because of flaming or anything else.

Ps.: And as i know md5driver or sha1*.c in pacman comes from coreutils source with some modification. Correct me if i'm wrong.

Regards
Christian Hamar alias krix
Frugalware Developer Team Hungary
:S maybe i did something wrong at asmutils compile, but seems it is not faster :S its slower.
Yes, I also found that asmutils is slower. (asmutils: OS = LINUX, KERNEL = 26, OPTIMIZE = SPEED, SYSCALL = KERNEL vs. coreutils 6.9-1 standard i686 AL package)

asmutils:
---------
real 0m1.473s
user 0m0.734s
sys 0m0.302s

coreutils:
----------
real 0m1.103s
user 0m0.194s
sys 0m0.035s

Bye, Nagy Gabor
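In both runs above, user plus sys time is well below real time, so a good part of the wall-clock figure was spent outside the hashing code (for example waiting on file reads). One hedged way to compare only the hashing loop is to hash a buffer that is already in memory and measure CPU time with the standard `clock()` call. The sketch below assumes C99; `time_hash_cpu` is a hypothetical helper, and the FNV-1a function is only a stand-in workload, not SHA-1 and not code from coreutils or asmutils.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Stand-in workload: 32-bit FNV-1a. Swap in the real hash to be measured. */
uint32_t fnv1a_32(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Hash an in-memory buffer `rounds` times and return the CPU seconds used.
 * No file I/O happens inside the timed region, so disk and page-cache
 * effects do not show up in the result.
 */
double time_hash_cpu(const unsigned char *buf, size_t len,
                     unsigned rounds, uint32_t *digest_out)
{
    volatile uint32_t sink = 0; /* keeps the loop from being optimized away */
    clock_t start = clock();

    for (unsigned i = 0; i < rounds; i++)
        sink = fnv1a_32(buf, len);

    clock_t end = clock();
    *digest_out = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running each implementation over the same buffer for enough rounds to reach a measurable duration gives numbers that are easier to compare than a single `time` run on a file, although it still says nothing about which implementation reads files more efficiently.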
2007/4/10, Nagy Gabor <ngaba@petra.hos.u-szeged.hu>:
:S maybe i did something wrong at asmutils compile, but seems it is not faster :S its slower.
Yes, I also found that asmutils is slower. (asmutils: OS = LINUX, KERNEL = 26, OPTIMIZE = SPEED, SYSCALL = KERNEL vs. coreutils 6.9-1 standard i686 AL package)
asmutils: --------- real 0m1.473s user 0m0.734s sys 0m0.302s
coreutils: ---------- real 0m1.103s user 0m0.194s sys 0m0.035s
This means that asmutils uses a slower algorithm. The _same_ code written in C and assembler cannot be slower in assembler by definition. One cannot make C code faster by _just_ rewriting it in assembler, but assembler allows better processing using internal registers, and easier use of MMX/SSE/etc. Correct use of assembler (not just "let's rewrite it in asm because it's faaaast!!!") does make code faster.
-- Roman Kyrylych (Роман Кирилич)
I'm an asm fan too. You are right, but we just claimed that asmutils is slower than coreutils; nothing more :-)
The _same_ code written in C and assembler cannot be slower in assembler by definition. This is true because by my definition the _same_ code runs at the same speed ;-) (Probably you mean algorithm here ;-) Bye, ngaba
2007/4/10, Nagy Gabor <ngaba@petra.hos.u-szeged.hu>:
I'm an asm fan too. You are right, but we just claimed that asmutils is slower than coreutils; nothing more :-)
The _same_ code written in C and assembler cannot be slower in assembler by definition. This is true because by my definition the _same_ code runs at the same speed ;-) (Probably you mean algorithm here ;-)
no, "cannot be slower" doesn't mean "should be faster" ;-) -- Roman Kyrylych (Роман Кирилич)
On Tue, Apr 10, 2007 at 02:09:19PM +0300, Roman Kyrylych wrote:
2007/4/10, Nagy Gabor <ngaba@petra.hos.u-szeged.hu>:
:S maybe i did something wrong at asmutils compile, but seems it is not faster :S its slower.
Yes, I also found that asmutils is slower. (asmutils: OS = LINUX, KERNEL = 26, OPTIMIZE = SPEED, SYSCALL = KERNEL vs. coreutils 6.9-1 standard i686 AL package)
asmutils: --------- real 0m1.473s user 0m0.734s sys 0m0.302s
coreutils: ---------- real 0m1.103s user 0m0.194s sys 0m0.035s
This means that asmutils uses a slower algorithm. The _same_ code written in C and assembler cannot be slower in assembler by definition. One cannot make C code faster by _just_ rewriting it in assembler, but assembler allows better processing using internal registers, and easier use of MMX/SSE/etc. Correct use of assembler (not just "let's rewrite it in asm because it's faaaast!!!") does make code faster.
Some people argue that compilers can do a much better job of register allocation and code optimization than people can. I believe in most non-trivial cases they're probably right. Jason
2007/4/10, Jason Chu <jason@archlinux.org>:
On Tue, Apr 10, 2007 at 02:09:19PM +0300, Roman Kyrylych wrote:
2007/4/10, Nagy Gabor <ngaba@petra.hos.u-szeged.hu>:
:S maybe i did something wrong at asmutils compile, but seems it is not faster :S its slower.
Yes, I also found that asmutils is slower. (asmutils: OS = LINUX, KERNEL = 26, OPTIMIZE = SPEED, SYSCALL = KERNEL vs. coreutils 6.9-1 standard i686 AL package)
asmutils: --------- real 0m1.473s user 0m0.734s sys 0m0.302s
coreutils: ---------- real 0m1.103s user 0m0.194s sys 0m0.035s
This means that asmutils uses a slower algorithm. The _same_ code written in C and assembler cannot be slower in assembler by definition. One cannot make C code faster by _just_ rewriting it in assembler, but assembler allows better processing using internal registers, and easier use of MMX/SSE/etc. Correct use of assembler (not just "let's rewrite it in asm because it's faaaast!!!") does make code faster.
Some people argue that compilers can do a much better job of register allocation and code optimization than people can. I believe in most non-trivial cases they're probably right.
"compilers can .... better ... than people can". Sure, they _can_, but it depends on programmer. I guess nobody will argue that good programmer can beat compiler automatics, because he _knows_ what he _needs_ while compiler _"quesses"_. :-P Also I think when all code is written in C but some parts in asm, in most cases authors did know what they are doing and why. Anyway back on topic. To me, Dan's idea to use beecrypt seems like a good idea. -- Roman Kyrylych (Роман Кирилич)
On 4/10/07, Jason Chu <jason@archlinux.org> wrote:
Some people argue that compilers can do a much better job of register allocation and code optimization than people can. I believe in most non-trivial cases they're probably right.
I was just going to say this. A C compiler is always better than you ("you" being proverbial here) at writing asm, unless you're talking a few hundred lines.
On Tue, Apr 10, 2007 at 07:34:52AM -0700, Jason Chu <jason@archlinux.org> wrote:
Some people argue that compilers can do a much better job of register allocation and code optimization than people can. I believe in most non-trivial cases they're probably right.
:) thanks to God, there are still firms where they hire people to optimize their most important algorithms in asm :)

i really don't want to hurt you but then i should assume you never got paid for such a job

of course this may not be true for all archs, but it is for arm, i'm sure :)

VMiklos
-- developer of Frugalware Linux - http://frugalware.org
Dan McGee <dpmcgee@gmail.com> said:
We have a rather ragtag set of cryptography functions in use right now, and weird files being used (md5driver? wtf). Looking around I came across these LGPLed hash libraries that we can probably use in pacman.
BeeCrypt- http://directory.fsf.org/security/crypt/BeeCrypt.html Documentation- http://beecrypt.sourceforge.net/doxygen/c/index.html
Please forgive my ignorance, why not use OpenSSL? -Ryan
On 4/9/07, Ryan Phillips <ryan-arch@trolocsis.com> wrote:
Dan McGee <dpmcgee@gmail.com> said:
We have a rather ragtag set of cryptography functions in use right now, and weird files being used (md5driver? wtf). Looking around I came across these LGPLed hash libraries that we can probably use in pacman.
BeeCrypt- http://directory.fsf.org/security/crypt/BeeCrypt.html Documentation- http://beecrypt.sourceforge.net/doxygen/c/index.html
Please forgive my ignorance, why not use OpenSSL?
We should talk to the monotone guys and find out where it stands wrt licensing issues: http://www.venge.net/mtn-wiki/SpeedySpeedySHA1 I came across that this morning as well, I don't know why I didn't put it in my last email. -Dan
participants (8)
- Aaron Griffin
- Christian Hamar [krix]
- Dan McGee
- Jason Chu
- Nagy Gabor
- Roman Kyrylych
- Ryan Phillips
- VMiklos