[arch-general] Btrfs more than twice as fast compared to ext4

Shridhar Daithankar ghodechhap at ghodechhap.net
Tue Mar 16 01:48:23 CET 2010


On Monday 15 March 2010 15:44:35 Nathan Wayde wrote:
> On 13/03/10 03:05, Shridhar Daithankar wrote:
> > Hi,
> > 
> > Just wanted to share an interesting experience I had today.
> > 
> > Check http://ghodechhap.net/btrfs.performance.txt
> 
> Maybe you're looking for http://docs.python.org/library/filecmp.html
> 
> One cannot help but think that you took a disk-bound process and turned
> it into a cpu-bound one. Since you're just interested in which files are
> different you should have just used `cmp` instead of `md5sum`
> the latter is just overkill and I'd assume calling an external command
> that many times can't be very nice either.
> 
> here are some comparisons, they use /usr/lib - i figured 75000 files
> should be a good test... I made this as deliberately
> unfair/in-comparable as possible, I wanted to show the potential
> overhead of calling md5sum that many times.

I didn't know of cmp, thanks. I tried the same thing with cmp in loops, and the
results agree with your comments: the cmp run is totally I/O bound, not CPU
bound at all.

However, even in the md5sum case I/O was high too; the disk light was on all
the time. Maybe the difference comes down to CPU speed.

But as far as file system performance goes, the overhead should be identical
for both runs, no?

Besides, I need to run the comparison (or rather, verification of file
contents) many times over during the application life-cycle, and I cannot
afford to bring in a second copy from disk. The working set is expected to be
30-40GB at a time; 3GB is just the test setup.

With md5sum, I can store the checksums in a database and verify against one
copy only.
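
A minimal sketch of that store-once, verify-later workflow, using a plain text
manifest in place of a database. The function names make_manifest and
verify_manifest are hypothetical, and the sketch assumes GNU coreutils and
findutils (md5sum --quiet, sort -z, xargs -0 -r). The manifest should live
outside the tree being checked.

```shell
# make_manifest DIR MANIFEST (hypothetical helper)
# Write one "checksum  ./relative/path" line per regular file under DIR.
make_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r md5sum ) > "$2"
}

# verify_manifest DIR MANIFEST (hypothetical helper)
# Re-read the files under DIR and check them against the stored checksums.
# Only this one copy is read; mismatches are printed and the exit status
# is non-zero if any file fails.
verify_manifest() {
    ( cd "$1" && md5sum --quiet -c - ) < "$2"
}
```

Usage would be something like make_manifest /some/tree /some/where/tree.md5
once, then verify_manifest /some/tree /some/where/tree.md5 whenever the
contents need to be checked again; both paths here are placeholders.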

And finally, cmp is terrible on timings. Running md5sum is a lot faster, about
3 times in the best case.

shridhar at bheem /mnt1/shridhar/tmp/importtest.big$ time for i in `find . -type f`;do cmp "$i" "/data/shridhar/tmp/4/$i";done

real    21m30.137s
user    0m27.665s
sys     1m21.581s
shridhar at bheem /data/shridhar/tmp/4$ time for i in `find . -type f`;do cmp "$i" "/mnt1/shridhar/tmp/importtest.big/$i";done

real    6m26.988s
user    0m40.721s
sys     1m28.371s
shridhar at bheem /mnt1/shridhar/tmp/importtest.big$ time for i in `find . -type f`;do cmp "$i" "/data/shridhar/tmp/4/$i";done

real    16m27.541s
user    0m37.281s
sys     1m23.995s

So when the source file system is btrfs, it is still at least a couple of
times faster.
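
The loops above split the output of find on whitespace, so they would trip
over file names containing spaces. A sketch of the same cmp comparison that
avoids this is below. It is not what produced the timings above, and the
function name compare_trees is hypothetical. DST has to be an absolute path
because the function changes into SRC first.

```shell
# compare_trees SRC DST (hypothetical helper)
# Print the relative path of every regular file under SRC that is missing
# from DST or whose contents differ from its counterpart in DST.
compare_trees() {
    ( cd "$1" && find . -type f -exec sh -c '
        dst=$1; shift
        for f in "$@"; do
            # cmp -s is silent; non-zero means "differs" or "cannot read"
            cmp -s "$f" "$dst/$f" || printf "%s\n" "$f"
        done' sh "$2" {} + )
}
```

Passing the files to sh in batches with "{} +" also means one shell per batch
of files rather than one word-split list built up front, though cmp itself is
still started once per file.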
-- 
Regards 
 Shridhar

