A couple of days ago, I covered the LZMA compression algorithm, but only from the compression side. Well, as pointed out in the comments, we need to see the other side of compressed data, and that is decompressing it. I've kept all my data files, so let's decompress them with GZIP, BZIP2 and LZMA, and see how everything fares. Also, Justin Dugger pointed out that LZMA's strength lies not in compressing and decompressing random data, but in handling similar data across multiple files. That will be addressed in another post.
So, let's get started decompressing. Remember, in my previous post I took a look at ASCII data, pictures and video, and finally similar data generated from /dev/zero. Let's decompress these files, timing each run. The point of this post, and the last, is not how well each tool compresses and decompresses, but how long it takes to perform the operation. Looking first at the ASCII data (emphasis mine):
$ time gunzip -c etc.tar.gz > etc.tar
gunzip -c etc.tar.gz > etc.tar  0.24s user 0.11s system 27% cpu 1.261 total
$ time bunzip2 -c etc.tar.bz2 > etc.tar
bunzip2 -c etc.tar.bz2 > etc.tar  0.96s user 0.12s system 53% cpu 2.022 total
$ time unlzma -c etc.tar.lzma > etc.tar
unlzma -c etc.tar.lzma > etc.tar  0.36s user 0.13s system 30% cpu 1.622 total
Impressive. On ASCII text, LZMA's decompression times were quite unlike its results when compressing. Let's see how it does on random binary data:
$ time gunzip -c pics.tar.gz > pics.tar
gunzip -c pics.tar.gz > pics.tar  1.36s user 0.29s system 34% cpu 4.822 total
$ time bunzip2 -c pics.tar.bz2 > pics.tar
bunzip2 -c pics.tar.bz2 > pics.tar  11.83s user 0.42s system 79% cpu 15.430 total
$ time unlzma -c pics.tar.lzma > pics.tar
unlzma -c pics.tar.lzma > pics.tar  10.71s user 0.34s system 78% cpu 14.049 total
Still outperforming BZIP2, even if the margin wasn't as impressive as with the ASCII data. Now, the final run, on our binary file of zeros:
$ time gunzip -c file.zero.gz > file.zero
gunzip -c file.zero.gz > file.zero  2.77s user 1.15s system 29% cpu 13.368 total
$ time bunzip2 -c file.zero.bz2 > file.zero
bunzip2 -c file.zero.bz2 > file.zero  2.36s user 1.66s system 25% cpu 15.542 total
$ time unlzma -c file.zero.lzma > file.zero
unlzma -c file.zero.lzma > file.zero  1.74s user 1.18s system 23% cpu 12.236 total
Wow. Absolutely impressive. I have to say that I was not expecting these times going into this. LZMA absolutely sucks, time-wise, when compressing at maximum, but it beats BZIP2 in every case when decompressing. Notice how much faster it was on the binary file of zeros: with max compression, it took nearly 3 minutes to complete, but decompressing took only 12 seconds. Overall, you're still better off with GZIP or BZIP2, but LZMA is holding its own with decompression.
Verdict on LZMA now? Still not my first choice, but maybe I was a bit harsh on it in the earlier post. Our benchmarks aren't over yet, though. Let's see how it compares on multiple files that share similar data between them. Supposedly, this is where LZMA shines.
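If you want to rerun a comparison like this yourself, the three timing runs can be scripted in one pass. This is only a minimal sketch under a few assumptions: it generates its own small sample file rather than using my archives, it skips any compressor that isn't installed, and on newer systems the lzma/unlzma commands may come from the xz-utils package.

```shell
#!/bin/sh
# Create a small sample file, compress it with each tool that is
# installed, then time the matching decompressor on each archive.
head -c 1000000 /dev/zero > sample

# Compress with a given tool only if it exists on this system.
compress() { command -v "$1" > /dev/null && "$1" -c sample > "$2"; }
compress gzip  sample.gz
compress bzip2 sample.bz2
compress lzma  sample.lzma

for f in sample.gz sample.bz2 sample.lzma; do
    [ -f "$f" ] || continue
    case "$f" in
        *.gz)   d=gunzip  ;;
        *.bz2)  d=bunzip2 ;;
        *.lzma) d=unlzma  ;;
    esac
    echo "== $d =="
    time "$d" -c "$f" > sample.out
    # Verify the round trip actually reproduced the original bytes.
    cmp -s sample sample.out && echo "round-trip OK"
done
```

Swap in your own archives (and a realistically sized sample) to get numbers comparable to the ones above; a 1 MB file of zeros will decompress too quickly to show meaningful differences.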