
LZMA Part II - Decompression

A couple of days ago, I covered the LZMA algorithm from the compression side. As pointed out in the comments, we also need to see the other side of compressed data: decompressing it. I've kept all my data files, so let's decompress them with GZIP, BZIP2 and LZMA, and see how everything fares. Also, Justin Dugger pointed out that LZMA's strength lies not in compressing and decompressing random data, but in similar data spread across multiple files. That will be addressed in another post.

So, let's get started decompressing. Remember, in my previous post I took a look at ASCII data, pictures and video, and finally similar data, using /dev/zero. Let's decompress these files, timing each run. The point of this post, like the last, is not how well each tool compresses and decompresses, but how long the operation takes. Looking first at the ASCII data:

$ time gunzip -c etc.tar.gz > etc.tar
gunzip -c etc.tar.gz > etc.tar  0.24s user 0.11s system 27% cpu 1.261 total
$ time bunzip2 -c etc.tar.bz2 > etc.tar
bunzip2 -c etc.tar.bz2 > etc.tar  0.96s user 0.12s system 53% cpu 2.022 total
$ time unlzma -c etc.tar.lzma > etc.tar
unlzma -c etc.tar.lzma > etc.tar  0.36s user 0.13s system 30% cpu 1.622 total

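Impressive. On ASCII text, LZMA decompressed quickly, quite unlike its results when compressing: it beat BZIP2 (1.622s vs 2.022s total), though GZIP was still the fastest at 1.261s. Let's see how it does on random binary data: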

$ time gunzip -c pics.tar.gz > pics.tar
gunzip -c pics.tar.gz > pics.tar  1.36s user 0.29s system 34% cpu 4.822 total
$ time bunzip2 -c pics.tar.bz2 > pics.tar
bunzip2 -c pics.tar.bz2 > pics.tar  11.83s user 0.42s system 79% cpu 15.430 total
$ time unlzma -c pics.tar.lzma > pics.tar
unlzma -c pics.tar.lzma > pics.tar  10.71s user 0.34s system 78% cpu 14.049 total

LZMA is still outperforming BZIP2 (14.049s vs 15.430s total), even if the margin isn't as impressive as it was with the ASCII data. Now, the final run on our binary file of zeros:

$ time gunzip -c file.zero.gz > file.zero
gunzip -c file.zero.gz > file.zero  2.77s user 1.15s system 29% cpu 13.368 total
$ time bunzip2 -c file.zero.bz2 > file.zero
bunzip2 -c file.zero.bz2 > file.zero  2.36s user 1.66s system 25% cpu 15.542 total
$ time unlzma -c file.zero.lzma > file.zero
unlzma -c file.zero.lzma > file.zero  1.74s user 1.18s system 23% cpu 12.236 total

Wow. Absolutely impressive. I have to say that I was not expecting these times going into it. LZMA absolutely sucks on time when compressing at the maximum level, but it beats BZIP2 in every case when decompressing. Notice how much faster it was on the binary file of zeros: with max compression, it took nearly 3 minutes to compress, but decompressing took only about 12 seconds. Overall, you're still better off with GZIP or BZIP2, but LZMA is holding its own with decompression.
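
To put the wall-clock totals side by side, the figures from the three runs above can be fed through awk. This is only a quick sketch: it reuses the "total" values already shown and measures nothing new, and the `ratios` variable name is just for illustration.

```shell
# Total (wall-clock) seconds copied from the three runs above.
# Columns: dataset, gzip, bzip2, lzma
ratios=$(awk '{ printf "%-5s lzma/gzip = %.2f  lzma/bzip2 = %.2f\n", $1, $4/$2, $4/$3 }' <<'EOF'
etc   1.261  2.022  1.622
pics  4.822  15.430 14.049
zero  13.368 15.542 12.236
EOF
)
# A ratio below 1 means LZMA finished sooner than the other tool.
printf '%s\n' "$ratios"
```

By these totals, LZMA needed roughly 0.8 to 0.9 times as long as BZIP2 in all three runs. Against GZIP it took about 1.29 times as long on the ASCII data and 2.91 times as long on the pictures and video, and came in slightly ahead (0.92) on the file of zeros.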

Verdict on LZMA now? Still not my first choice, but maybe I was a bit harsh on it with the earlier post. Our benchmarks aren't over yet, though. Let's see how it compares with multiple files that have similar data between them. Supposedly, this is where LZMA shines.
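
If you want to repeat these timings on your own machine, something like the following should work. Treat it as a rough sketch, not the exact script behind the numbers in this post: the `time_decompress` function and the `sample.txt` file are made-up names, it assumes GNU `date` (for the `%N` nanosecond field), and it skips any tool that isn't installed. It also checks that each decompressed copy matches the original.

```shell
#!/bin/sh
# Sketch: time gunzip, bunzip2 and unlzma on the same input file.
# Assumptions: GNU date (for %N nanoseconds); sample.txt and
# time_decompress are hypothetical names, not from the original runs.

# Create a small, compressible sample file to work with.
seq 1 200000 > sample.txt

# time_decompress COMPRESSOR DECOMPRESSOR EXT
# Compresses sample.txt, times the decompression, checks the round trip.
time_decompress() {
    comp=$1; decomp=$2; ext=$3
    if ! command -v "$comp" >/dev/null 2>&1 ||
       ! command -v "$decomp" >/dev/null 2>&1; then
        echo "$decomp: not installed, skipping"
        return 0
    fi
    "$comp" -c sample.txt > "sample.txt.$ext"
    start=$(date +%s.%N)
    "$decomp" -c "sample.txt.$ext" > "sample.$ext.out"
    end=$(date +%s.%N)
    # Wall-clock seconds, comparable to the "total" figure in the runs above.
    awk -v s="$start" -v e="$end" -v t="$decomp" \
        'BEGIN { printf "%s: %.3f s total\n", t, e - s }'
    # The decompressed copy must match the original byte for byte.
    if cmp -s sample.txt "sample.$ext.out"; then
        echo "$decomp: round trip OK"
    else
        echo "$decomp: MISMATCH"
    fi
}

time_decompress gzip  gunzip  gz
time_decompress bzip2 bunzip2 bz2
time_decompress lzma  unlzma  lzma
```

Swap `sample.txt` for a real tarball to get numbers worth comparing; a file this small decompresses too quickly for the differences to mean much.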
