
LZMA

I've known about the LZMA compression algorithm for a little while, but I haven't really played with it. So, to give it a quick try, I thought I would sic it on all the text files in my /etc directory. I'm using GNU tar to archive the files, and the maximum compression possible with each algorithm, to get the tightest squeeze on that archive:

$ sudo tar -cf etc.tar /etc
[sudo] password for aaron: 
tar: Removing leading `/' from member names
$ time gzip -c9 etc.tar > etc.tar.gz
gzip -c9 etc.tar > etc.tar.gz  8.01s user 0.04s system 100% cpu 8.048 total
$ time bzip2 -c9 etc.tar > etc.tar.bz2
bzip2 -c9 etc.tar > etc.tar.bz2  8.12s user 0.04s system 99% cpu 8.170 total
$ time lzma -c9 etc.tar > etc.tar.lzma
lzma -c9 etc.tar > etc.tar.lzma  36.67s user 0.38s system 99% cpu 37.055 total
$ ls -lh etc.tar*
-rw-r--r-- 1 aaron aaron  37M 2008-12-14 13:52 etc.tar
-rw-r--r-- 1 aaron aaron 2.8M 2008-12-14 13:49 etc.tar.bz2
-rw-r--r-- 1 aaron aaron 4.0M 2008-12-14 13:47 etc.tar.gz
-rw-r--r-- 1 aaron aaron 1.5M 2008-12-14 13:50 etc.tar.lzma

As you can clearly see, when cranking up the compression on the TAR file, BZIP2's time is comparable to GZIP's, while LZMA takes nearly five times as long to complete. However, the space saved for that time is significant: 1.5 MB versus the 4 MB coming from GZIP. I'm not 100% convinced, though. Let's sic it on some binary data. I have another TAR file, but this time with JPEGs and AVIs from my camera. Let's see the results here:

$ cd /media/NIKON/DCIM/103NIKON/
$ tar -cf ~/pics.tar *
$ cd
$ time gzip -c9 pics.tar > pics.tar.gz
gzip -c9 pics.tar > pics.tar.gz  7.18s user 0.22s system 85% cpu 8.690 total
$ time bzip2 -c9 pics.tar > pics.tar.bz2
bzip2 -c9 pics.tar > pics.tar.bz2  25.44s user 0.31s system 99% cpu 25.841 total
$ time lzma -c9 pics.tar > pics.tar.lzma
lzma -c9 pics.tar > pics.tar.lzma  68.49s user 0.82s system 99% cpu 1:09.46 total
$ ls -lh pics.tar*
-rw-r--r-- 1 aaron aaron 111M 2008-12-14 14:09 pics.tar
-rw-r--r-- 1 aaron aaron 108M 2008-12-14 14:05 pics.tar.bz2
-rw-r--r-- 1 aaron aaron 110M 2008-12-14 14:04 pics.tar.gz
-rw-r--r-- 1 aaron aaron 109M 2008-12-14 14:07 pics.tar.lzma

Yeah... LZMA isn't giving me a lot here. In fact, I find it interesting that BZIP2 won in terms of the smallest size. Now, granted, I'm already aware that JPEG and AVI files are already compressed, so I'm not looking to gain a lot here. As already mentioned, this is mostly a quest of curiosity. Again, notice the times: over a minute to complete with LZMA, where GZIP only took about 8 seconds. Finally, let's see what each algorithm does with a file of nothing but binary zeros. Pulling from /dev/zero, I can create a file of any arbitrary size. So, let's create a 512 MB file, and sic the compression algorithms on it:

$ dd if=/dev/zero of=file.zero bs=512M count=1
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 12.4654 s, 43.1 MB/s
$ time gzip -c9 file.zero > file.zero.gz
gzip -c9 file.zero > file.zero.gz  4.86s user 0.18s system 99% cpu 5.052 total
$ time bzip2 -c9 file.zero > file.zero.bz2
bzip2 -c9 file.zero > file.zero.bz2  11.35s user 0.24s system 100% cpu 11.586 total
$ time lzma -c9 file.zero > file.zero.lzma
lzma -c9 file.zero > file.zero.lzma  189.81s user 0.92s system 99% cpu 3:10.73 total
$ ls -lh file.zero*
-rw-r--r-- 1 aaron aaron 512M 2008-12-14 14:14 file.zero
-rw-r--r-- 1 aaron aaron  402 2008-12-14 14:23 file.zero.bz2
-rw-r--r-- 1 aaron aaron 509K 2008-12-14 14:23 file.zero.gz
-rw-r--r-- 1 aaron aaron  75K 2008-12-14 14:27 file.zero.lzma

Heh. All I can say is heh. BZIP2 again took the top prize for the smallest result, getting 512 MB down to a mere 402 bytes, and it only took about 6 extra seconds compared to GZIP. LZMA, while compressing fairly well, did miserably on time. Three minutes?! On binary zeros?! What was it doing? Watching some YouTube while doing the compression?

All in all, I'm not impressed with LZMA. It's a horrible performer, and only gives marginal gains for the extra time. It seems to do well on ASCII text, but fails miserably on binary files, where BZIP2 takes the clear win in compression. While it may pull off some impressive compression, the time it takes isn't worth it. BZIP2 is a much more capable algorithm, and although it's a slow performer too, it's not nearly as bad as LZMA. I'll make it worth my while to use BZIP2 whenever possible, reaching for GZIP only when time is the primary factor.

I would be interested in some other benchmarks on different data, if anyone has access to those. I think these results give us a good idea about LZMA, though: STEER CLEAR.

{ 45 } Comments

  1. Adam Petaccia using Firefox 3.0 on GNU/Linux 64 bits | December 14, 2008 at 4:06 pm | Permalink

What about DEcompression times? bzip2, for example, is slower than gzip at compressing, but much better going the other way. I wonder how LZMA fares.

  2. laga using Konqueror 3.5 on Kubuntu | December 14, 2008 at 4:07 pm | Permalink

I wonder how fast LZMA is when decompressing. There must be an advantage somewhere...

  3. laga using Konqueror 3.5 on Kubuntu | December 14, 2008 at 4:07 pm | Permalink

    Heh :)

  4. Jeff Schroeder using Firefox 3.0.1 on Ubuntu 64 bits | December 14, 2008 at 4:10 pm | Permalink

Here is a good one to mess with your head. With GNU tar >= 1.20, you can do something like tar --lzma -cf files.tar.lzma directory/. Also get ahold of 7zip and take a look at the various compression algorithms it supports.

    For text files, the ppm (http://en.wikipedia.org/wiki/Prediction_by_Partial_Matching) compresses better than anything out there. We've seen multi-gigabyte firewall logs compress down to a hundred megabytes or so with it before. Download the 7zip package (sudo apt-get install p7zip-full) and put something like this in your production backup scripts as you might not want to kill I/O on your boxes:
    ionice -c3 nice -n +20 /usr/bin/7za a -mx9 -m0=ppmd ${file}.7z $file

    Also note that lzma is optimized for fast decompression, not fast compression. Time the difference between the two and you'll see.
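
    A quick way to check that, reusing the archives from the post (just a sketch, assuming the etc.tar.* files from above are still lying around):

    $ time gzip -dc etc.tar.gz > /dev/null
    $ time bzip2 -dc etc.tar.bz2 > /dev/null
    $ time lzma -dc etc.tar.lzma > /dev/null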

  5. Jeff Schroeder using Firefox 3.0.1 on Ubuntu 64 bits | December 14, 2008 at 4:13 pm | Permalink

    Forgot to mention one last thing: instead of using bzip2, use pbzip2. It is a multithreaded, parallel bzip2 implementation that speeds up bzip2 considerably. Watch it, or nice + ionice it, though, as it will chew through all available bandwidth if you let it.
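
    Something like this is what I mean (a rough sketch; -p picks the number of CPUs to use, and 19 is the maximum niceness):

    ionice -c3 nice -n 19 pbzip2 -p4 -9 -c etc.tar > etc.tar.bz2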

  6. oleid using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 4:26 pm | Permalink

    Hello!

    First of all, the selected tests are not that good, as zeros and multimedia files aren't files one usually compresses. You could have compressed e.g. /usr/bin instead.
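
    Something along these lines would be a fairer test (a sketch; usrbin.tar is just a throwaway name, and each tool runs at its own default level):

    $ tar -cf usrbin.tar /usr/bin
    $ time gzip -c usrbin.tar > usrbin.tar.gz
    $ time bzip2 -c usrbin.tar > usrbin.tar.bz2
    $ time lzma -c usrbin.tar > usrbin.tar.lzma
    $ ls -lh usrbin.tar*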

    Then you should have stuck to the default compression ratio for a fair time comparison, which is -7 for lzma. From the manpage:

    For relatively fast compression with medium compression ratio -1 is the recommended setting. It's faster than 'bzip2 --fast' and usually creates smaller files than 'bzip2 --best'. -2 makes somewhat smaller files but doubles the compression time close to what 'bzip2 --best' takes.

    Generally for excellent compression ratio, acceptable compression time and memory requirements (about 83 MB for compression, 9 MB for decompression) you should use -7 which is also the default. -8 and -9 will give some gain especially with bigger files (>=tens of megabytes) but also increase the CPU and memory requirements dramatically. See the table below for memory requirements of different compression settings.

    Have a nice day,
    oleid

  7. Pete using Firefox 3.0.4 on Ubuntu | December 14, 2008 at 4:47 pm | Permalink

    Who cares about compression time when you are using the "-9" flag? It's a bit of an oxymoron comparison here. I'd prefer to see the compression time and size when using the "quick" and "default" compression mode.

    Your "steer clear" verdict is useless for everyone, including yourself.

  8. Porcolino using Firefox 3.0.4 on Ubuntu | December 14, 2008 at 5:32 pm | Permalink

    I have to agree with Pete.
    These results also surprised me, as everything I've thrown at lzma comes out smaller than with gz/bzip2.
    One thing that never left my mind was when some guy at Ubuntu suggested changing the packaging to use lzma instead of the usual suspects. That was said a couple of years ago.

  9. Justin Dugger using Firefox 3.0.4 on Ubuntu | December 14, 2008 at 5:34 pm | Permalink

    I've found the exact opposite results in some data sets, but I may need to redo them if lzma wasn't always used in 7z. As always, understanding how the compression works is key to understanding which is appropriate for the situation. lzma allows for HUGE dictionaries working across entire directories. This means if you have lots of similar files grouped together, it will take advantage of that.

    ROMsets for example, often carry multiple versions of the same game with small localization patches. Large chunks of the binaries are identical, so lzma can effectively load the dictionary with 99 percent of the binary and describe the changes very efficiently. It does require that you organize your data in a way that exploits the similarities, or else the dictionary can't exploit the frequency of massive common strings.
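
    A crude way to see this in action (a sketch with made-up file names; big.bin stands in for any file well over gzip's 32 KB window):

    $ cp big.bin big-copy.bin
    $ printf 'patched' | dd of=big-copy.bin bs=1 seek=4096 conv=notrunc
    $ tar -cf pair.tar big.bin big-copy.bin
    $ gzip -c9 pair.tar > pair.tar.gz
    $ lzma -c9 pair.tar > pair.tar.lzma
    $ ls -lh pair.tar*

    gzip has to compress the second copy from scratch, while lzma's large dictionary can refer back to the first, so pair.tar.lzma should come out close to the size of a single compressed copy.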

    Paul Sladen has a nice in depth discussion of the things we've touched on as well.

  10. Aaron using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 5:41 pm | Permalink

    I love it when people comment without using the gray matter between their ears. Pete, check out these default times (using -6):

    $ time gzip -c etc.tar > etc.tar.gz
    gzip -c etc.tar > etc.tar.gz  1.33s user 0.02s system 93% cpu 1.442 total
    $ time bzip2 -c etc.tar > etc.tar.bz2
    bzip2 -c etc.tar > etc.tar.bz2  8.12s user 0.07s system 99% cpu 8.215 total
    $ time lzma -c etc.tar > etc.tar.lzma
    lzma -c etc.tar > etc.tar.lzma  27.45s user 0.18s system 99% cpu 27.695 total
    $ ls -lh etc.tar*
    -rw-r--r-- 1 root  root   37M 2008-12-14 17:05 etc.tar
    -rw-r--r-- 1 aaron aaron 2.8M 2008-12-14 17:06 etc.tar.bz2
    -rw-r--r-- 1 aaron aaron 4.2M 2008-12-14 17:05 etc.tar.gz
    -rw-r--r-- 1 aaron aaron 1.7M 2008-12-14 17:06 etc.tar.lzma
    

    Pretty consistent with the differences in time at -9, and with the differences in file size. Now, on to my pictures, again using the default -6:

    $ time gzip -c pics.tar > pics.tar.gz
    gzip -c pics.tar > pics.tar.gz  6.78s user 0.25s system 99% cpu 7.072 total
    $ time bzip2 -c pics.tar > pics.tar.bz2
    bzip2 -c pics.tar > pics.tar.bz2  25.47s user 0.27s system 99% cpu 25.827 total
    $ time lzma -c pics.tar > pics.tar.lzma
    lzma -c pics.tar > pics.tar.lzma  62.35s user 0.38s system 99% cpu 1:02.83 total
    $ ls -lh pics.tar*
    -rw-r--r-- 1 aaron aaron 111M 2008-12-14 17:13 pics.tar
    -rw-r--r-- 1 aaron aaron 108M 2008-12-14 17:14 pics.tar.bz2
    -rw-r--r-- 1 aaron aaron 110M 2008-12-14 17:13 pics.tar.gz
    -rw-r--r-- 1 aaron aaron 109M 2008-12-14 17:15 pics.tar.lzma

    What's this? Consistent, both in file size and duration. OK, I'll give you one more shot to redeem yourself. Let's run against the binary file of zeros:

    $ time gzip -c file.zero > file.zero.gz
    gzip -c file.zero > file.zero.gz  4.63s user 0.13s system 100% cpu 4.760 total
    $ time bzip2 -c file.zero > file.zero.bz2
    bzip2 -c file.zero > file.zero.bz2  11.36s user 0.29s system 99% cpu 11.684 total
    $ time lzma -c file.zero > file.zero.lzma
    lzma -c file.zero > file.zero.lzma  110.22s user 0.54s system 99% cpu 1:51.25 total
    $ ls -lh file.zero*
    -rw-r--r-- 1 aaron aaron 512M 2008-12-14 17:17 file.zero
    -rw-r--r-- 1 aaron aaron  402 2008-12-14 17:18 file.zero.bz2
    -rw-r--r-- 1 aaron aaron 509K 2008-12-14 17:18 file.zero.gz
    -rw-r--r-- 1 aaron aaron  75K 2008-12-14 17:21 file.zero.lzma

    The only thing to note here is that the extra time these algorithms spend squeezing out every last little bit isn't much with GZIP or BZIP2, but with LZMA, it takes FOREVER. Which should give you yet another reason to avoid LZMA. It just doesn't fare well.

    Next time, before placing a comment that doesn't make you sound very intelligent, I'd recommend reading the docs, and understanding how things work.

  11. Aaron using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 5:45 pm | Permalink

    LZMA fares well with compression sizes. That's not disputed here. What is disputed is the time it takes to get to that point.

  12. Aaron using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 5:47 pm | Permalink

    Interesting post. I'll comment over there in a second, but I have a question for you: why is BZIP2 considerably smaller than LZMA on a binary file of zeros? Surely, according to your argument, it would compress that thing down to practically zip. There must be some data overhead in LZMA that doesn't exist in BZIP2.

  13. Aaron using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 6:19 pm | Permalink

    Decompression is coming up in the next post. I'll send the exact files I just compressed through decompression, and see how each fares.

  14. Roger using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 7:48 pm | Permalink

    One thing you didn't look at is resilience to corruption. Go ahead and corrupt one random byte of the compressed files and see if the decompressors even detect corruption, and if they do then how much of your data you can recover.

    Common corruption is bytes changing to all 0 or all 0xff or the insertion of \r before \n.
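
    Something like this would do it (a sketch; run it on copies of the archives, since it overwrites one byte in place, and the offset is arbitrary; a non-zero exit status means the tool noticed the damage):

    $ for f in etc.tar.gz etc.tar.bz2 etc.tar.lzma; do
    >   dd if=/dev/urandom of=$f bs=1 count=1 seek=100000 conv=notrunc 2>/dev/null
    > done
    $ gzip -dc etc.tar.gz > /dev/null; echo "gzip: $?"
    $ bzip2 -dc etc.tar.bz2 > /dev/null; echo "bzip2: $?"
    $ lzma -dc etc.tar.lzma > /dev/null; echo "lzma: $?"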

  15. seb using Firefox 3.0.4 on Windows XP | December 14, 2008 at 8:18 pm | Permalink

    Aaron, nice one, but I think that LZMA/7z is optimised for fast decompression. So this should really go into your little test.
    Another thing to look at is corruption of the archives; what use is a highly compressed archive if you can't decompress it?

  16. Silvio using Firefox 3.0.4 on Ubuntu | December 14, 2008 at 8:37 pm | Permalink

    On your second test, I don't see that difference as significant an advantage as you did. I am also surprised that the compressed file is actually smaller than the uncompressed one for those kinds of files.

    Your last test is a scenario that would never happen, and it didn't throw very useful results. All three formats reduced the file size more than 5000 times.

    A more realistic scenario would be to try a bunch of compiled executable binaries (so that they don't contain only zeros). You should also try the 7zip compressor.

  17. Aaron using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 8:59 pm | Permalink

    7zip is a container that supports multiple compression algorithms. It's not comparing apples to apples.

    Also, the test on /dev/zero was to see sheer speed on a file with completely identical data. Sure, it's not "real world", but I'm not after that. I'm after speed, and what would be faster than parsing exactly the same data? LZMA failed to impress.

  18. Aaron using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 9:00 pm | Permalink

    Yes- decompression is coming in a separate post.

  19. Aaron using Firefox 3.0.4 on Ubuntu 64 bits | December 14, 2008 at 9:01 pm | Permalink

    I haven't looked at corruption. That would be interesting to see. I'll see what I can come up with, and whether it's worth putting in a separate post.

  20. Justin Dugger using Firefox 3.0.4 on Ubuntu | December 14, 2008 at 9:30 pm | Permalink

    I cannot profess arcane knowledge of the lzma implementation, but my guess would be that the dictionary itself has some minimum size. Plus, bzip2 is designed for random binary files, so it has a lot of considerations that might not always make sense. One in particular seems to be RLE, which /dev/zero has in spades.

    I would caution you against using edge cases like /dev/zero; do you regularly archive /dev/zero and friends?

  21. Justin Dugger using Firefox 3.0.4 on Ubuntu | December 14, 2008 at 9:32 pm | Permalink

    Why a separate post, when you've already published the conclusion?

  22. glandium using Epiphany 2.22 on GNU/Linux 64 bits | December 14, 2008 at 11:31 pm | Permalink

    It would be interesting to know how well they perform with binary files (taking some stuff from /bin or /usr/bin, for instance)

  23. Aigars Mahinovs using Firefox 3.0.3 on Ubuntu | December 14, 2008 at 11:50 pm | Permalink

    Images and video files are not the typical binary files used in compression tests. Video files and JPEG images are incompressible because they are already compressed, and the only hope is to try to collate some headers or something.

    To get a real and compressible set of binary files, look no further than /usr/bin or /usr/lib.

    I am much more excited about the lzo compressor; its priority is speed.

  25. Olaf Leidinger using Firefox 3.0.4 on Ubuntu 64 bits | December 15, 2008 at 2:36 am | Permalink

    When reading the lzma manpage, one discovers the note that -1 is faster and creates smaller files than bzip2.

    A quick test compressing /usr/bin

    (sorry, I don't know how to insert tables here)


    call       mean time [s]   time factor   size [MB]   size factor
    gzip            40.61          1.00         142          1.00
    bzip2          110.97          2.73         135          0.95
    lzma           402.96          9.92         104          0.73
    lzma -1         66.62          1.64         123          0.87

    So you can clearly see that if only size matters, you should take lzma. If time matters, you should take gzip. If time AND size matter, you should take lzma with the "-1" option.

    LZMA is said to decompress very fast. It takes 17s to decompress the lzma archive (no matter what compression level was used), while bzip2 takes 36s; only gzip decompresses faster, at 6s. If distributors adopted lzma compression for packages, we would have smaller download sizes and faster installation (compared to bzip2-compressed archives), as decompressing is faster and reading from DVD drives would be faster too.

  26. sebsauvage using Firefox 3.0.4 on Windows XP | December 15, 2008 at 3:32 am | Permalink

    I rarely compress already-compressed or totally empty files. Do you ?

    In real life, lzma outperforms gzip/bzip2 in the vast majority of cases, be it text files, databases, executables (have you tried compressing ELF or PE executables?) and so on.

    lzma is slower? Of course it's slower! Better compression requires more processing power.
    There is even better than lzma: paq8hp5 outperforms lzma, but with even longer compression times (see http://prize.hutter1.net/).

    That's no secret: it's a simple space/time trade-off (less space, more CPU time).
    This is exactly like the transition from MPEG2 to MPEG4 (Xvid/Divx) to H.264. Better compression requiring more CPU power.

    lzma is adequate for today computers and offers reasonable compression times with excellent compression ratios.

  27. Tom using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 4:09 am | Permalink

    Yeah, Firefox should have "steered clear", shouldn't they? It was only thanks to lzma that they were able to take the FF2 installer below the magic 5 MB barrier, saving something like 2 MB if I remember correctly. A saving like that will surely have had a huge influence on its uptake. Frankly, "steer clear" is an absurd pronouncement to make based on such limited and unrealistic criteria.

  28. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:17 am | Permalink

    Yes- archiving /dev/zero and friends isn't all that practical. I wasn't going after that. I wanted to see speed, and I figured what could be faster than a bunch of identical data? I was way wrong.

  29. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:18 am | Permalink

    I'm willing to retract my conclusion if I can find where LZMA shines.

  30. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:20 am | Permalink

    Thanks for the insightful input. It's comments like these that keep me blogging.

    Tom- re-read the post. If time isn't a factor, then great! It certainly gives us great compression ratios. However, if time is a factor, BZIP2 performs fairly well, GZIP better, and they give good, not great, but good ratios.

  31. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:22 am | Permalink

    I'll fix the comment for you, putting your data in a table.

    I'm preparing another post where we look at yet more types of compression with LZMA. I'll be throwing -1 at it this time, as I'll be dealing with massively large amounts of binaries, and I just don't have all day for LZMA to do the compression. We'll see what the results are.

  32. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:22 am | Permalink

    Keep an eye on a future post.

  33. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:23 am | Permalink

    Yes, I'm aware of the fact that JPEGs and AVIs are already compressed, so I'm not expecting to see much in terms of compression. I was hoping to see something in the way of speed, however.

  34. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:39 am | Permalink

    I'll be outlining in a separate post more compression with LZMA. We'll see more results then, and whether LZMA is truly the new hotness.

  35. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 5:43 am | Permalink

    Yes. I am aware of the speed increases with -1. I was completely and totally after maximum compression, as my post outlines. I wanted to compare it to the others.

  36. Jeff Schroeder using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 6:59 am | Permalink

    @Aaron: Actually, you are incorrect. 7zip is the reference implementation of LZMA. The fact that you aren't using it says a lot about your testing methodology.

    To quote your words verbatim, "Next time, before placing a comment that doesn’t make you sound very intelligent, I’d recommend reading the docs, and understanding how things work.".

  37. Oleastre using Shiretoko 3.1b3pre on Windows XP | December 15, 2008 at 7:04 am | Permalink

    I'm wondering what lzma implementation you used.

    From the Wikipedia link (and the Ubuntu packaged version), I suppose you used the 7-zip/lzma utils implementation.

    But, following freshmeat news, I see a lot of updates to a utility called lzip that works kinda like gzip, but with an lzma algorithm: http://www.nongnu.org/lzip/lzip.html

    Maybe having a look at that implementation will give different results. (I should find time to do that)
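
    If anyone wants to compare, the interface is indeed gzip-like (a sketch, assuming the lzip package is installed):

    $ time lzip -9 -c etc.tar > etc.tar.lz
    $ ls -lh etc.tar.lz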

  38. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 8:11 am | Permalink

    I have used 7zip before. It's a container. Straight from Wikipedia:

    By default, 7-Zip creates 7z format archives, with a .7z file extension. Each archive can contain multiple directories and files. As a container format, security or size reduction are achieved using a stacked combination of filters. These can consist of pre-processors, compression algorithms and encryption filters.

    The core .7z compression stage uses a variety of algorithms, the most common of which are Bzip2 and LZMA. Developed by Igor Pavlov, LZMA is a relatively new system, making its debut as part of the 7z format. LZMA consists of a large LZ-based sliding dictionary up to 4 GiB in size, backed by a range coder.

    Try again.

  39. Aaron using Firefox 3.0.4 on Ubuntu | December 15, 2008 at 8:20 am | Permalink

    I used the vanilla LZMA package from the Ubuntu archives. It's installed by default, as DPKG can now take advantage of it, as can RPM, GNU TAR, and others.

  40. Joerg Sonnenberger using Minefield 3.0.3 on Unknown O.S. | December 15, 2008 at 2:36 pm | Permalink

    The file.zero case for LZMA is a bug in the encoder. It is only using 257 bytes per backward reference per loop, when it could easily use a single 512 MB repetition. This naturally increases the compression time a lot as well.

  41. Aaron using Firefox 3.0.4 on Ubuntu | December 16, 2008 at 6:55 am | Permalink

    That makes a lot of sense, actually. Thank you. Also, I think you're the first NetBSD commenter on my blog. :)

  42. Lonnie Olson using Firefox 3.0.4 on Ubuntu | December 16, 2008 at 10:48 am | Permalink

    If it really is a bug, it is an extremely serious bug. A bug so serious that it should be reason not to use LZMA at all.

    I compress lots of files that have big sections of nothing but zeros.

  43. Joerg Sonnenberger using Minefield 3.0.3 on Unknown O.S. | December 16, 2008 at 12:48 pm | Permalink

    I agree that it is stupid. After reading more of the code, I might have been wrong about the format part, but the number of long repetitions is high in some of the data I care about. I haven't had time to investigate and measure the required changes, though.

  44. Justin Dugger using Firefox 3.0.4 on Ubuntu | December 16, 2008 at 9:52 pm | Permalink

    I should correct myself: bzip2 is not designed for "random" binary files. Rather, it gets used for that purpose because its crazy affine transformations and compression stack seem to help out, like RLE.

  45. ArrereeTugh using Internet Explorer 6.0 on Windows XP | October 14, 2009 at 5:16 pm | Permalink

    Curious. There's a positive vibe to it.

{ 2 } Trackbacks

  1. [...] couple days ago, I covered the LZMA compression algorithm as it related to compression. Well, as pointed out in the comments, we need to see the other side [...]

  2. [...] the inaugural post in Aaron Toponce’s series on compression, two critical errors are made and highlighted by a [...]
