If you are looking for a fast/realtime lossless compression algorithm, you should prefer lz4 [1] or lzo [2].

Nevertheless, an article comparing the speed and compression efficiency of these algorithms could be interesting.

[1] https://code.google.com/p/lz4/

[2] http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Oberhumer
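
To give a sense of how little code lz4 asks for, here's a minimal compress/decompress round trip against its C API. Sketch only: these are the current liblz4 names (LZ4_compress_default, LZ4_decompress_safe, LZ4_COMPRESSBOUND), which postdate the Google Code release linked above, and the buffers are just illustrative.

    #include <stdio.h>
    #include <string.h>
    #include <lz4.h>  /* link with -llz4 */

    int main(void)
    {
        const char src[] = "repetitive data compresses well, repetitive data compresses well";
        const int src_size = (int)sizeof(src);

        /* LZ4_COMPRESSBOUND gives the worst-case compressed size for this input. */
        char compressed[LZ4_COMPRESSBOUND(sizeof(src))];
        int comp_size = LZ4_compress_default(src, compressed, src_size,
                                             (int)sizeof(compressed));
        if (comp_size <= 0)
            return 1;

        /* Decompression needs the exact compressed size and an output bound. */
        char restored[sizeof(src)];
        int dec_size = LZ4_decompress_safe(compressed, restored, comp_size,
                                           (int)sizeof(restored));
        if (dec_size != src_size || memcmp(src, restored, (size_t)src_size) != 0)
            return 1;

        printf("%d -> %d bytes\n", src_size, comp_size);
        return 0;
    }

LZ4_decompress_safe is the bounds-checked variant; the older LZ4_decompress_fast is best avoided on untrusted input.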

There's a link to a table of benchmarks in the description (http://mattmahoney.net/dc/text.html). LZ4 still looks a little more compelling than LZHAM in terms of compress/decompress time. The table doesn't say what the intended application is, though (CPU-bound, disk-bound, etc.).

Are you sure it doesn't compare favorably to LZ4? For decompression that's only 1.5x slower, the compression ratio appears to be substantially better: LZHAM compressed enwik8 to 23.80 MiB and enwik9 to 196.83 MiB.

LZ4, by comparison, came in at 40.88 MiB and 362.40 MiB. Since 362.40 / 196.83 ≈ 1.84, a database or a Linux distribution using zswap/zram/etc. would be able to store about 84% more data in the same amount of memory. For caches, that's huge.

Here are the table rows (columns: compressed enwik8 and enwik9 sizes in bytes, decompresser size, total size, compression and decompression times in ns/byte, memory in MB, algorithm, note):

    lzham alpha 3 x64 -m4 -d29                    24,954,329  206,393,809    155,282 x  206,549,091    595     9 4800 LZ77 45 
    lz4 v1.2          -c2                         42,870,164  379,999,522     49,128 x  380,048,650     91     6   20 LZ77 26
One trade-off is that lzham requires far more memory (4800 MB vs. 20 MB in these runs), mostly for its much larger dictionary, but even with smaller dictionary sizes it appears to stay appreciably more compact than lz4.
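
For what it's worth, dictionary size is an explicit knob in lzham's C API, and it's what the -d29 in the benchmark line is setting (2^29 = 512 MB). Here's a sketch of dialing it down, going from memory of the lzham codec headers, so treat the exact names (lzham_compress_params, m_dict_size_log2, lzham_compress_memory) as assumptions that may differ in alpha 3:

    #include <string.h>
    #include <lzham.h>  /* names assumed from the lzham codec headers */

    /* Compress src into dst with a 2^dict_log2-byte dictionary.
       Returns the compressed size, or 0 on failure. */
    static size_t compress_with_dict(const lzham_uint8 *src, size_t src_len,
                                     lzham_uint8 *dst, size_t dst_cap,
                                     unsigned dict_log2)
    {
        lzham_compress_params params;
        memset(&params, 0, sizeof(params));
        params.m_struct_size    = (lzham_uint32)sizeof(params);
        params.m_dict_size_log2 = dict_log2;             /* the memory/ratio knob */
        params.m_level          = LZHAM_COMP_LEVEL_UBER; /* roughly the -m4 setting */

        size_t dst_len = dst_cap;
        lzham_uint32 adler32 = 0;
        if (lzham_compress_memory(&params, dst, &dst_len,
                                  src, src_len, &adler32) != LZHAM_COMP_STATUS_SUCCESS)
            return 0;
        return dst_len;
    }

Dropping dict_log2 to something like 20 (a 1 MB window) should cut memory use dramatically at some cost in ratio, which is exactly the trade-off described above.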

That was also one of the things I noticed at a quick glance. It would probably be a fairly big win for anything I/O-constrained. I didn't look at the data being compressed or the algorithm in detail, though.

Those are very, very large datasets compared with most typical uses of lzo.