Comments on: ZFS Administration, Part XI- Compression and Deduplication https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Linux. GNU. Freedom. Wed, 13 Dec 2017 19:29:15 +0000

By: Richard Geary https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-266421 Sat, 10 Sep 2016 14:06:23 +0000 http://pthree.org/?p=2878#comment-266421 > "In this case, the data is being compressed first, then deduplicated. The raw data would normally occupy about 66 MB of disk, however it's only occupying 37 MB, due to compression and deduplication."

You're only displaying the compressed size, not the dedup + compressed size. The compression ratio is 1.74, so 37 MB ≈ 66 MB / 1.74.
Since dedup is applied to the pool, I presume you can't see the on-disk size unless you look at zpool status (otherwise you'd get duplicate counting).
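
The arithmetic in this comment can be checked with a small sketch. The helper name `compressed_mb` is hypothetical, and the inputs are the 66 MB raw size and the 1.74 ratio quoted above. The pool-wide dedup ratio is a separate pool property (`dedupratio`) and does not enter this calculation.

```shell
# Hypothetical helper: estimate the on-disk size (in MB) from the raw size
# and the dataset's compressratio. Dedup savings are not included.
compressed_mb() {
    awk -v raw="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", raw / ratio }'
}

# 66 MB of raw data at a 1.74x compression ratio
compressed_mb 66 1.74
```

The result lands close to the 37 MB reported, which supports the reading that the figure reflects compression alone.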

By: James Lin https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-261496 Wed, 17 Feb 2016 21:34:50 +0000 http://pthree.org/?p=2878#comment-261496 In your example of compression, you turned on compression for tank/log:

# zfs create tank/log
# zfs set compression=lz4 tank/log

and later you use a different destination path, /tank/test, in the compression example:

# tar -cf /tank/test/text.tar /var/log/ /etc/
# ls -lh /tank/test/text.tar
-rw-rw-r-- 1 root root 24M Dec 17 21:24 /tank/test/text.tar
# zfs list tank/test
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank/test  11.1M  2.91G  11.1M  /tank/test
# zfs get compressratio tank/test
NAME       PROPERTY       VALUE  SOURCE
tank/test  compressratio  2.14x  -

Why is that the case? Is it a typo?

By: Aaron Toponce https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-228504 Tue, 03 Mar 2015 19:07:27 +0000 http://pthree.org/?p=2878#comment-228504 Data in the dataset is deduplicated. The data is matched against all the data in the pool, which includes data outside of that dataset. However, data in other datasets is not deduplicated. In other words, deduplication is handled per dataset, but the data that it's being deduplicated against can be any block on the pool, in or out of the dataset itself.

By: Cliff Lawton https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-228469 Mon, 02 Mar 2015 07:22:51 +0000 http://pthree.org/?p=2878#comment-228469 Hi, I am confused about how enabling dedup on a ZFS dataset affects the zpool. Can you explain more clearly the difference between these two sentences? Thanks.
"realize that even though the "dedup" property is enabled on a dataset, it deduplicates against ALL data in the entire storage pool. Only data committed to that dataset will be checked for duplicate blocks."

By: James Mills https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-219509 Wed, 10 Dec 2014 12:06:34 +0000 http://pthree.org/?p=2878#comment-219509 I get the following error with zdb:

# zdb -b tank
zdb: can't open 'tank': No such file or directory

By: grin https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-131228 Mon, 16 Dec 2013 21:04:23 +0000 http://pthree.org/?p=2878#comment-131228 Here's a simple test about compression since I was curious:

The original file is a pretty verbose log from an IP phone:
-rw-r----- 1 root adm 551M Dec 16 22:01 /var/log/gxp2000.log

The filesystem:
NAME                USED  AVAIL  REFER  MOUNTPOINT
tank/watch/gzip    55.5M  16.8G  55.5M  /tank/pleigh/gzip
tank/watch/gzip-9  51.2M  16.8G  51.2M  /tank/pleigh/gzip-9
tank/watch/lz4     76.1M  16.8G  76.1M  /tank/pleigh/lz4
tank/watch/lzjb     134M  16.8G   134M  /tank/pleigh/lzjb
tank/watch/on       134M  16.8G   134M  /tank/pleigh/on
tank/watch/zle      576M  16.8G   576M  /tank/pleigh/zle

And compressing with gzip -9 the file:
-rw-r----- 1 root root 28M Dec 16 21:56 zle/gxp2000.log.gz

The funny thing about 'zfs list' is that after compressing the file in the zle dir (which compressed nothing), USED stayed at 576M, while 'du' sees through the veil:
# du -h zle/
28M zle/
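
To make the listing easier to compare, here is a rough sketch that turns the USED values into effective ratios against the 551M original file. The helper name `ratio` is hypothetical, and the sizes are copied by hand from the listing rather than read from a live pool.

```shell
# Hypothetical helper: ratio of the original file size to a dataset's USED value.
ratio() {
    awk -v orig="$1" -v used="$2" 'BEGIN { printf "%.2f\n", orig / used }'
}

# Sizes in MB, copied from the 'zfs list' output above (original file: 551M).
for entry in gzip:55.5 gzip-9:51.2 lz4:76.1 lzjb:134 zle:576; do
    name=${entry%%:*}   # text before the colon
    used=${entry##*:}   # text after the colon
    printf '%-7s %sx\n' "$name" "$(ratio 551 "$used")"
done
```

A ratio below 1 for zle means the dataset used slightly more space than the file's apparent size.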

By: ianjo https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-129730 Sun, 29 Sep 2013 17:40:14 +0000 http://pthree.org/?p=2878#comment-129730 Just two small notes on compression:
- the option for gzip is gzip-[1-9] (for example 'gzip-9')
- there is a new compression algorithm in most open-source zfs implementations, 'lz4', that is intended as a faster and smarter replacement for lzjb in exactly your use case of "a little compression by default never hurt anyone"

By: Magni https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-129725 Sat, 28 Sep 2013 00:24:01 +0000 http://pthree.org/?p=2878#comment-129725 And how can you set the block size to 1 MB? As far as I understand ZFS on Linux, you can only set a size in the range 512 B to 128 KB and not more?
Am I missing something here? (I'm pretty new to ZFS, so it's possible 🙂)
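
For context, a sketch of the constraint being asked about: `recordsize` must be a power of two, and at the time of this comment the upper limit was 128 KB. Newer OpenZFS releases raise the limit to 1 MB when the `large_blocks` pool feature is enabled. The helper name `valid_recordsize` and its optional second argument are hypothetical, used only for illustration.

```shell
# Hypothetical check: is SIZE (bytes) a power of two between 512 and MAX?
# MAX defaults to 131072 (128 KB), the limit described in the comment above.
valid_recordsize() {
    size=$1
    max=${2:-131072}
    [ "$size" -ge 512 ] && [ "$size" -le "$max" ] && [ $(( size & (size - 1) )) -eq 0 ]
}

for size in 131072 1048576 3000; do
    if valid_recordsize "$size"; then
        echo "$size: accepted with a 128 KB limit"
    else
        echo "$size: rejected with a 128 KB limit"
    fi
done

# With the limit raised to 1 MB (assumes the large_blocks feature is active)
valid_recordsize 1048576 1048576 && echo "1048576: accepted with a 1 MB limit"
```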

By: Dzezik https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-129609 Thu, 12 Sep 2013 20:06:42 +0000 http://pthree.org/?p=2878#comment-129609 But if the files on my filesystem are megabytes in size (e.g. MP3, JPG...), then I can increase the block size from 128 KB to 1 MB, and then I need about 8 times less memory for the same deduplication: 640 MB per 1 TB, so about 12 TB of storage needs less than 8 GB.
Other documents go on a separate dataset without dedup but with compression. Compression is useless for JPG, MP3, FLAC, AVI, MPG, and MKV. 8 GB is not a problem; you can build a 32 GB RAM machine on any CPU/motherboard.
I decided to use a single Xeon LGA 2011 and get 128 GB to run some VMs with databases on the same hardware; ZFS will be the storage appliance for the databases.
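
The scaling argument in this comment can be sketched as follows. The per-entry cost of 320 bytes is an assumption (a commonly quoted figure for a dedup table entry; the real cost varies), and `ddt_ram_mib` is a hypothetical helper. The sketch also assumes every block is unique and full-sized. Its point is only that the estimate scales inversely with block size; the absolute numbers depend on the assumed entry size and need not match the 640 MB figure above.

```shell
# Hypothetical estimate of dedup-table RAM in MiB:
#   (pool bytes / block bytes) entries * bytes per entry
ddt_ram_mib() {
    pool_bytes=$1
    block_bytes=$2
    entry_bytes=${3:-320}   # assumed RAM cost per dedup-table entry
    echo $(( pool_bytes / block_bytes * entry_bytes / 1048576 ))
}

ddt_ram_mib 1099511627776 131072    # 1 TiB of data in 128 KiB blocks
ddt_ram_mib 1099511627776 1048576   # 1 TiB of data in 1 MiB blocks
```

Under these assumptions the second figure is one eighth of the first, matching the "about 8 times less memory" reasoning.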

By: Aaron Toponce : ZFS Administration, Appendix B- Using USB Drives https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-127473 Tue, 09 Jul 2013 04:08:47 +0000 http://pthree.org/?p=2878#comment-127473 […] Compression and Deduplication […]

By: Aaron Toponce : ZFS Administration, Part XIV- ZVOLS https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-124822 Fri, 19 Apr 2013 10:59:38 +0000 http://pthree.org/?p=2878#comment-124822 [...] Compression and Deduplication [...]

By: Aaron Toponce : Install ZFS on Debian GNU/Linux https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-124804 Fri, 19 Apr 2013 10:55:10 +0000 http://pthree.org/?p=2878#comment-124804 [...] Compression and Deduplication [...]

By: Michael https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-124489 Wed, 20 Mar 2013 19:07:04 +0000 http://pthree.org/?p=2878#comment-124489 Aaron, thanks for the series. I am also using zfsonlinux, but on RHEL 6 (actually OEL 6) instead of Debian.

I was at first getting nothing on the following command and thought maybe it was because of a difference in the distributions or something:

"zdb -b rpool"

I then thought maybe I should replace rpool with my pool's name (tank), and that worked. Just a thought, but in your earlier posts I thought you were using tank, and other sites use tank, so maybe it would be better for consistency's sake to change the example to:

"zdb -b tank"

It may not be necessary but that would have made it more clear to me and maybe will help someone else...

Thanks again. Great writing & analysis !

By: Aaron Toponce : ZFS Administration, Part I- VDEVs https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-122846 Tue, 08 Jan 2013 04:27:58 +0000 http://pthree.org/?p=2878#comment-122846 [...] Compression and Deduplication [...]

By: Aaron Toponce : ZFS Administration, Part III- The ZFS Intent Log https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-122841 Tue, 08 Jan 2013 04:26:31 +0000 http://pthree.org/?p=2878#comment-122841 [...] Compression and Deduplication [...]

By: Aaron Toponce : ZFS Administration, Part VI- Scrub and Resilver https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-122837 Tue, 08 Jan 2013 04:25:28 +0000 http://pthree.org/?p=2878#comment-122837 [...] Compression and Deduplication [...]

By: Aaron Toponce: ZFS Administration, Part XIV- ZVOLS - Bartle Doo https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-122067 Fri, 21 Dec 2012 18:02:54 +0000 http://pthree.org/?p=2878#comment-122067 [...] Creating Filesystems Compression and Deduplication Snapshots and Clones Sending and Receiving Filesystems ZVOLs What is a [...]

By: Aaron Toponce: ZFS Administration, Part XIII- Sending and Receiving Filesystems - Bartle Doo https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-122043 Thu, 20 Dec 2012 21:32:54 +0000 http://pthree.org/?p=2878#comment-122043 [...] Creating Filesystems Compression and Deduplication Snapshots and Clones Sending and Receiving [...]

By: Aaron Toponce : ZFS Administration, Part X- Creating Filesystems https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-122024 Thu, 20 Dec 2012 15:07:05 +0000 http://pthree.org/?p=2878#comment-122024 [...] Compression and Deduplication [...]

By: Aaron Toponce: ZFS Administration, Part XI- Compression and Deduplication - Bartle Doo https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-121909 Tue, 18 Dec 2012 23:40:48 +0000 http://pthree.org/?p=2878#comment-121909 [...] Copy-on-write Creating Filesystems Compression and Deduplication [...]

By: Aaron Toponce : ZFS Administration, Part VII- Zpool Properties https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-121902 Tue, 18 Dec 2012 19:35:53 +0000 http://pthree.org/?p=2878#comment-121902 [...] Compression and Deduplication [...]

By: Aaron Toponce : ZFS Administration, Part VIII- Zpool Best Practices and Caveats https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/#comment-121900 Tue, 18 Dec 2012 19:35:23 +0000 http://pthree.org/?p=2878#comment-121900 [...] Compression and Deduplication [...]
