Comments on: ZFS Administration, Part III - The ZFS Intent Log
https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/
Linux. GNU. Freedom.

By: Nawang Lama https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-272070 Tue, 26 Sep 2017 16:12:27 +0000
Hi Aaron,
We are looking for help with some performance tuning in ZFS. Would you be able to help us with that? If yes, please mail me at nawang81@gmail.com or share your email address.

By: Attila https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-266649 Thu, 29 Sep 2016 19:10:43 +0000
Hello,

I'd like to know what happens when, instead of a power outage, the SSD containing the SLOG breaks.
It's clear that the cached, yet-to-be-written data is lost. Is anything different from the power-loss scenario? Anything beyond the pool being "degraded"? Could needed metadata go missing?

Thanks in advance!
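For what it's worth, the usual guard against exactly this failure is to mirror the log device. A minimal sketch, assuming a hypothetical pool named tank and hypothetical device names; a pool whose unmirrored SLOG dies keeps running, but reports the log vdev as faulted:

    # Attach the SLOG as a two-way mirror, so one SSD can die
    # without losing the in-flight log
    zpool add tank log mirror /dev/sdb /dev/sdc

    # After a failure, the pool reports the state of the log vdev
    zpool status tank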

By: Sergiy Mashtaler https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-264544 Tue, 17 May 2016 20:20:54 +0000
Just a "little" tweak: to actually make your ZIL do the work, you have to set sync to "always"; otherwise it is not used and is basically a waste. I couldn't understand why alloc on my log was 0.
To enable it, execute: zfs set sync=always zpoolName/poolSet
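For context: with the default sync=standard the ZIL is still used, but only for synchronous writes, which is why alloc can sit at 0 on a workload with few sync writes; sync=always forces every write through it. A minimal sketch, assuming a hypothetical pool named tank with a dataset named data:

    # Force every write through the ZIL (the default, sync=standard,
    # only logs synchronous writes)
    zfs set sync=always tank/data

    # Watch the log device; its alloc column should now be non-zero
    # under write load
    zpool iostat -v tank 1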

By: Anakha https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-133282 Mon, 28 Apr 2014 09:36:57 +0000
Dzezik: that is not quite how it works. How does the GC algorithm know which blocks are empty? Without TRIM (or knowledge of the filesystem in use), the only way is when a block is overwritten. GC uses TRIM to know which logical blocks were written but are now considered empty (deleted on the filesystem) before they're overwritten. This also plays into wear leveling, because GC moves around much more invalid data (written once but not marked as deleted by the OS via TRIM) for writes that don't fill a physical "block" (a page).

These considerations, along with the fact that logical blocks aren't statically mapped to physical pages (it's a dynamic mapping maintained by the SSD firmware internally, on excess flash that is not user accessible), are why the author's statements about not needing TRIM and about ZFS providing wear leveling aren't accurate. ZFS cannot provide this function for the drive.

By: Mark https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-132080 Wed, 12 Mar 2014 10:31:19 +0000
"First and foremost, ZFS has advanced wear-leveling algorithms that will evenly wear each chip on the SSD."

If I have a 240 GB SSD and create two partitions, 10 GB (SLOG) and 230 GB (L2ARC), then under a heavy write scenario (i.e., migrating 12 TB of data onto the pool), doesn't that mean the SLOG wear will fall only on the 10 GB section of the SSD? Or does ZFS somehow know the SSD's layout and remap the writes over the entire drive?

http://www.slideshare.net/dpavlin/cuc2013-zfs (page 6, last point) seems to indicate that if you need 10 GB of SLOG, you should create a partition of 100 GB. Would you agree with this?
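For illustration, that over-provisioning advice might look like the sketch below, assuming a hypothetical blank SSD at /dev/sdb and the sgdisk tool; the idea is that spreading a ~10 GB working set over a 100 GB partition gives the firmware ten times as many cells to rotate writes across:

    # Allocate 100 GB for a SLOG that only needs ~10 GB
    sgdisk --new=1:0:+100G /dev/sdb

    # Attach the oversized partition as the pool's log device
    zpool add tank log /dev/sdb1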

To minimize the wear on the SSD, is it advisable to create a RAM drive for the SLOG during the migration, and then, when the migration is finished, switch over to the SSD? The point being that during the migration the server is babysat, and the migration can be restarted if there is a power outage, etc.
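A sketch of that swap, assuming a hypothetical pool named tank, the Linux brd ramdisk module, and an SSD partition /dev/sdb1; note the trade-off being accepted: a RAM-backed SLOG gives no protection at all on power loss.

    # Create one 4 GiB ramdisk (brd's rd_size is in KiB)
    modprobe brd rd_nr=1 rd_size=4194304

    # Use it as the log device for the duration of the migration
    zpool add tank log /dev/ram0

    # ... migrate the 12 TB of data ...

    # When finished, swap the ramdisk out for the SSD partition
    zpool remove tank /dev/ram0
    zpool add tank log /dev/sdb1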

By: Alberto https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-129749 Sat, 05 Oct 2013 17:42:52 +0000
I got a bit lost in here:
"It is a disk image on a GlusterFS replicated filesystem on a ZFS dataset."

Does this mean that the images (and their virtual hard drives) actually sit physically "on top of" a GlusterFS, and that you then use ZFS on top of that?

If so, is the ZIL disk directly connected to the virtual machine, or does it go through GlusterFS as well?

By: Dzezik https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-129611 Thu, 12 Sep 2013 21:01:51 +0000
To Alan: TRIM and GC are not the same thing, and wear leveling is yet another. TRIM can force the drive to reorganize some cells to clean whole blocks after the OS deletes data. GC does the same, but in the background rather than on an OS command. Don't bother with TRIM on a new SSD with advanced GC; old SSDs had poor GC, and TRIM was the only way to prepare fresh blocks for newly written data.

By: Aaron Toponce https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-129299 Wed, 07 Aug 2013 16:03:55 +0000
The performance gains come from the low latency of the SSD. The synchronous write is committed to the SSD, which acks back quickly, and the data is later flushed from RAM to spinning platter, as you mention.
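One way to observe this, as a rough sketch with a hypothetical pool named tank mounted at /tank (dd's oflag=sync forces each write to be acknowledged before the next is issued):

    # Synchronous 4K writes: with a fast SLOG, each ack comes from
    # the SSD instead of waiting on a spinning platter
    dd if=/dev/zero of=/tank/synctest bs=4k count=10000 oflag=sync

    # Watch per-vdev activity while the test runs; the log device
    # should be absorbing the writes
    zpool iostat -v tank 1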

By: Ahmed Kamal https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-128412 Sun, 21 Jul 2013 23:10:33 +0000
Hi Aaron,

This series is awesome! It wasn't clear to me exactly how the SLOG device helps performance that much. I mean, the data still lives in RAM and eventually has to be written to spinning rust. My wild guess is that the performance gains come from quickly ACK'ing synchronous writes, and from converting mostly random writes into mostly streaming writes, reducing seeking of the mechanical disk heads. However, I can't be sure; I'm looking for confirmation from you.

By: Warren Downs https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-127239 Tue, 02 Jul 2013 17:30:12 +0000
Not to belabor this point unnecessarily, but that can only be true if:

1. The SSD device supports wear leveling, or
2. ZFS has access to the whole device, so it can do the wear leveling.

My point is: don't try to save money by using SSD devices that don't support wear leveling if you aren't planning to give ZFS access to the whole device.

By: Aaron Toponce https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-127232 Tue, 02 Jul 2013 13:17:33 +0000
No, you don't. Again, ZFS will wear the SSD correctly. The partition will move across the chips evenly, and every chip will get the same amount of wear as the rest.

By: Warren Downs https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-127204 Mon, 01 Jul 2013 21:23:22 +0000
Makes sense, but then, even though "ZFS has advanced wear-leveling algorithms", you won't be using that ZFS feature. In other words, if you want to go cheap and depend on ZFS wear leveling, you'd need to give it the whole device to work with, not just a small portion of it.

By: Aaron Toponce https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-127203 Mon, 01 Jul 2013 18:31:15 +0000
Use an SSD that has wear-leveling algorithms built into the drive; then ZFS can take advantage of them, and you won't chew through the same physical chips on the board.

By: Warren Downs https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-127201 Mon, 01 Jul 2013 18:01:26 +0000
Correct me if I'm wrong, but if you "will likely not need a large ZIL" and decide to use only a few GB for it, wouldn't you need to calculate your life expectancy using only the small portion of the SSD that will be repeatedly written to? This would produce a much shorter life expectancy. For your example, if 5 GB is used for 5000 write cycles, you'd have only 25 TB, or less than a third of a year of life expectancy.

Am I missing something?
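Spelling out the arithmetic: the endurance of the 5 GB slice is

\[ 5\,\text{GB} \times 5000\ \text{write cycles} = 25{,}000\,\text{GB} = 25\,\text{TB} \]

and "less than a third of a year" implies a sustained write rate of roughly \( 25\,\text{TB} \div 120\ \text{days} \approx 205\,\text{GB/day} \). That rate is only what the comment implies, not a figure from the article.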

By: Alan https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-126932 Tue, 25 Jun 2013 01:14:57 +0000
No. TRIM *is* needed, even under ZFS. It's TRIM that tells the drive what blocks are actually free, as the drive has no knowledge of what is or is not allocated/in use by the filesystem. This is unrelated to ZFS's COW operation.

Many SSDs use NAND memory, which must be zeroed before data is stored. Without TRIM, the SSD won't know when the filesystem (ZFS or other) is no longer using a block, so it will have to zero out a block each time it's written to. With TRIM, the drive is aware of what blocks aren't in use, and can zero them out ahead of time instead of at the actual time of write. This is the "garbage collection" SSDs perform, and it is unrelated to any "garbage collection" that might occur at the filesystem level.

Wear leveling is also important, but that's also a function of the drive and not the filesystem. Since logical blocks on a drive do not necessarily map to specific physical blocks on modern drives (e.g., spare blocks can be mapped in by the drive in a way transparent to the OS and user), "wear leveling" (i.e., trying to distribute writes physically across a device) in the filesystem is futile at best.
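As a side note, ZFS on Linux had no TRIM support when this was written; later OpenZFS releases (0.8 and newer) added it. A minimal sketch, assuming such a release and a hypothetical pool named tank:

    # One-shot TRIM of all eligible devices in the pool
    zpool trim tank

    # Or have ZFS issue TRIM continuously as blocks are freed
    zpool set autotrim=on tank

    # Check per-vdev TRIM status
    zpool status -t tank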

By: Aaron Toponce : ZFS Administration, Appendix A- Visualizing The ZFS Intent LOG (ZIL) https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-124840 Fri, 19 Apr 2013 14:15:31 +0000
[...] The ZFS Intent Log (ZIL) [...]

By: Aaron Toponce https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-124494 Wed, 20 Mar 2013 20:37:57 +0000
Yeah. I took it down, because it was thrashing my drives due to 250 MB snapshot differentials every 15 minutes. I need a better solution.

By: Michael https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-124492 Wed, 20 Mar 2013 20:14:49 +0000
Thanks for the series and the quick response with the tip. I tried the URL "http://zen.ae7.st/munin/" in the article with no response, but the URL in the comment worked fine, and now I am researching http://munin-monitoring.org/

By: Aaron Toponce https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-124481 Wed, 20 Mar 2013 15:50:40 +0000
Those graphs are from Munin. http://munin-monitoring.org/

By: Michael https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-124480 Wed, 20 Mar 2013 15:11:26 +0000
A note regarding my previous post: I am using RHEL 6, but I am still finding much benefit from your posts. I would like to track and graph the iowaits and such over time like you did.

By: Michael https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-124479 Wed, 20 Mar 2013 15:09:37 +0000
Aaron, I really appreciate your series. It condenses the volumes of material written on ZFS and drives it home in plain English, with examples. One question: what did you use to track/graph the disk usage over time?

By: Aaron Toponce : ZFS Administration, Part VIII- Zpool Best Practices and Caveats https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-122835 Tue, 08 Jan 2013 04:24:56 +0000
[...] The ZFS Intent Log (ZIL) [...]

By: Aaron Toponce : ZFS Administration, Part II- RAIDZ https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-122324 Sat, 29 Dec 2012 13:22:29 +0000
[...] The ZFS Intent Log (ZIL) [...]

By: ZFS administration (3) « 0ddn1x: tricks with *nix https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-122265 Wed, 26 Dec 2012 17:16:01 +0000
[...] http://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ [...]

By: Aaron Toponce : Install ZFS on Debian GNU/Linux https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-121894 Tue, 18 Dec 2012 19:33:45 +0000
[...] The ZFS Intent Log (ZIL) [...]

By: Aaron Toponce : ZFS Administration, Part I- VDEVs https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-121816 Thu, 13 Dec 2012 13:06:08 +0000
[...] The ZFS Intent Log (ZIL) [...]

By: Aaron Toponce : ZFS Administration, Part IV- The Adjustable Replacement Cache https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-120195 Fri, 07 Dec 2012 13:00:37 +0000
[...] Our continuation in the ZFS Administration series continues with another zpool VDEV called the Adjustable Replacement Cache, or ARC for short. Our previous post discussed the ZFS Intent Log, or ZIL and the Separate Intent Logging Device, or S.... [...]

By: David E. Anderson https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/#comment-119759 Thu, 06 Dec 2012 13:59:30 +0000
Excellent series, Aaron! Very helpful...
