Comments on: ZFS Administration, Part VI- Scrub and Resilver
https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/
Linux. GNU. Freedom. Wed, 13 Dec 2017 19:29:15 +0000

By: Colbyu https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-268931 Fri, 30 Dec 2016 22:00:16 +0000 http://pthree.org/?p=2630#comment-268931 I would actually recommend, when creating the pool, adding the disks by /dev/disk/by-id rather than by their sdX nodes, because that id will then be used as the disk name in the zpool status view. The id typically contains the disk's serial number, so you will see immediately which disk is bad. It's also generally considered good practice, is less ambiguous, and can avoid an issue that can occur when importing the array on a new system.
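As a rough sketch of that suggestion (the pool name "tank" and the by-id device names below are placeholders; check ls -l /dev/disk/by-id/ for the actual names on your system):

    # list the persistent names udev assigned to each disk
    ls -l /dev/disk/by-id/

    # create a mirrored pool using by-id paths instead of sdX nodes (example ids are made up)
    zpool create tank mirror \
        /dev/disk/by-id/ata-ST4000DM000-1F2168_Z300ABCD \
        /dev/disk/by-id/ata-ST4000DM000-1F2168_Z300WXYZ

    # an existing pool can also be re-imported so that zpool status shows by-id names
    zpool export tank
    zpool import -d /dev/disk/by-id tank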

By: remya https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-261794 Tue, 01 Mar 2016 01:54:29 +0000 http://pthree.org/?p=2630#comment-261794 Hi, recently we faced an issue in ZFS: one pool was in a degraded state.

We tried to switch over to another ZFS, but still hit the same issue.

What could be the cause?

Thanks and regards,
Remya

By: santosh loke https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-238484 Sun, 26 Jul 2015 11:17:20 +0000 http://pthree.org/?p=2630#comment-238484 Very well written and explained. Thanks for the great effort!

By: JeanDo https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-229939 Tue, 24 Mar 2015 17:52:30 +0000 http://pthree.org/?p=2630#comment-229939 Which is better: raidz3 or spares?

I have a brand new server with 8 disks, all the same model/date/size. I have three options (among plenty of others):

1. Make a raidz3 5+3 pool.
2. Make a striped-mirror (RAID 1+0) pool: "mirror sda sdb mirror sdc sdd mirror sde sdf mirror sdg sdh"
3. Make striped raidz1 vdevs with spares: "raidz1 sda sdb sdc raidz1 sdd sde sdf spare sdg sdh"

The question is: which is safest in the long run?

I feel that with schemes 1 or 2, all disks will wear out at the same rate and might tend to fail at about the same time.
With scheme 3, the two spare disks will sit idle until the first replacement, and will have a fresh history from then on...

Do you have anything in your experience for or against that? Any advice? (The three layouts are sketched as zpool commands below.)
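For reference, the three layouts above would look roughly like this as zpool commands (the pool name "tank" is assumed, and the commenter's sdX names are kept for brevity; by-id names would be preferable in practice):

    # 1. single raidz3 vdev: any three of the eight disks may fail
    zpool create tank raidz3 sda sdb sdc sdd sde sdf sdg sdh

    # 2. four striped mirrors (RAID 1+0): one disk per mirror may fail
    zpool create tank mirror sda sdb mirror sdc sdd mirror sde sdf mirror sdg sdh

    # 3. two striped raidz1 vdevs plus two hot spares
    zpool create tank raidz1 sda sdb sdc raidz1 sdd sde sdf spare sdg sdh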

By: Francois Scheurer https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-206891 Mon, 06 Oct 2014 09:19:34 +0000 http://pthree.org/?p=2630#comment-206891 "Unfortunately, Linux software RAID has no idea which is good or bad, and from the perspective of ext3 or ext4, it will get good data if read from the disk containing the good block, and corrupted data from the disk containing the bad block, without any control over which disk to pull the data from, and fixing the corruption. These errors are known as "silent data errors", and there is really nothing you can do about it with the standard GNU/Linux filesystem stack."

I think this is not completely correct:
Linux and ext3/4 do not store block checksums (unfortunately...), but hard disk controllers write ECC codes alongside the data. When read errors occur, they will be detected and, if possible, corrected.
If correction with ECC is not possible, the Linux kernel will see an unrecoverable read error, and with software RAID (mdadm) it will then re-read the block from another RAID replica (another disk).
It will then overwrite the bad block with the correct data, much like ZFS scrubbing, but with the advantage of doing it on demand instead of having to scrub the whole (huge) disks.
If the overwrite fails (write error), it will automatically mark the disk as offline and send an email alert.

We are currently looking for similar behavior on ZFS pools, because right now we are seeing read errors with a ZFS pool on FreeBSD, but unfortunately the bad disks stay online until a sysadmin puts them offline manually...

"However, with Linux software RAID, hardware RAID controllers, and other RAID implementations, there is no distinction between which blocks are actually live, and which aren't. So, the rebuild starts at the beginning of the disk, and does not stop until it reaches the end of the disk. Because ZFS knows about the the RAID structure and the filesystem metadata, we can be smart about rebuilding the data. Rather than wasting our time on free disk, where live blocks are not stored, we can concern ourselves with ONLY those live blocks."

Yes, Linux mdadm knows nothing about filesystem usage and free blocks/inodes.
But a cool feature of mdadm is the write-intent bitmap, which allows a resync (after a power loss, for example) to cover only the modified/unsynchronized blocks of the array.
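As a point of comparison, a minimal sketch of the two repair paths being discussed (the array name md0 and pool name tank are assumptions):

    # mdadm: run an online consistency check; a "repair" pass rewrites
    # mismatched blocks from a good replica
    echo check  > /sys/block/md0/md/sync_action
    cat /sys/block/md0/md/mismatch_cnt
    echo repair > /sys/block/md0/md/sync_action

    # mdadm: add a write-intent bitmap so a resync after a crash only
    # touches regions marked dirty
    mdadm --grow --bitmap=internal /dev/md0

    # ZFS: walk only the live blocks and repair from redundancy wherever
    # a checksum fails
    zpool scrub tank
    zpool status tank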

By: Aaron Toponce https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-139956 Fri, 11 Jul 2014 15:37:46 +0000 http://pthree.org/?p=2630#comment-139956 Each node in the Merkle tree is also SHA-256 checksummed, all the way up to the root node, and 128 Merkle tree revisions are kept before being reused. As such, if a SHA-256 checksum is corrupted, that node in the tree is bad, and it can be fixed, provided there is redundancy, by looking at the checksum in its parent node and rebuilding from the redundancy in the pool.
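As a loose illustration of the hash-of-hashes idea (ordinary files and sha256sum standing in for ZFS blocks and its on-disk checksums; the file names are made up, and this is not ZFS's actual on-disk format):

    # two throwaway files standing in for data blocks
    dd if=/dev/urandom of=block0.bin bs=128k count=1
    dd if=/dev/urandom of=block1.bin bs=128k count=1

    # "leaf" checksums over the blocks
    sha256sum block0.bin block1.bin > leaves.sha256

    # "parent" checksum over the leaf checksums; corrupting either block
    # changes its leaf hash, which changes the parent hash in turn
    sha256sum leaves.sha256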

By: Torge Kummerow https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-139887 Thu, 10 Jul 2014 19:08:32 +0000 http://pthree.org/?p=2630#comment-139887 What is the behaviour if the corruption happens in the SHA hash itself?

By: ianjo https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-129728 Sun, 29 Sep 2013 16:59:24 +0000 http://pthree.org/?p=2630#comment-129728 You state that the default checksum algorithm is SHA-256, but from searching the internet I believe that no ZFS implementation uses SHA-256 by default. I'm using ZFS 0.6.2 on Linux, and the manpage states:

Controls the checksum used to verify data integrity. The default value is on, which automatically selects an appropriate algorithm (currently, fletcher4, but this may change in future releases). The value off disables integrity checking on user data. Disabling checksums is NOT a recommended practice.

For interested readers, this can be changed with zfs set checksum=sha256 -- I do it for all my newly created filesystems.
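To check or change the property on a given dataset (the tank/data name is a placeholder):

    # show the current checksum algorithm and where it was set
    zfs get checksum tank/data

    # use SHA-256 for new writes on this dataset; existing blocks keep their
    # old checksums until they are rewritten
    zfs set checksum=sha256 tank/data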

By: Ryan https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-129637 Sun, 15 Sep 2013 14:51:49 +0000 http://pthree.org/?p=2630#comment-129637 Wow... what does zfs not do?

Thanks for writing this blog on zfs, by the way. It's well written and easy to follow. It convinced me to switch!

By: Aaron Toponce https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-129301 Wed, 07 Aug 2013 16:11:40 +0000 http://pthree.org/?p=2630#comment-129301 First off, ZFS doesn't have any built-in SMART functionality. If you want SMART, you need to install the smartmontools package for your operating system. Second, when self-healing data, due to the COW nature of the filesystem, the healed block will be in a physically different location, with the metadata and uberblock updated. ZFS is capable of tracking where bad blocks exist on the filesystem and of not writing to them again in the future.
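If SMART monitoring is wanted alongside ZFS, something like the following is typical (the Debian-style package name and the /dev/sda device are assumptions):

    # install the SMART tools and query a drive's health
    apt-get install smartmontools
    smartctl -H /dev/sda    # overall health verdict
    smartctl -a /dev/sda    # full attributes and error log

    # optionally kick off a short self-test in the background
    smartctl -t short /dev/sda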

By: Ryan https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-128629 Thu, 25 Jul 2013 20:24:34 +0000 http://pthree.org/?p=2630#comment-128629 Can you elaborate on what "then ZFS will fix the bad block in the mirror" entails? (Self Healing Data section)

Does this mean that ZFS relocates the data from the bad block to another block on that device? At the same time, does the SMART function of the drive mark that sector as bad and relocate it to a spare, so it isn't written to again?

By: Aaron Toponce : ZFS Administration, Appendix A- Visualizing The ZFS Intent LOG (ZIL) https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-124827 Fri, 19 Apr 2013 11:02:51 +0000 http://pthree.org/?p=2630#comment-124827 [...] Scrub and Resilver [...]

By: Aaron Toponce : ZFS Administration, Part XII- Snapshots and Clones https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-124818 Fri, 19 Apr 2013 10:58:35 +0000 http://pthree.org/?p=2630#comment-124818 [...] Scrub and Resilver [...]

By: Aaron Toponce : ZFS Administration, Part XI- Compression and Deduplication https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-124816 Fri, 19 Apr 2013 10:58:20 +0000 http://pthree.org/?p=2630#comment-124816 [...] Scrub and Resilver [...]

By: Aaron Toponce : ZFS Administration, Part I- VDEVs https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-124805 Fri, 19 Apr 2013 10:55:28 +0000 http://pthree.org/?p=2630#comment-124805 [...] Scrub and Resilver [...]

By: Aaron Toponce : Install ZFS on Debian GNU/Linux https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-124803 Fri, 19 Apr 2013 10:54:55 +0000 http://pthree.org/?p=2630#comment-124803 [...] Scrub and Resilver [...]

By: Graham Perrin https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-124284 Thu, 28 Feb 2013 07:58:20 +0000 http://pthree.org/?p=2630#comment-124284 zpool status -x

"… will return “all pools are healthy” even if one device is failed in a RAIDZ pool. In the other words, your data is healthy doesn’t mean all devices in your pool are healthy. So go with “zpool status” at any time. …"

https://diigo.com/0x79w for highlights from http://icesquare.com/wordpress/how-to-improve-zfs-performance/
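In practice that means preferring the full output over the terse summary (the pool name is assumed):

    # terse summary; per the quote above, it may report "all pools are healthy"
    # in situations you still care about
    zpool status -x

    # full per-device state, including READ/WRITE/CKSUM error counters
    zpool status -v tank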

By: eagle275 https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-124244 Mon, 18 Feb 2013 15:36:41 +0000 http://pthree.org/?p=2630#comment-124244 I believe he meant:

Put (self-adhesive) labels on each drive that list its serial number AND the device name Linux "discovered" it as. Then when you see "oh, sde failed", you look at your array, find the label marked "sde", and you have the failing disk without any trial and error.
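One quick way to collect the serial-to-device mapping for such labels (the sde device is just an example):

    # map kernel names (sdX) to persistent ids that embed the serial number
    ls -l /dev/disk/by-id/ | grep -v part

    # or read the serial straight off a particular drive
    smartctl -i /dev/sde | grep -i serial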

By: Aaron Toponce : ZFS Administration, Part X- Creating Filesystems https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-122833 Tue, 08 Jan 2013 04:24:25 +0000 http://pthree.org/?p=2630#comment-122833 [...] Scrub and Resilver [...]

By: Aaron Toponce : ZFS Administration, Part XV- iSCSI, NFS and Samba https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-122827 Tue, 08 Jan 2013 04:19:13 +0000 http://pthree.org/?p=2630#comment-122827 [...] Scrub and Resilver [...]

By: Aaron Toponce: ZFS Administration, Part XIV- ZVOLS - Bartle Doo https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-122066 Fri, 21 Dec 2012 18:02:37 +0000 http://pthree.org/?p=2630#comment-122066 [...] ZFS Intent Log (ZIL) The Adjustable Replacement Cache (ARC) Exporting and Importing Storage Pools Scrub and Resilver Getting and Setting Properties Best Practices and [...]

By: Aaron Toponce: ZFS Administration, Part XIII- Sending and Receiving Filesystems - Bartle Doo https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-122042 Thu, 20 Dec 2012 21:32:38 +0000 http://pthree.org/?p=2630#comment-122042 [...] ZFS Intent Log (ZIL) The Adjustable Replacement Cache (ARC) Exporting and Importing Storage Pools Scrub and Resilver Getting and Setting Properties Best Practices and [...]

By: Aaron Toponce: ZFS Administration, Part XI- Compression and Deduplication - Bartle Doo https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-121908 Tue, 18 Dec 2012 23:40:31 +0000 http://pthree.org/?p=2630#comment-121908 [...] ZFS Intent Log (ZIL) The Adjustable Replacement Cache (ARC) Exporting and Importing Storage Pools Scrub and Resilver Getting and Setting Properties Best Practices and [...]

By: Aaron Toponce : ZFS Administration, Part III- The ZFS Intent Log https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-121897 Tue, 18 Dec 2012 19:34:38 +0000 http://pthree.org/?p=2630#comment-121897 [...] Scrub and Resilver [...]

By: Aaron Toponce: ZFS Administration, Part IX- Copy-on-write - Bartle Doo https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-121835 Fri, 14 Dec 2012 18:01:08 +0000 http://pthree.org/?p=2630#comment-121835 [...] ZFS Intent Log (ZIL) The Adjustable Replacement Cache (ARC) Exporting and Importing Storage Pools Scrub and Resilver Getting and Setting Properties Best Practices and [...]

By: Aaron Toponce : ZFS Administration, Part VII- Zpool Properties https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-121748 Wed, 12 Dec 2012 13:01:21 +0000 http://pthree.org/?p=2630#comment-121748 [...] from our last post on scrubbing and resilvering data in zpools, we move on to changing properties in the [...]

By: Aaron Toponce https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-121673 Tue, 11 Dec 2012 19:45:39 +0000 http://pthree.org/?p=2630#comment-121673 Huh?

By: Anca Emanuel https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/#comment-121671 Tue, 11 Dec 2012 18:51:44 +0000 http://pthree.org/?p=2630#comment-121671 Is this wrestling with an octopus??
Conclusion: the need for simple, standard administration was ignored.
