Recent Articles

ZFS dedup: tested, found wanting

Published by Jim Salter // February 24th, 2015


Even if you have the RAM for it (and we’re talking a good 6GB or so per TB of storage), ZFS deduplication is, unfortunately, almost certainly a lose.

I don’t usually have that much RAM to spare, but one server has 192GB of RAM and only a few terabytes of storage – and it stores a lot of VM images, with obvious serious block-level duplication between images. Dedup shows at 1.35+ on all the datasets, and would be higher if one VM didn’t have a couple of terabytes of almost dup-free data on it.

That server’s been running for a few years now, and nobody using it has complained. But I was doing some maintenance on it today, splitting up VMs into their own datasets, and saw some truly abysmal performance.

root@virt0:/data/images# pv < jabberserver.qcow2 > jabber/jabberserver.qcow2 206MB 0:00:31 [7.14MB/s] [>                  ]  1% ETA 0:48:41

7MB/sec? UGH! And that’s not even a sustained average; that’s just where it happened to be when I killed the process. This server should be able to sustain MUCH better performance than that, even though it’s reading and writing from the same pool. So I checked, and saw that dedup was on:

root@virt0:~# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
data  7.06T  2.52T  4.55T    35%  1.35x  ONLINE  -

In theory, you’d think that dedup would help tremendously with exactly this operation: copying a quiesced VM from one dataset to another on the same pool. There’s no need for a single block of data to be rewritten, just more pointers added to the metadata for the existing blocks. However, dedup looked like the obvious culprit for my performance woes here, so I disabled it and tried again:

root@virt0:/data/images# pv < jabberserver.qcow2 > jabber/jabberserver.qcow219.2GB 0:04:58 [65.7MB/s] [============>] 100%

Yep, that’s more like it.

TL;DR: ZFS dedup sounds like a great idea, but in the real world, it sucks. Even on a machine built to handle it. Even on exactly the kind of storage (a bunch of VMs with similar or identical operating systems) that seems tailor-made for it. I do not recommend its use for pretty much any conceivable workload.

(On the other hand, LZ4 compression is an unqualified win.)

  • "I honestly feel as though my business would not be where it is today were it not for us happening into the hiring of Jim Salter."

    W. Chris Clark, CPA // President // Clark Eustace Wagner, PA

  • "Jim’s advice has always been spot on – neither suggesting too little or too much. He understands that companies have limited resources and he does not offer up solutions beyond those that we have truly needed."

    Paul Yoo // President // US Patriot Tactical

  • "Jim Salter is an indispensable part of our team, maintaining our network and being proactive on all of our IT needs. We have comfort knowing that there are redundant bootable backups of all files and databases, offsite and onsite."

    Regina R. Floyd, AIA, LEED AP BD+C // Principal // Watson Tate Savory

  • Recent Thoughts

  • Demonstrating ZFS pool write distribution
  • One of my pet peeves is people talking about zfs “striping” writes across a pool. It doesn’t help any that zfs core developers use this terminology too – but it’s sloppy and not really correct. ZFS distributes writes among all the vdevs in a pool.  If your vdevs all have the same amount of free space available, this will resemble a simple striping action closely enough.  But if you have different amounts of free space on different […]