Recent Articles

PSA: snapshots are better than ZVOLs.

Published by Jim Salter // June 16th, 2016


A lot of people new to ZFS, and even a lot of people not-so-new to ZFS, like to wax ecstatic about ZVOLs. But they never seem to mention the very real pitfalls ZVOLs present.

What’s a ZVOL?

Well, if you know what LVM is, a ZVOL is like an LV, but for ZFS. If you don’t know what LVM is, you can think of a ZVOL as, basically, a dynamically allocated “raw partition” inside ZFS. Unlike a normal dataset, a ZVOL doesn’t have a filesystem of its own. And you can access it by a raw devicename, like /dev/zvol/poolname/zvolname. This looks ideal for those use-cases where you want to nest a legacy filesystem underneath ZFS – for example, virtual machine images. Once you have the ZVOL, you have a raw block storage device to interact with – think mkfs.ext4 /dev/zvol/poolname/zvolname, for example – but you still get all the normal ZFS features along with it, like data integrity, compression, snapshots, and so forth. Plus you don’t have to mess with a loopback device, so that should be higher performance, right? What’s not to love?

ZVOLs perform better, though, right?

AFAICT, the increased performance is pretty much a lie. I’ve benchmarked ZVOLs pretty extensively against raw disk partitions, raw LVs, raw files, and even .qcow2 files and there really isn’t much of a performance difference to be seen. A partially-allocated ZVOL isn’t going to perform any better than a partially-allocated .qcow2 file, and a fully-allocated ZVOL isn’t going to perform any better than a fully-allocated .qcow2 file. (Raw disk partitions or LVs don’t really get any significant boost, either.)

Let’s talk about snapshots.

If snapshots aren’t one of the biggest reasons you’re using ZFS, they should be, and ZVOLs and snapshots are really, really tricky and weird. If you have a dataset that’s occupying 85% of your pool, you can snapshot that dataset any time you like. If you have a ZVOL that’s occupying 85% of your pool, you cannot snapshot it, period. This is one of those things that both tyros and vets tend to immediately balk at – I must be misunderstanding something, right? Surely it doesn’t work that way? Afraid it does.

Ooh, is it demo-in-a-VM-time again?! =)

root@xenial:~# zfs create target/dataset -o compress=off -o quota=15G
root@xenial:~# pv < /dev/zero > /target/dataset/0.bin
  15GiB 0:01:13 [10.3MiB/s] [            <=>                            ]
pv: write failed: Disk quota exceeded
root@xenial:~# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
target          15.3G  3.93G    19K  /target
target/dataset  15.0G      0  15.0G  /target/dataset
root@xenial:~# zfs snapshot target/dataset@1
root@xenial:~# 

Above, we created a dataset on a 20G pool, we dumped 15G of data into it, and we snapshotted the dataset. No surprises here, this is exactly what we expect.

But what happens when we try the same thing with a ZVOL?

root@xenial:~# zfs create target/zvol -V 15G -o compress=off
root@xenial:~# pv < /dev/zero > /dev/zvol/target/zvol
  15GiB 0:03:22 [57.3MiB/s] [========================================>  ] 99% ETA 0:00:00
pv: write failed: No space left on device
NAME          USED  AVAIL  REFER  MOUNTPOINT
target       15.8G  3.46G    19K  /target
target/zvol  15.5G  3.90G  15.0G  -
root@xenial:~# zfs snapshot target/zvol@1
cannot create snapshot 'target/zvol@1': out of space

Despite having 3.9G free on our pool, we can’t snapshot the zvol. If you don’t have at least as much free space in a pool as the REFER of a ZVOL on that pool, you can’t snapshot the ZVOL, period. This means for our little baby demonstration here we’d need 15G free to snapshot our 15G ZVOL. In a real-world situation with VM images, this could easily be a case where you can’t snapshot your 15TB VM image without 15 terabytes of free space available – where if you’d stuck with standard datasets, you’d be able to snapshot that same 15TB VM even with just a few hundred megabytes of AVAIL at your disposal.

TL;DR:

Think long and hard before you implement ZVOLs. Then, you know… don’t.

  • "I honestly feel as though my business would not be where it is today were it not for us happening into the hiring of Jim Salter."

    W. Chris Clark, CPA // President // Clark Eustace Wagner, PA

  • "Jim’s advice has always been spot on – neither suggesting too little or too much. He understands that companies have limited resources and he does not offer up solutions beyond those that we have truly needed."

    Paul Yoo // President // US Patriot Tactical

  • "Jim Salter is an indispensable part of our team, maintaining our network and being proactive on all of our IT needs. We have comfort knowing that there are redundant bootable backups of all files and databases, offsite and onsite."

    Regina R. Floyd, AIA, LEED AP BD+C // Principal // Watson Tate Savory

  • Recent Thoughts

  • Demonstrating ZFS pool write distribution
  • One of my pet peeves is people talking about zfs “striping” writes across a pool. It doesn’t help any that zfs core developers use this terminology too – but it’s sloppy and not really correct. ZFS distributes writes among all the vdevs in a pool.  If your vdevs all have the same amount of free space available, this will resemble a simple striping action closely enough.  But if you have different amounts of free space on different […]