The Difference Between Zero Blocks and Unallocated Blocks in VMDK Files

VMware vSphere introduced thin provisioning so that virtual machines consume less disk space. Previously, to quote a VMware performance study on thin provisioning, administrators needed to estimate how much storage space their virtual disks would take up to support current usage AND future growth, and pre-allocate that entire storage space. With thin provisioning, virtual disks use only the amount of storage space they currently need.

To enable this improvement, VMware now defines two different types of "empty" blocks: blocks which are unallocated in the VMDK and blocks which have zeros written into them.

Again citing the VMware performance study, zeroing is the process whereby disk blocks are overwritten with zeros to ensure that no prior data is leaked into the new VMDK that is allocated with these blocks. Zeroing can happen at the time that a virtual disk is created (create-time), as for eager-zeroed thick disks. Zeroing can also happen on the first write to a VMFS block (run-time), as for default (lazy-zeroed thick) and thin disks.

Most importantly in the context of backup, blocks in a thin VMDK file are not written by non-write operations such as reads and backups. Reading an unallocated block returns zeros but does not write those zeros into the block, so physical storage is not allocated until a write occurs.
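To make the distinction concrete, here is a minimal sketch of a thin disk as a toy Python model. It is not how VMFS or the VMDK format is implemented; the class name ThinDisk, the 4096-byte block size, and the method names are all assumptions made up for illustration. It only shows the behavior described above: a read of an unallocated block returns zeros without allocating anything, while a write of zeros produces an allocated zero block.

```python
BLOCK_SIZE = 4096  # bytes per block in this toy model (hypothetical value)


class ThinDisk:
    """Toy model of a thin-provisioned disk: only written blocks use storage."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = {}  # block index -> bytes; a missing key means unallocated

    def read(self, index):
        # Reading an unallocated block returns zeros and allocates nothing.
        return self.blocks.get(index, bytes(BLOCK_SIZE))

    def write(self, index, data):
        # Storage is allocated only when a write occurs.
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.blocks[index] = bytes(data)

    def allocated_blocks(self):
        return len(self.blocks)

    def is_zero_block(self, index):
        # A zero block is allocated but holds only zeros.
        return index in self.blocks and not any(self.blocks[index])


disk = ThinDisk(num_blocks=1024)
disk.read(7)                          # backup-style read: no allocation
disk.write(3, bytes(BLOCK_SIZE))      # explicit write of zeros: a zero block
disk.write(5, b"\x01" * BLOCK_SIZE)   # a block holding real data
print(disk.allocated_blocks())        # 2
```

In this model, blocks 3 and 7 both read back as zeros, but only block 3 occupies storage. That is the difference between a zero block and an unallocated block.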

What Does This Mean for CBT?

Changed Block Tracking (CBT) is provided by the VMware vStorage APIs to track the blocks that have changed in a VMDK file. As it turns out, CBT ALSO tracks the unallocated blocks. CBT is the only known technology for tracking unallocated blocks in the image and preventing them from having to be read during backup operations.
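As a rough illustration of why this helps a full backup, the sketch below takes a list of allocated extents, of the kind a CBT query might report, and works out how many bytes actually need to be read. It does not call any VMware API. The extent values, the (start, length) tuple layout, and the helper names merge_extents and bytes_to_read are assumptions chosen for the example.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start offset in bytes, length in bytes)


def merge_extents(extents: List[Extent]) -> List[Extent]:
    """Sort extents and merge any that overlap or touch."""
    merged: List[Extent] = []
    for start, length in sorted(extents):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_len = merged[-1]
            end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, length))
    return merged


def bytes_to_read(extents: List[Extent]) -> int:
    """Total bytes covered by the extents, counting overlaps once."""
    return sum(length for _, length in merge_extents(extents))


MiB = 1024 * 1024
disk_size = 100 * MiB  # hypothetical provisioned size
# Hypothetical allocated areas; everything else is unallocated.
allocated = [(0, 10 * MiB), (10 * MiB, 5 * MiB), (60 * MiB, 20 * MiB)]

print(bytes_to_read(allocated) // MiB, "of", disk_size // MiB, "MiB read")
# 35 of 100 MiB read
```

With these made-up numbers, a full backup that skips unallocated areas reads 35 MiB instead of the full 100 MiB.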

How Does This Compare to ABM?

Active Block Mapping (ABM), a patent-pending technology from Vizioncore, tracks active blocks in the VMDK file. As it turns out, ABM ALSO tracks the zero blocks. ABM is the only known technology for tracking zero blocks in the image and preventing them from having to be read during backup operations.

Combine CBT and ABM for EVERY Backup Operation

It might seem as though CBT will only benefit incremental or differential backup operations. In fact, CBT also improves full backups by avoiding reads of unallocated blocks from the image. Here is a summary of how ABM and CBT work together to benefit backup operations:

  • ABM benefits the first full and continues the benefit through every subsequent incremental/differential
  • CBT turns out to benefit fulls as well as incremental/differentials, by eliminating the need to read unallocated blocks
  • ABM is unique in being able to avoid the read of zero blocks
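
The summary above can be sketched as simple set arithmetic. This is only a conceptual model: the block numbers are invented, the function blocks_to_read is hypothetical, and nothing here reflects how CBT or ABM actually obtain their information. It assumes the backup already knows which blocks are allocated (CBT-style information) and which allocated blocks hold only zeros (ABM-style information).

```python
def blocks_to_read(candidates, allocated, zero_blocks):
    """Return the blocks a backup must read, in ascending order.

    candidates  -- blocks in scope (all blocks for a full, changed blocks
                   for an incremental or differential)
    allocated   -- blocks reported as allocated (CBT-style information)
    zero_blocks -- allocated blocks known to hold only zeros (ABM-style
                   information; how ABM derives this is not modelled here)
    """
    return sorted((set(candidates) & set(allocated)) - set(zero_blocks))


all_blocks = range(10)
allocated = {0, 1, 2, 3, 6, 7}   # blocks 4, 5, 8, 9 are unallocated
zero_blocks = {2, 7}             # allocated, but contain only zeros
changed = {1, 2, 6}              # changed since the previous backup

print(blocks_to_read(all_blocks, allocated, zero_blocks))  # [0, 1, 3, 6]
print(blocks_to_read(changed, allocated, zero_blocks))     # [1, 6]
```

In this toy example the full backup reads 4 of 10 blocks, and the incremental reads 2 of the 3 changed blocks because block 2 is a zero block.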

Did You Hear that ABM Only Benefits Fulls?

That's wrong. ABM certainly benefits full backups. But ABM continues to offer the same benefit on subsequent incremental/differential backups, by keeping the same inactive data out of those backups as was kept out of the initial full.