4 Lesser-Known (But Very Important) Reasons to Correctly Size vCPU Allocations

We’ve talked previously about how CPU Ready is something you want to monitor in your environment, and how the best way to prevent or cure CPU Ready issues is to allocate vCPUs to VMs properly. As I read through the updated vSphere 5.1 Performance Best Practices guide, however, I’m finding some other good reasons to allocate vCPUs correctly.

These 4 (lesser-known) reasons to correctly size vCPU allocations were also listed in previous iterations of the document, but they are worth a refresher. The list below is quoted from the guide, so the page reference in it points to that document.

  1. Even if the guest operating system doesn’t use some of its vCPUs, configuring virtual machines with those vCPUs still imposes some small resource requirements on ESXi that translate to real CPU consumption on the host. For example: Unused vCPUs still consume timer interrupts in some guest operating systems. (Though this is not true with “tickless timer” kernels, described in “Guest Operating System CPU Considerations” on page 41.)
  2. Maintaining a consistent memory view among multiple vCPUs can consume additional resources, both in the guest operating system and in ESXi. (Though hardware-assisted MMU virtualization significantly reduces this cost.)
  3. Most guest operating systems execute an idle loop during periods of inactivity. Within this loop, most of these guest operating systems halt by executing the HLT or MWAIT instructions. Some older guest operating systems (including Windows 2000 (with certain HALs), Solaris 8 and 9, and MS-DOS), however, use busy-waiting within their idle loops. This results in the consumption of resources that might otherwise be available for other uses (other virtual machines, the VMkernel, and so on).

    ESXi automatically detects these loops and de-schedules the idle vCPU. Though this reduces the CPU overhead, it can also reduce the performance of some I/O-heavy workloads. For additional information see VMware KB articles 1077 and 2231.
  4. The guest operating system’s scheduler might migrate a single-threaded workload amongst multiple vCPUs, thereby losing cache locality.
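To make reason 3 a bit more concrete, here is a minimal Python sketch of the difference between an idle loop that busy-waits and one that blocks. It is only an analogy running in an ordinary process, not guest OS or ESXi code: `time.sleep()` stands in for halting with HLT/MWAIT, and the function names (`busy_wait_idle`, `halting_idle`, `cpu_seconds`) are made up for this illustration.

```python
import time


def busy_wait_idle(duration_s):
    """Idle by spinning: poll the clock until the deadline passes."""
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        pass  # the CPU stays busy even though no useful work is done


def halting_idle(duration_s):
    """Idle by blocking: sleep() gives the CPU back (loose analogy to HLT)."""
    time.sleep(duration_s)


def cpu_seconds(idle_fn, duration_s=0.2):
    """Return the CPU time this process consumed while 'idling'."""
    start = time.process_time()
    idle_fn(duration_s)
    return time.process_time() - start


if __name__ == "__main__":
    print(f"busy-wait idle: {cpu_seconds(busy_wait_idle):.3f} CPU seconds")
    print(f"halting idle:   {cpu_seconds(halting_idle):.3f} CPU seconds")
```

On most systems the busy-wait version should report CPU time close to the full idle duration, while the blocking version reports close to zero. That gap is the kind of consumption an idle vCPU in a busy-waiting guest can impose on the host.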

Sharing is Caring

As usual, I hope this has been beneficial and that your world is now a better place. If you have other thoughts or comments that will benefit the community, please post them below.