Capacity Planning in VM Zombieland? You’re Gonna Need a Few Rules. [white paper]

Photo credit: Steve Baker (CC BY-ND 2.0)

How do you handle capacity management in your virtual environment?

If you find yourself frantically trying to keep an eye on VM density, disk space, service levels, resource utilization and – gulp – zombie VMs, then you’re not really managing capacity. It’s managing you.

We’ve written a white paper titled VM Optimization: Find the Balance to help you balance the limited resources of the data center against the unlimited desires of business users. It looks at the main areas for optimizing capacity in your virtual environment, one of which is zombie VMs.

What are zombie VMs, and how do I handle them?

Zombie VMs are virtual machines that have joined the walking dead. Nobody has any use for them anymore, yet they ravenously devour resources that would be better allocated to actively-used VMs.

How do VMs turn into zombies? It’s more common than you might think. Users ask you for virtual machines and work with them on a project, but then they forget to terminate them (or, if they don’t have permissions to do so, forget to inform you that they no longer need the resources). So the VMs just spin away, taking up CPU, memory, storage and network for no good reason.

Of all the IT perils lurking in your data center, zombie VMs are the most colorful. But if they get too far out of hand and you neglect to eliminate them, you could find yourself in the middle of VM Zombieland. If so, you need only watch the “Zombieland” movie for a quick refresher in Columbus’s (Jesse Eisenberg) rules for surviving the zombie apocalypse:

Rule #17: Don’t be a hero.

Columbus is too smart to risk his own life just to make himself look good. And you’re smart enough not to start removing every VM that hasn’t been used in a while just to free up a few resources. If you blow away a VM that your CTO was counting on, she won’t be grateful that you saved the company 2GB of RAM and 100GB of disk space. There are more scientific and more accurate ways of identifying zombies than simply whacking suspicious ones.

If you see the usual spike in CPU cycles and memory use when the VM first boots up, and then a complete lack of resource usage after that, chances are it’s a zombie VM. The standard deviation of resource consumption is a good indicator; if a VM’s value is an outlier, then it’s time to see rule #2.
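To make that pattern concrete, here is a minimal Python sketch of one way to flag candidates from a list of CPU utilization samples (in percent, oldest first). The function name, the size of the boot window and both thresholds are hypothetical values chosen for illustration, and reading "outlier" as "unusually flat after boot" is an assumption; tune all of it against your own environment before trusting it.

```python
from statistics import mean, pstdev


def looks_like_zombie(cpu_samples, boot_window=3, idle_mean=2.0, idle_stdev=1.0):
    """Return True if usage spikes at boot and then stays flat and near zero.

    All defaults are illustrative assumptions, not recommended values.
    """
    if len(cpu_samples) <= boot_window + 1:
        return False  # not enough history to judge, so don't be a hero

    boot = cpu_samples[:boot_window]    # samples taken while the VM boots
    after = cpu_samples[boot_window:]   # everything after the boot window

    return (
        max(boot) > idle_mean              # the usual boot-time spike
        and mean(after) <= idle_mean       # almost no usage afterwards
        and pstdev(after) <= idle_stdev    # and almost no variation either
    )


# A boot spike followed by silence is a candidate for rule #2.
suspect = looks_like_zombie([85, 60, 20, 1, 1, 1, 1, 1])

# A VM that wakes up now and then is only infrequently used, not undead.
occasional = looks_like_zombie([80, 50, 20, 0, 0, 30, 0, 0])
```

A flagged VM is only a suspect. The check says nothing about who owns it, which is why the next rule matters.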

Rule #2: Double-tap.

“When in doubt,” says Columbus, “don’t get stingy with your bullets.” I don’t think he enjoyed plugging zombies twice each or running over them an extra time in the car, but the alternative was much worse.

The SysAdmin’s double-tap is not as gruesome, but it is as effective. When you find a potential zombie VM but cannot identify the owner, power it down from the console to see whether anybody complains. As a first tap, that’s a lot more compassionate than simply removing the VM.

Set yourself a reminder for 90 days later. If nobody has complained, then it’s time for the second tap. You’ll rid your environment of a zombie VM and free up resources for useful work.
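The double-tap boils down to a small decision rule, sketched below in Python. The function name, its parameters and the action labels are hypothetical; only the 90-day wait comes from the text. How you actually power off or remove a VM depends on your hypervisor tooling and is deliberately left out.

```python
from datetime import date, timedelta

GRACE_PERIOD = timedelta(days=90)  # the 90-day reminder described above


def next_action(powered_off_on, complaint_received, today):
    """Decide what to do with a suspected zombie VM that has no known owner.

    powered_off_on is the date of the first tap, or None if it has not
    happened yet. The returned labels are illustrative only.
    """
    if powered_off_on is None:
        return "power_off"   # first tap: shut it down and see who complains
    if complaint_received:
        return "power_on"    # somebody still needs it, so it was no zombie
    if today - powered_off_on >= GRACE_PERIOD:
        return "remove"      # second tap: nobody missed it for 90 days
    return "wait"            # still inside the grace period


first_tap = date(2024, 1, 1)
decision = next_action(first_tap, complaint_received=False, today=date(2024, 2, 1))
```

Keeping the decision separate from the action makes it easy to run as a dry-run report before anything is really removed.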

Rule #7: Travel light.

When dealing with zombies, you don’t want the burden of heavy luggage. That’s why a small roller carry-on sufficed for Columbus, along with a handy, double-barreled shotgun. His road buddy Tallahassee (Woody Harrelson) traveled even lighter.

Few administrators need a cumbersome set of tools to find potential zombie VMs in their environment, and even fewer need a shotgun. Unlike real zombies – say, the ones in the photo enjoying the new Quest Community Blog on their cell phone – zombie VMs blend in more smoothly. You’re looking for a few telltale characteristics of how your VMs are consuming resources like CPU, memory, storage and network, so you should look for a tool that can tell a zombie VM from one that is just infrequently used.

Rule #32: Enjoy the little things.

Even as you relentlessly stamp out zombie VMs wherever they may lurk in your data center, you should maintain a happy and sane state of mind by keeping positive. After all:

  • You’re out to get the zombie VMs. They’re not out to get you.
  • Unlike the more traditional zombies, zombie VMs aren’t infected and don’t want to infect their friends and neighbors. They don’t bite.
  • You can get Twinkies anytime you want (I know people who thought the temporary suspension of sales a few years ago was one of the signs of the coming apocalypse – with or without zombies).
  • Occasionally, a business user will actually send you an email thanking you for spinning up the VM and informing you that they don’t need it anymore. That’s a little thing worth enjoying, and it saves you a double-tap.
  • Zombie VMs are a pain, but at least they’re not gross.
  • We have a marvelous white paper you can read, VM Optimization: Find the Balance, for more insights on right-sizing and reducing wasted resources in your data center.
  • “Six people left in the world and one of them is Bill *** Murray!” That sure made Tallahassee’s day, so why not yours too?