Photo credit: Steve Baker (CC BY-ND 2.0)
How do you handle capacity management in your virtual environment?
If you find yourself frantically trying to keep an eye on VM density, disk space, service levels, resource utilization and – gulp – zombie VMs, then you’re not really managing capacity. It’s managing you.
We’ve written a white paper titled VM Optimization: Find the Balance to guide you in striking a balance between the limited resources of the data center and the unlimited desires of business users. It looks at the main areas for optimizing capacity in your virtual environment, one of which is zombie VMs.
Zombie VMs are virtual machines that have joined the walking dead. Nobody has any use for them anymore, yet they ravenously devour resources that would be better allocated to actively used VMs.
How do VMs turn into zombies? It’s more common than you might think. Users ask you for virtual machines and work with them on a project, but then they forget to terminate them (or, if they don’t have permissions to do so, forget to inform you that they no longer need the resources). So the VMs just spin away, taking up CPU, memory, storage and network for no good reason.
Of all the IT perils lurking in your data center, zombie VMs are the most colorful. But if they get too far out of hand and you neglect to eliminate them, you could find yourself in the middle of VM Zombieland. If so, you need only watch the “Zombieland” movie for a quick refresher in Columbus’s (Jesse Eisenberg) rules for surviving the zombie apocalypse:
Columbus is too smart to risk his own life just to make himself look good. And you’re smart enough not to start removing every VM that hasn’t been used in a while merely to free up a few resources. If you blow away a VM that your CTO was counting on, she won’t be grateful that you saved the company 2GB of RAM and 100GB of disk space. There are more scientific and more accurate ways of identifying zombies than simply whacking suspicious ones.
If you see the usual spike in CPU cycles and memory use when the VM first boots up, and then a complete lack of resource usage after that, chances are it’s a zombie VM. The standard deviation of resource consumption is a good indicator: a VM whose usage barely moves after boot stands out from VMs doing real work. If yours is that kind of outlier, then it’s time to see rule #2.
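To make the "spike at boot, then flat line" pattern concrete, here is a minimal Python sketch. It assumes you already have a list of CPU utilization percentages per VM from whatever monitoring you use. The function name, the size of the boot window and both thresholds are illustrative assumptions, not values from the white paper or from any particular tool.

```python
from statistics import mean, pstdev


def looks_like_zombie(cpu_samples, boot_samples=3, idle_mean=2.0, idle_stdev=1.0):
    """Flag a VM whose CPU usage is both low and flat after the boot spike.

    cpu_samples  -- CPU utilization percentages, oldest first (assumed input)
    boot_samples -- how many leading samples to ignore as the boot spike
    idle_mean    -- hypothetical ceiling for average post-boot usage (%)
    idle_stdev   -- hypothetical ceiling for post-boot standard deviation
    """
    steady = cpu_samples[boot_samples:]
    if len(steady) < 2:
        # Too little data after boot to judge either way.
        return False
    return mean(steady) < idle_mean and pstdev(steady) < idle_stdev


# A VM that spiked at boot and has been flat ever since.
suspect = [85.0, 60.0, 30.0] + [0.5] * 20
# A VM whose usage keeps moving around.
busy = [85.0, 60.0, 30.0, 5.0, 40.0, 10.0, 70.0, 3.0, 55.0]

print(looks_like_zombie(suspect))
print(looks_like_zombie(busy))
```

A flag from a check like this is only a reason to investigate, never a reason to delete. The same idea extends to memory, storage and network samples.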
“When in doubt,” says Columbus, “don’t get stingy with your bullets.” I don’t think he enjoyed plugging zombies twice each or running over them an extra time in the car, but the alternative was much worse.
The SysAdmin's double-tap is not as gruesome, but it is as effective. When you find a potential zombie VM but cannot identify the owner, power it down from the console to see whether anybody complains. As a first tap, that’s a lot more compassionate than simply removing the VM.
Set yourself a reminder for 90 days later. If nobody has complained, then it’s time for the second tap. You’ll rid your environment of a zombie VM and free up resources for useful work.
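If you would rather not rely on a calendar reminder, the double-tap can be tracked as a tiny state check. The sketch below is one possible way to do it in Python. The `Suspect` record, its field names and the action labels are all hypothetical; it only decides what to do next and does not call any hypervisor API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# The 90-day wait between the first and second tap, as described above.
GRACE_PERIOD = timedelta(days=90)


@dataclass
class Suspect:
    """A suspected zombie VM (hypothetical record, for illustration only)."""
    name: str
    powered_off_on: Optional[date] = None  # date of the first tap, if any
    complaint: bool = False                # did anyone ask for it back?


def next_action(vm: Suspect, today: date) -> str:
    """Return the next step of the double-tap for this VM."""
    if vm.powered_off_on is None:
        return "power_off"   # first tap: shut it down and wait
    if vm.complaint:
        return "power_on"    # someone needs it, so it was not a zombie
    if today - vm.powered_off_on >= GRACE_PERIOD:
        return "remove"      # second tap: nobody missed it
    return "wait"


vm = Suspect("build-test-07", powered_off_on=date(2024, 1, 1))
print(next_action(vm, date(2024, 3, 30)))
print(next_action(vm, date(2024, 3, 31)))
```

Keeping the decision separate from the action means you can review the list of "remove" candidates before anything is actually deleted.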
When dealing with zombies, you don’t want the burden of heavy luggage. That’s why a small roller carry-on sufficed for Columbus, along with a handy, double-barreled shotgun. His road buddy Tallahassee (Woody Harrelson) traveled even lighter.
Few administrators need a cumbersome set of tools to find potential zombie VMs in their environment, and even fewer need a shotgun. Unlike real zombies – say, the ones in the photo enjoying the new Quest Community Blog on their cell phone – zombie VMs blend in more smoothly. You’re looking for a few telltale characteristics of how your VMs are consuming resources like CPU, memory, storage and network, so you should look for a tool that can tell a zombie VM from one that is just infrequently used.
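As a rough illustration of what telling a zombie from an infrequently used VM could look like, here is a short Python sketch. It assumes a long history of CPU samples per VM and treats any post-boot burst of activity as a sign of life. The function name, the labels and the thresholds are assumptions made for this example, not the method of any specific product.

```python
def classify_vm(cpu_samples, boot_samples=3, active_threshold=10.0, rare_fraction=0.1):
    """Label a VM by how often it shows real activity after boot.

    cpu_samples      -- CPU utilization percentages, oldest first (assumed input)
    boot_samples     -- leading samples ignored as the boot spike
    active_threshold -- hypothetical CPU % that counts as "doing something"
    rare_fraction    -- below this share of active samples, usage is "infrequent"
    """
    steady = cpu_samples[boot_samples:]
    if not steady:
        return "unknown"
    active = sum(1 for sample in steady if sample >= active_threshold)
    if active == 0:
        return "zombie_suspect"      # no sign of life since boot
    if active / len(steady) < rare_fraction:
        return "infrequently_used"   # rare bursts: someone still uses it
    return "active"


boot = [90.0, 50.0, 20.0]
print(classify_vm(boot + [1.0] * 30))
print(classify_vm(boot + [1.0] * 29 + [60.0]))
print(classify_vm(boot + [50.0] * 10))
```

The longer the history you feed in, the less likely a quarterly reporting server is to be mistaken for the undead.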
Even as you relentlessly stamp out zombie VMs wherever they may lurk in your data center, you should maintain a happy and sane state of mind by keeping positive. After all: