Capacity management in your virtual environment is a balancing act.
Take the two athletes in the photo, for example. The one on the left can represent the unlimited desires of the business – the workload on the virtual machine – and the one on the right can represent the limited resources you have – the available capacity on your ESX host. And you’re the one who has to make sure things stay in balance.
Virtualization was supposed to remove the limits on your resources, enabling you to meet business requirements more readily. But because business desires have grown at the same pace as the technology, you now just have different limits. Should you under-allocate to live within your budget or over-allocate to keep business managers off your back?
We’ve released a new paper called VM Optimization: Find the Balance. It contains insights on the two most common measures for balancing user demands and IT resources in the virtual environment: right-sizing and waste-finding. I’ll explore those two measures in this post.
Right-sizing in the virtual environment
Business managers may know that they need a VM, but they may not know what size they need. Right-sizing means optimizing the amount of CPU, memory and disk space allocated. It requires that you create the VM, track its usage over time, and then adjust the allocation to match.
It takes time to find the balance between utilization and allocation, and it’s a computing paradigm unique to virtual machines and storage arrays. You don’t right-size an exec’s laptop by giving him less memory or disk space because he’s not using very much. Yet one of the advantages of virtualization is that you can do just that, in a way: you can allocate more or fewer resources to a VM based on usage over time.
Mind you, exceptions abound, so you can’t right-size every VM. Suppose Finance requests a VM with 4 CPUs, 12GB of memory and 800GB of disk space. You create it for them, then track utilization for 10 weeks.
“You’re using only a fraction of those resources,” you tell the controller. “We need to free them up for other groups.”
“Just wait until next week when the quarter ends,” she replies.
Sure enough, huge reports pour in from all over the company and utilization of CPU, memory and disk space spikes for quarter close.
“Told you,” says the controller when she sees you in the hall the following week.
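The Finance story can be put into a few lines of Python. This is only a minimal sketch: the function name `suggest_allocation`, the 20% headroom and the weekly usage figures are all made up for illustration (the story above gives only the 12GB allocation and the 10-week tracking window). It sizes against the peak observed usage rather than the average, which is why the length of the tracking window matters so much.

```python
import math


def suggest_allocation(usage_samples, headroom=0.2):
    """Suggest an allocation: peak observed usage plus a safety margin.

    usage_samples: observed usage in the same unit as the allocation
                   (for example, GB of memory), one value per period.
    headroom:      hypothetical safety margin; 0.2 means 20% above peak.
    """
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    peak = max(usage_samples)
    # Round up to a whole unit so the suggestion never undercuts the peak.
    return math.ceil(peak * (1 + headroom))


# Hypothetical weekly peak memory usage (GB) for the 12GB Finance VM.
ten_quiet_weeks = [2.5, 3.0, 2.8, 2.6, 2.9, 3.0, 2.7, 2.5, 2.8, 2.9]
quarter_close_week = [11.0]

print(suggest_allocation(ten_quiet_weeks))                       # 4
print(suggest_allocation(ten_quiet_weeks + quarter_close_week))  # 14
```

With only the ten quiet weeks, the sketch suggests cutting the VM to 4GB. Add the quarter-close week and it suggests more than the original 12GB. The controller was right: the tracking window has to cover the business cycle before you act on it.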
You may not be able to accomplish all of your optimization goals through right-sizing, so reducing waste is another approach to finding the balance in VM optimization.
Waste-finding in the virtual environment
Excess disk space and virtual machines that are not doing any work represent waste. The VMs may be up and running, but if nobody is using them for useful work, then SysAdmins are justified in enforcing policies for freeing up the resources. Here are some examples:
- Abandoned VM images – You can tell from the time stamp on the .vmdk image file how recently the VM has been touched. It’s up to you to establish a policy for finding and deleting abandoned images after a certain period of inactivity. If you cannot identify or contact the owner, then archive the image elsewhere and see whether it is missed. If not, then it’s time to free up the resources by deleting the image.
- Powered-off VMs – “It’s not abandoned. We just haven’t booted it up lately.” You may hear that from owners who are not ready to say goodbye to the VM, usually because they haven’t thought about it for a long time. They tell you it’s not abandoned, and you tell them that it’s not OK to take up resources needlessly forever. If you identify a virtual machine that has been powered off for three or four months, it’s a safe bet nobody’s using it.
- Unused template images – These are like your old financial records: you don’t like keeping them around, but it would take so much work to recreate them that you’d rather not risk discarding them. It’s the same reason you never clean out your email inbox: you *might* need to refer back to an email from three years ago, even though you haven’t needed to for the past two and a half (I swear I’m not a hoarder… I just like to be prepared!). Still, don’t you feel silly holding on to a Windows XP template image with Office 2003 and Internet Explorer 6? Set a policy for how long you’ll support the OS and applications in the template and apply it to everyone. Even yourself.
- Snapshots – Similarly, people hang on to snapshots because of all the work that would go into recreating them. A snapshot could represent months or years of accumulated updates, patches and fixes, but as with any backup, there’s a practical limit to its useful age.
- Zombie VMs – These boot up and run but perform no useful work because everybody has forgotten that they’re there. See my previous post about the VM Zombieland that accumulates over time in your data center.
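A policy like the ones above is easier to enforce when it’s written down as a rule. Here is a minimal sketch in Python of the powered-off and abandoned-image cases. Everything in it is hypothetical: the `VmRecord` fields, the action names and the 90-day limit (roughly the “three or four months” mentioned above). It assumes you have already collected each VM’s power state, owner and last-activity date from your own inventory, and it does not try to detect zombies, which needs usage data rather than dates.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class VmRecord:
    name: str
    powered_on: bool
    last_activity: date          # e.g. taken from the .vmdk time stamp
    owner: Optional[str] = None  # None if the owner cannot be identified


def waste_action(vm: VmRecord, today: date,
                 idle_limit: timedelta = timedelta(days=90)) -> str:
    """Return a suggested action for one VM under a simple idle-time policy."""
    if vm.powered_on:
        return "keep"            # running VMs are out of scope for this sketch
    if today - vm.last_activity <= idle_limit:
        return "keep"            # powered off, but touched recently enough
    if vm.owner is None:
        return "archive"         # nobody to ask: archive and see if it is missed
    return "contact-owner"       # ask the owner before freeing the resources


today = date(2024, 6, 1)
inventory = [
    VmRecord("build-01", powered_on=False, last_activity=date(2024, 5, 1), owner="dev"),
    VmRecord("fin-test", powered_on=False, last_activity=date(2024, 1, 1), owner="finance"),
    VmRecord("unknown-7", powered_on=False, last_activity=date(2024, 1, 1)),
]
for vm in inventory:
    print(vm.name, waste_action(vm, today))
```

The sketch stops at a suggestion on purpose. Deleting an image is the last step of the policy, after the owner has been asked or the archive has gone unmissed.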
VM Optimization: Find the Balance
You’ve chosen a virtualization strategy because you wanted a better way to keep your users happy and productive with their computing resources. With that comes the ongoing need to find the balance between allocation and utilization.
In our paper, VM Optimization: Find the Balance, gymnastically inclined SysAdmins will find perspectives on areas of capacity optimization including VM density, disk space, zombie VMs, service levels, and resource utilization.