Capacity Bottlenecks Up Ahead: Two Early Warning VM Resource Areas

Last week, I mentioned some VMware performance metrics that should always be zero. If these metrics do have a value, performance problems already exist in the virtual environment. While it helps to know where to look for existing problems, it's even more effective to prevent the issues that will ruin a system administrator's day. Described below are two VM resource areas, and the metrics that report on them, that can serve as early warning signals for performance issues:

Disk Latency - Latency is the amount of time it takes for a disk to complete a read/write request. This measure includes the time it takes for a request to travel from the VM to the disk, for the disk to process the request, and for the response to travel back from the disk to the VM. With many moving parts in this process, a bottleneck can occur either in the I/O path connecting the VM to the disk or in the disk itself. Latency can be measured with the DAVG/cmd metric (average device latency per command), which is available in esxtop. If latency for a disk is higher than 15 ms, a problem is probably starting to brew somewhere, and it should be investigated immediately.
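As a rough illustration of the 15 ms rule, the sketch below averages a handful of DAVG readings per disk and reports the disks that exceed the threshold. It assumes the readings have already been collected (for example, exported from esxtop) into a plain dictionary; the function name, the device names, and the sample values are all hypothetical and only there to show the check itself.

```python
from statistics import mean

# Threshold from the text above: more than 15 ms is worth investigating.
DAVG_THRESHOLD_MS = 15.0


def flag_high_latency(samples_by_device, threshold_ms=DAVG_THRESHOLD_MS):
    """Return {device: mean latency in ms} for devices above the threshold.

    samples_by_device maps a device name to a list of DAVG readings in
    milliseconds. Devices with no readings are skipped.
    """
    flagged = {}
    for device, samples in samples_by_device.items():
        if not samples:
            continue
        average_ms = mean(samples)
        if average_ms > threshold_ms:
            flagged[device] = average_ms
    return flagged


# Hypothetical readings; real values would come from your own collection.
readings = {
    "example-disk-a": [4.0, 6.0, 5.0],
    "example-disk-b": [12.0, 18.0, 24.0],
}

for device, average_ms in flag_high_latency(readings).items():
    print(f"{device}: average DAVG {average_ms:.1f} ms, investigate")
```

Averaging several readings rather than reacting to a single one is a design choice made here to avoid flagging a momentary spike; a stricter check could flag any single reading above the threshold instead.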

Memory Ballooning - Ballooning occurs when a host takes inactive memory from one VM and gives it to another VM that is experiencing increased activity. The data that was in that memory is then swapped to disk. Further swapping can occur if the VM that had its memory taken away begins to see increased activity itself. Memory swapping can lead to massive performance bottlenecks, not just in the memory resource but also in the disk and disk I/O areas. The ballooning metric, which can be found in vCenter (mem.vmmemctl.average in the vCenter API), shows whether any VMs are experiencing ballooning: any VM with a value greater than zero is affected. If ballooning is caught early enough, changes can be made to prevent the bottlenecks that result from the memory swapping it causes.
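The "greater than zero" check can be sketched in a few lines. The example below assumes the mem.vmmemctl.average samples for each VM have already been pulled from vCenter into a dictionary, and that they are reported in kilobytes; the retrieval step is not shown, and the function name, VM names, and values are hypothetical.

```python
def vms_with_ballooning(balloon_kb_by_vm):
    """Return (vm, peak_kb) pairs for VMs with any balloon sample above zero.

    balloon_kb_by_vm maps a VM name to a list of mem.vmmemctl.average
    samples (assumed to be in KB). Results are sorted with the largest
    peak first so the most affected VMs appear at the top.
    """
    peaks = {
        vm: max(samples)
        for vm, samples in balloon_kb_by_vm.items()
        if samples  # skip VMs with no samples
    }
    flagged = [(vm, peak) for vm, peak in peaks.items() if peak > 0]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical samples; real values would come from your vCenter data.
samples = {
    "vm-a": [0, 0, 0],
    "vm-b": [0, 2048, 4096],
    "vm-c": [512, 0],
}

for vm, peak_kb in vms_with_ballooning(samples):
    print(f"{vm}: balloon peaked at {peak_kb} KB")
```

Using the peak rather than the latest sample means a VM that ballooned briefly during the window still shows up, which fits the early-warning purpose described above.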