Before you Migrate that VM to the Cloud

When the day arrives that your IT leadership decides to incorporate modern cloud technologies into operations, some new realities will come into play.

First, you need to put cloud computing into its proper perspective, which makes the migration process easier to define and execute. Cloud infrastructure is quite simply a collection of software-defined computing resources managed and hosted by someone outside your organization. Whether you choose AWS, Azure, Google, Rackspace, SoftLayer, Virtustream or any number of public/private cloud services, you have to follow a recipe for migrating systems, services and applications off premises.

This post addresses some of the issues to consider when migrating workloads to Infrastructure as a Service (IaaS) cloud providers. In addition to the points discussed here, you will still need to address the following short list of action items:

  • Governance, risk, compliance (GRC) with regard to security
  • Replication/synchronization (if hybrid)
  • Storage tiers
  • Data protection
  • Disaster recovery, i.e. failover (no, it is not automatic in an IaaS cloud)
  • Database migration
  • and more

In choosing to deploy into any IaaS service you, the operator, will be responsible for any systems tuning needed to meet organizational SLA commitments. Consumption and performance baselining therefore needs to be conducted before migrating any workload to an IaaS cloud. Begin with metrics for the basic resource pillars of compute, disk, memory and networking for each workload. Because a given application or service may span multiple servers, the term workload here refers to metrics in aggregate across several hosts or VMs.

Understanding the baseline consumption metrics (min, max and average) of each individual host (server or VM) will give you ample data to gauge performance expectations when it runs in a cloud environment. Resist the temptation to use a provider's VM import tool, which is simply a lift and shift process. Without a clear picture of consumption and dependencies on a physical or virtual host, you invariably end up with sticker shock at the close of the billing cycle.
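
As a rough sketch of what such a baseline might look like, the Python below computes min, max and average for each host and for the workload as a whole. The host names and sample values are made up for illustration, and the sketch assumes every host was sampled at the same intervals.

```python
from statistics import mean


def baseline(samples):
    """Return min, max and average for one metric series."""
    return {"min": min(samples), "max": max(samples), "avg": mean(samples)}


def workload_series(hosts):
    """Sum per-interval samples across hosts to get the workload total.

    Assumes every host was sampled at the same intervals.
    """
    return [sum(values) for values in zip(*hosts.values())]


# Hypothetical CPU usage in MHz, one sample per interval.
cpu_mhz = {
    "web01": [400, 650, 900, 500],
    "web02": [300, 1100, 700, 450],
}

per_host = {name: baseline(samples) for name, samples in cpu_mhz.items()}
workload = baseline(workload_series(cpu_mhz))

print(per_host)
print(workload)
```

In this made-up data the two hosts peak at different intervals, so the workload peak (1750) is lower than the sum of the individual peaks (2000). That is one reason to baseline the workload in aggregate as well as host by host.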

Performing system and application tuning before migration will also identify which OS resources are an absolute requirement. Think of all those unnecessary processes/daemons currently running in the background. On a depreciating server asset the cost may be sunk, but in a cloud those excess compute cycles have a real dollar cost. Start with a performance monitoring tool or OS-level utility for one-off investigations of host performance. If you do opt for an OS-level tool, be sure it supports a logging or historical option so metrics can be collected over time. With OS-level tools, also be prepared to push data into Excel or Google Sheets for metric analysis and comparisons before and after cloud deployment.
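
If a spreadsheet feels heavy for the before and after comparison, a few lines of Python can do the same job. The sketch below is only an illustration: the CSV layout and the column names (timestamp, cpu_pct, mem_mb) are assumptions, so adjust them to whatever your logging tool actually exports.

```python
import csv
import io
from statistics import mean


def column_averages(csv_text):
    """Average every numeric column in a CSV export, skipping 'timestamp'."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    metrics = [name for name in rows[0] if name != "timestamp"]
    return {m: mean(float(row[m]) for row in rows) for m in metrics}


def percent_change(before, after):
    """Percent change per metric from the 'before' run to the 'after' run."""
    return {m: round((after[m] - before[m]) / before[m] * 100, 1) for m in before}


# Hypothetical exports; the column names and values are assumptions.
before_csv = """timestamp,cpu_pct,mem_mb
09:00,40,2048
09:05,60,2304
"""
after_csv = """timestamp,cpu_pct,mem_mb
09:00,30,1536
09:05,45,1728
"""

before = column_averages(before_csv)
after = column_averages(after_csv)
print(percent_change(before, after))
```

A negative percentage means the metric dropped after tuning or deployment. With real data you would read the exports from files rather than inline strings.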

Any current oversubscription of a given resource (CPU, disk, memory, network) by a workload in your on-premises environment will only be exacerbated when it runs on an IaaS cloud. Isolating and then correlating which processes are responsible for specific consumption metrics will be the key to optimizing a host/VM for cloud operations.

Remember, the more granularly you are able to examine performance baselines at the guest OS level, the more efficient the resulting machine image (AMI, VM etc.) will be. Pay particular attention to when spikes occur, since when taken in aggregate for all workloads they provide an indicator of any potential auto scaling requirements. This assumes the application can support auto scaling, e.g. IIS/Apache/Nginx web servers with support for DNS round robin style load balancing. Some applications can support auto scaling with some customization at the operations level, but others will require custom development efforts or an upgrade from the ISV.
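
One way to look for aggregate spikes is sketched below in Python. The threshold rule (mean plus 1.5 standard deviations) and the sample data are illustrative assumptions, not a recommendation; pick a rule that matches how your own workloads behave.

```python
from statistics import mean, pstdev


def aggregate(series_by_workload):
    """Sum samples across workloads, interval by interval."""
    return [sum(values) for values in zip(*series_by_workload.values())]


def spike_intervals(total, k=1.5):
    """Return the indexes where the total exceeds mean + k standard deviations.

    The threshold rule and the default k are illustrative assumptions.
    """
    threshold = mean(total) + k * pstdev(total)
    return [i for i, value in enumerate(total) if value > threshold]


# Hypothetical hourly CPU percentages for two workloads.
cpu = {
    "web": [20, 22, 21, 70, 23, 20],
    "api": [15, 14, 16, 60, 15, 14],
}

total = aggregate(cpu)
print(total)
print(spike_intervals(total))
```

Intervals that show up repeatedly at the same time of day or week are the ones worth examining for auto scaling.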

You will be tempted to simply employ a lift and shift tactic, but be careful when using a ‘VM import’ tool to move to a cloud provider. Importing production VM images (VMDK, VHD) that are not optimized for operation in a pure IaaS environment will result in higher usage costs for your end users. Optimizing physical and virtual workloads prior to migration is critical if you plan to achieve any resource consumption cost savings over time. Optimizing VMs is also required to ensure the performance SLAs committed to the business are met consistently. For instance, pay particular attention to workloads currently assigned multiple vCPUs and whether they are actually utilized. When provisioning a machine image or VM on an IaaS cloud, your cost will be directly proportional to the amount of resources allocated. Be prepared to justify reductions in vCPU, memory and disk allocations to an application's owner with actual usage metrics.
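
To make the vCPU conversation with an application owner concrete, here is a minimal Python sketch of a right-sizing calculation. The 20% headroom, the 730-hour month and the hourly rate are all hypothetical values chosen for illustration; real provider pricing is usually tiered by instance type rather than strictly per vCPU.

```python
import math


def recommended_vcpus(allocated, peak_util_pct, headroom_pct=20):
    """Suggest a vCPU count from observed peak utilization plus headroom.

    peak_util_pct is the peak utilization across all allocated vCPUs (0-100).
    The 20% headroom default is an illustrative assumption.
    """
    needed = allocated * peak_util_pct / 100 * (1 + headroom_pct / 100)
    # Never suggest fewer than 1 vCPU or more than is currently allocated.
    return max(1, min(allocated, math.ceil(needed)))


def monthly_cost(vcpus, rate_per_vcpu_hour, hours=730):
    """Cost that scales linearly with allocated vCPUs (hypothetical rate)."""
    return round(vcpus * rate_per_vcpu_hour * hours, 2)


allocated = 8
suggested = recommended_vcpus(allocated, peak_util_pct=30)
rate = 0.05  # hypothetical dollars per vCPU-hour, not a real price

print(suggested)
print(monthly_cost(allocated, rate), monthly_cost(suggested, rate))
```

With these made-up numbers, an 8 vCPU VM that peaks at 30% utilization would be sized down to 3 vCPUs, and the monthly figure drops accordingly.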

Since the storage layer accounts for the majority of performance bottlenecks in a shared resource environment, a detailed examination of disk latency should be conducted. For a Windows guest OS you can start with the perfmon tool, or OpsMgr if you own System Center. For VM disk statistics, try vscsiStats in VMware to gather data specific to virtual disk performance rather than VM to LUN. The VM’s virtual disk will more closely reflect the underlying storage architecture and associated latencies provided by the IaaS cloud. Start with the quick cheat sheet below:

  • VMware disk IO: try vscsiStats, PowerCLI
  • Linux: sar, iostat (both part of the sysstat package)
  • Windows: Resource Monitor, MMC, PowerShell or a utility from Sysinternals
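
Whichever tool you collect with, you will end up with a series of latency samples to summarize. The Python sketch below shows one possible summary using a nearest-rank percentile; the sample values and the 20 ms limit are assumptions for illustration, not a standard.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_summary(latencies_ms, limit_ms=20):
    """Summarize disk latency and count samples above a limit.

    The 20 ms default limit is an illustrative assumption.
    """
    return {
        "avg": sum(latencies_ms) / len(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "max": max(latencies_ms),
        "over_limit": sum(1 for value in latencies_ms if value > limit_ms),
    }


# Hypothetical per-interval read latencies in milliseconds.
reads_ms = [4, 5, 6, 5, 4, 7, 35, 6, 5, 3]
print(latency_summary(reads_ms))
```

Looking at the 95th percentile and the maximum alongside the average matters because a single slow interval barely moves the average but can still be what your users notice.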

For larger migrations involving several dozen or more machines, consider an enterprise-class tool such as Foglight Storage Management to perform metric collection and baseline analysis, including detailed VM to LUN disk dependencies and consumption for all host/VM based workloads.

Disk IO will be a particularly sensitive issue for database admins managing MySQL, SQL Server or Oracle instances on premises. Ingesting data from existing database servers calls for a decision on whether to migrate the VM/host running the DB instance or use a cloud native DB service such as Amazon SimpleDB, Amazon RDS for Oracle or SQL Server, or SQL Azure.

You may have noticed that we did not address migrating specific application server types such as Active Directory, Exchange/UC, SharePoint etc. Each of these workloads, and others, already has software as a service (SaaS) options available from Microsoft, Amazon and other cloud providers. There is also an extensive amount of documentation geared toward virtualizing and running tier 1 workloads in public and private cloud infrastructures. Carefully consider the cost and level of effort required to migrate user-built machine images versus using the SaaS option.

Before committing fully to a specific provider, consider running both optimized and non-optimized workloads on your top 3 selections for IaaS cloud vendors. Monitor your usage, performance and cost over a minimum of 30 days or over your peak period/season. These rudimentary steps will ensure that you have the correct data points not only to set and meet expectations but also to better negotiate and manage IaaS consumption over a specified period of performance. So here’s to you all coming in on time and under budget.
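
A trial like this produces a pile of daily cost figures per provider. As a hedged sketch of how you might compare them, the Python below totals the cost for each provider and image variant and reports the percentage saved by the optimized image. The provider names and dollar amounts are invented, and a real trial would have 30 or more entries per series.

```python
def trial_totals(daily_costs):
    """Total cost per (provider, variant) over the trial period."""
    return {key: round(sum(days), 2) for key, days in daily_costs.items()}


def savings_by_provider(totals):
    """Percent saved by the optimized image versus the unoptimized one."""
    result = {}
    for (provider, variant), cost in totals.items():
        if variant == "optimized":
            base = totals[(provider, "unoptimized")]
            result[provider] = round((base - cost) / base * 100, 1)
    return result


# Hypothetical daily dollar costs, shortened to three days per series.
daily = {
    ("provider_a", "unoptimized"): [10.0, 12.0, 11.0],
    ("provider_a", "optimized"): [6.0, 7.5, 6.3],
    ("provider_b", "unoptimized"): [9.0, 9.0, 10.0],
    ("provider_b", "optimized"): [7.0, 7.0, 7.0],
}

totals = trial_totals(daily)
print(totals)
print(savings_by_provider(totals))
```

Cost is only one of the three things to monitor, so you would put these figures next to the usage and performance data from the same period before choosing a provider.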

In a future post we will tackle the issues associated with managing the operations for private cloud infrastructure.
