App Manageability blog

Performance characteristics of virtualization-led server consolidation

Published 24 August 2007, 06:10 PM

Virtualization is red hot these days -- everyone, and his mother, is not only talking about it but also investing in it.

But what happens to the performance characteristics, such as throughput and response time, of applications that get shoved onto a single physical server? This paper authored by VMware (and questioned by some) comparing VMware and Xen virtualization performance claims that the virtualization overhead of VMware is actually less than 5% for a variety of tasks. The results are based on observing the runtime performance of a number of specialized benchmarks. But we all know how well such results, published by the vendor itself, correlate with real-life observations!

So I was quite excited to find two papers published by IBM on the performance characteristics of WebSphere Application Servers (WAS instances) running under VMware virtualization. One of these papers is actually available from VMware's site.

The first one, Using VMware ESX Server with IBM WebSphere Application Server, published on July 24, 2006, compares response time and throughput of a simple webapp running within a WAS instance deployed on physical machines and virtual machines under different configurations. The second one, Performance Characteristics of Virtualized Systems with the VMware ESX Server and Sizing Methodology for Consolidating Servers, extends the analysis and comes up with some useful guidelines on consolidated server sizing.

Although you must read the papers to really understand what was tested and under what conditions, I venture to state some of my conclusions/observations anyway:

  • As per the first paper, the performance overhead of running four Linux VMs, each running a WAS instance, compared to four instances of WAS running natively under a single copy of Linux on a 4-CPU system is not very significant (throughput degrades from 224 pages/sec to 201 pages/sec under virtualization, and the response time actually improves from 80 ms to 78 ms). However, increasing the number of VMs to 8 on the same 4-CPU machine degrades the response time significantly (from 78 ms to 164 ms).
  • The second paper reports only throughput (don't know why) on Windows VMs, and compares throughput when a single instance of WAS is running natively and under a VM on 1-CPU and 2-CPU machines. The throughput under a VM is 73% of the corresponding number on the native OS for a 1-CPU system and 69% for a 2-CPU system.

These two observations may appear to conflict at first, but they do not. Running multiple apps (WAS instances, in this case) under a single OS incurs significant overhead, very similar to what multiple VMs incur under a hypervisor, so in that comparison the hypervisor overhead itself may not be that significant. However, there is a definite hypervisor overhead for a single app running at full capacity within a single VM, as shown in the second paper.
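To put the quoted figures on a common scale, here is a minimal sketch that converts them into relative changes. The numbers are the ones quoted above from the two papers; the helper name `percent_change` is just an illustration of mine.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# First paper (4-CPU system, Linux): native WAS instances vs. 4 VMs.
throughput_change = percent_change(224, 201)   # pages/sec
rt_change_4_vms = percent_change(80, 78)       # response time, ms

# First paper: going from 4 VMs to 8 VMs on the same machine.
rt_change_8_vms = percent_change(78, 164)      # response time, ms

# Second paper (Windows): VM throughput as a percentage of native.
overhead_1_cpu = 100 - 73
overhead_2_cpu = 100 - 69

print(f"Throughput, native -> 4 VMs:    {throughput_change:+.1f}%")
print(f"Response time, native -> 4 VMs: {rt_change_4_vms:+.1f}%")
print(f"Response time, 4 VMs -> 8 VMs:  {rt_change_8_vms:+.1f}%")
print(f"Throughput overhead, 1 CPU:     {overhead_1_cpu}%")
print(f"Throughput overhead, 2 CPUs:    {overhead_2_cpu}%")
```

So the first paper's consolidation case costs roughly a tenth of the throughput, while the second paper's single-app case costs closer to a third.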

These observations actually strengthen the argument for virtualization in server consolidation situations when apps don't fully utilize the infrastructure resources. At the same time, one must be careful about going the virtualization route for apps that need as much processing power as possible! Even then, the flexibility offered by virtualization may be worth the cost.

One thing I found missing from both papers was a discussion of memory utilization under the different scenarios. I would expect the scenarios using VMs to use more memory than the ones with a native OS, since the hypervisor and the separate copies of the OS will consume extra RAM. Also, WAS instances running in different VMs will have no opportunity to share code with other instances, as would be possible within a single OS image. This can't be good for memory-hungry applications. Exactly how much memory goes towards these is hard to estimate, as hypervisors typically perform sophisticated memory management and might map pages with identical contents read by different VMs to the same RAM locations.
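To make the page-sharing idea concrete, here is a toy model of content-based page sharing. It is only a sketch under my own assumptions: the 4 KB page size, the function names, and the two made-up "VM images" are all hypothetical, and a real hypervisor would also have to verify matches byte for byte and handle writes to shared pages, which this ignores.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; an assumed page size for this toy model

def split_pages(memory):
    """Split a memory image into fixed-size pages."""
    return [memory[i:i + PAGE_SIZE] for i in range(0, len(memory), PAGE_SIZE)]

def page_counts(vm_images):
    """Return (pages without sharing, pages with sharing) across all VMs."""
    total = 0
    unique = set()
    for image in vm_images:
        for page in split_pages(image):
            total += 1
            # Pages with identical bytes hash to the same digest and are
            # counted once, as if mapped to a single RAM location.
            unique.add(hashlib.sha256(page).digest())
    return total, len(unique)

def make_pages(values):
    """Build a memory image with one page per byte value (made-up data)."""
    return b"".join(bytes([v]) * PAGE_SIZE for v in values)

# Two hypothetical VMs: 8 identical "OS" pages each, plus 2 distinct "app" pages.
os_pages = make_pages(range(8))
vm_a = os_pages + make_pages([100, 101])
vm_b = os_pages + make_pages([200, 201])

total, shared = page_counts([vm_a, vm_b])
print(f"Pages without sharing: {total}, with sharing: {shared}")
```

In this contrived case the identical OS pages are stored once, so the saving depends entirely on how much of each VM's memory is really byte-for-byte identical, which is exactly the number the papers don't give.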

I would also have preferred information on response time comparison with and without virtualization in the second paper. The response time is relevant even when the system is not fully loaded and a significant degradation due to virtualization would not be good news.

The roughly 30% virtualization overhead reported in the second paper for 1-CPU and 2-CPU systems may not be of much consequence to most enterprise data centers, which prefer to run multi-CPU servers and where average utilization is on the order of 10-15%. However, it is significant if you are an ASP and plan to run farms of inexpensive 2-CPU machines at near full capacity. It is also significant if you plan to rent computing power from a service like EC2, which uses some form of virtualization, and are making cost comparisons with native systems. I actually did this once for one of my hobby projects!
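For the near-full-capacity case, a back-of-the-envelope sizing sketch shows why the overhead matters. The 73% and 69% efficiency figures come from the second paper as quoted earlier; the workload of 1000 pages/sec and the 100 pages/sec per native machine are made-up numbers purely for illustration, as is the function name.

```python
import math

def machines_needed(required_throughput, native_per_machine, efficiency=1.0):
    """Machines needed to serve `required_throughput`, assuming each machine
    delivers `native_per_machine * efficiency` and load spreads evenly."""
    per_machine = native_per_machine * efficiency
    return math.ceil(required_throughput / per_machine)

REQUIRED = 1000      # pages/sec, hypothetical workload
NATIVE_RATE = 100    # pages/sec per native machine, hypothetical

native = machines_needed(REQUIRED, NATIVE_RATE)
virtual_1_cpu = machines_needed(REQUIRED, NATIVE_RATE, efficiency=0.73)
virtual_2_cpu = machines_needed(REQUIRED, NATIVE_RATE, efficiency=0.69)

print(f"Native machines:                {native}")
print(f"Virtualized, 1-CPU (73% eff.):  {virtual_1_cpu}")
print(f"Virtualized, 2-CPU (69% eff.):  {virtual_2_cpu}")
```

Multiply the machine counts by your own per-machine prices to get the cost comparison. At 10-15% utilization the same overhead simply eats into idle headroom, which is why it matters much less there.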

The key lesson, at least for me, is that you have to be careful about your deployment to minimize the overhead of virtualization. The best approach, of course, is to do performance testing under virtualization and identify the bottlenecks. More on this later!

Posted by Pankaj Kumar

