**Update** January 2013: You’ve probably found your way here by doing a search for ‘High CPU Ready in VMware’ or something to that effect. Welcome, I’m glad you’re here. Give this article a read as it still has relevant content, then check out this newer post on CPU Ready, with some handy quick reference charts and more detailed info on the condition: http://vmtoday.com/2013/01/cpu-ready-revisted-quick-reference-charts/. Thanks for reading! –Josh
I ran into an issue with a customer today where a VM was performing terribly. From within the guest OS (a Windows 2003 application server running .NET in IIS which I will call BigBadServer) things appeared sluggish and CPU time was high. The amount of time being spent on the kernel was notably high. The VM in question had 4 vCPU’s and a good helping of memory.
I don’t have access to the VMware client at this particular site (just some of the guests), so I was flying blind. Gut feeling told me that I was dealing with a resource contention issue. The VMstats provider I had running in the guest (http://vpivot.com/2009/09/17/using-perfmon-for-accurate-esx-performance-counters/) showed me that there was no ballooning or swapping going on, that the vCPU’s were not limited, and that the CPU share value seemed to be at the default.
I strongly suspected that the physical server running VMware ESX was oversubscribed on physical CPU (pCPU) resources. Essentially, the guest VM’s that are sharing the resources of the physical machine are demanding more resources than the machine can handle. To verify this theory, I had the client check the ‘CPU Ready’ metric on BigBadServer and bingo!
CPU Ready is a measure of the amount of time that the guest VM is ready to run against the pCPU, but the VMware CPU Scheduler cannot find time to run the VM because other VM’s are competing for the same resources.
From the stats the customer provided on our phone call, the CPU Ready for any one of the 4 vCPU’s on the BigBadServer was on average 3723ms (min: 1269ms, max: 8491ms). (Update 8/25/2010 to clarify summation stat) The summation for the entire VM was around 12,000ms on average and peaked around 35,000ms. The stats came from the real-time performance graph/table in the vSphere client. The real-time stats in the vSphere Client update every 20 seconds, so the CPU Ready summation value should be divided by 20,000 (the number of milliseconds in the sample) and multiplied by 100 to get a percentage of CPU Ready for the 20 second time slice. If I take the worst case scenario of 8491ms per vCPU, this VM spent about 42.5% (8491/20,000) of the 20 second time slice waiting for CPU resources.
Update: 10/10/2012: VMware published a KB article earlier this year that details the conversion of CPU ready from a summation to percentage. Find the article here: http://kb.vmware.com/kb/2002181.
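As a rough sketch of that conversion (my own illustration, not code from the KB article), the arithmetic looks like this in Python. The function name `cpu_ready_percent` and the 20-second default are assumptions that match the real-time chart described above; other chart views use longer sample intervals, so check the interval before reusing the default.

```python
def cpu_ready_percent(summation_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU Ready summation (milliseconds) into a percentage
    of the sample interval it was collected over.

    interval_s defaults to 20 seconds, the real-time chart interval
    mentioned in this post (an assumption for any other chart view).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return summation_ms / (interval_s * 1000.0) * 100.0


# Per-vCPU figures quoted above for BigBadServer (real-time chart).
for label, ms in [("min", 1269), ("avg", 3723), ("max", 8491)]:
    print(f"{label}: {ms} ms -> {cpu_ready_percent(ms):.1f}% CPU Ready")
```

Passing a different `interval_s` is all that is needed for a chart with a longer sample period; the millisecond value on its own says nothing until you know the interval it was summed over.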
The CPU Ready summation counter (in milliseconds) in the vCenter Client is not always the most accurate or easy to interpret stat. To better quantify the problem it might be best to go to the ESX command line and run ESXTOP, which reports CPU Ready directly as a percentage (%RDY). As a rule of thumb, CPU Ready over 5% could be a sign of trouble, and over 10% there is a problem. Running ESXTOP in batch mode and then analyzing the output using Windows Perfmon or Excel might be a good way to go on this to get a view over several hours rather than the realtime stats we were looking at. I wrote a post a while back with more info on ESXTOP batch mode: http://vmtoday.com/2009/09/esxtop-batch-mode-windows-perfmon/
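If Perfmon or Excel is not handy, a short script can apply the same 5%/10% rule of thumb to a batch-mode capture. The sketch below is hedged on several assumptions: that the capture was taken with something like `esxtop -b -d 10 -n 360 > esxtop.csv`, that the Ready columns are named in the Perfmon style `\\host\Group Cpu(<id>:<vm name>)\% Ready`, and that the group value is summed across a VM's vCPU's (so it is divided by a vCPU count you supply). The host name, VM names, and numbers in `SAMPLE` are made up for illustration; check the header of your own file before trusting the pattern.

```python
import csv
import io
import re
from collections import defaultdict

WARN_PCT = 5.0   # "could be a sign of trouble"
CRIT_PCT = 10.0  # "there is a problem"

# Assumed column shape: \\host\Group Cpu(<id>:<vm name>)\% Ready
READY_COL = re.compile(r"\\Group Cpu\((\d+):(.*)\)\\% Ready$")


def classify(ready_pct):
    """Map a per-vCPU CPU Ready percentage onto the rule of thumb."""
    if ready_pct > CRIT_PCT:
        return "problem"
    if ready_pct > WARN_PCT:
        return "warning"
    return "ok"


def summarize_ready(csv_file, vcpus=None):
    """Return {vm: (avg_pct, max_pct, status)} from an esxtop batch CSV.

    vcpus maps VM name -> vCPU count; VMs not listed are treated as 1 vCPU.
    """
    vcpus = vcpus or {}
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = {}
    for index, name in enumerate(header):
        match = READY_COL.search(name)
        if match:
            columns[index] = match.group(2)
    samples = defaultdict(list)
    for row in reader:
        for index, vm in columns.items():
            if index < len(row) and row[index] != "":
                samples[vm].append(float(row[index]) / vcpus.get(vm, 1))
    summary = {}
    for vm, values in samples.items():
        average = sum(values) / len(values)
        summary[vm] = (average, max(values), classify(average))
    return summary


# Made-up two-sample capture, only to show the expected file shape.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Group Cpu(1234:BigBadServer)\% Ready","\\esx01\Group Cpu(99:SmallVM)\% Ready"',
    '"08/25/2010 10:00:00","160.0","2.0"',
    '"08/25/2010 10:00:10","40.0","4.0"',
])

report = summarize_ready(io.StringIO(SAMPLE), vcpus={"BigBadServer": 4})
for vm, (average, peak, status) in sorted(report.items()):
    print(f"{vm}: avg {average:.1f}%, max {peak:.1f}% per vCPU -> {status}")
```

For a real capture, replace the `io.StringIO(SAMPLE)` argument with a file opened via `open("esxtop.csv", newline="")`. Averaging over the whole capture hides short spikes, which is why the maximum is reported alongside it.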
To help quantify the problem a bit more, the BigBadServer is on an ESX 4.0 server with about 10 other VM’s. The physical blade has two dual-core CPU’s (AMD Opteron 2218HE’s, which are not hyperthreaded), for four physical cores in total. The other VM’s on the blade have different vCPU and vMemory configurations. 3 VM’s (including BigBadServer) have 4 vCPU’s. A couple have 2 vCPU’s, and the remainder are configured with 1 vCPU. In ESX 4.x, the VMware console OS actually runs as a hidden VM, pegged to pCPU #1.
I generally recommend a pCPU:vCPU ratio of 1:4 for mid-sized VMware deployments of single vCPU VM’s. The blade we are running on is at roughly 1:5 with several multi-vCPU VM’s. The multi-vCPU VM’s start to skew the ratio recommendation and require some advanced design decisions. VMware’s scheduler requires that all the vCPU’s on a VM run concurrently (even if the Guest OS is trying to execute a single thread). Also, the VMware CPU Scheduler prefers to have all the vCPU’s from a VM run on the same pCPU. As workloads are bounced around between pCPU’s, the benefits of CPU cache are lost. This is one of those ‘more-is-less’ situations that you run into on virtualized environments.
What this CPU Scheduler nonsense means in this case is that the 4 vCPU’s on BigBadServer have to wait until all logical pCPU’s on the box are idle (including the one that runs ESX itself) before the VM can run. If ESX can’t accomplish that (we are experiencing resource contention) it starts prioritizing workloads according to what it can best run. It is much easier to schedule the smaller VM’s, so it tends to run those on pCPU more frequently. The larger VM’s tend to suffer a bit more than the smaller ones. We are competing with 2 other VM’s with 4 vCPU’s that use up all of the logical pCPU’s when they need to run, as well as with the smaller VM’s.
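To put numbers on the oversubscription described above, here is a small Python sketch of the ratio arithmetic. The per-VM counts are my reading of "about 10 other" VM's (3 VM's with 4 vCPU's, 2 with 2 vCPU's, 6 with 1 vCPU), so treat them as assumptions rather than an exact inventory; the function name `consolidation_summary` is also just illustrative.

```python
def consolidation_summary(pcpu_cores, vm_vcpus):
    """Return (total vCPUs, vCPUs per pCPU core, VMs as wide as the host)."""
    if pcpu_cores <= 0:
        raise ValueError("pcpu_cores must be positive")
    total = sum(vm_vcpus)
    # VMs whose vCPU count matches or exceeds the core count need every
    # core on the host free to run all of their vCPUs at the same moment.
    full_width = sum(1 for count in vm_vcpus if count >= pcpu_cores)
    return total, total / pcpu_cores, full_width


# Two dual-core Opterons, no hyperthreading: 4 physical cores.
PCPU_CORES = 2 * 2

# Assumed inventory (approximate): 3 x 4 vCPU, 2 x 2 vCPU, 6 x 1 vCPU.
VM_VCPUS = [4, 4, 4] + [2, 2] + [1] * 6

total, per_core, full_width = consolidation_summary(PCPU_CORES, VM_VCPUS)
print(f"{total} vCPU's on {PCPU_CORES} cores = 1:{per_core:.1f} (pCPU:vCPU)")
print(f"{full_width} VM's need every core on the blade to run all vCPU's at once")
```

With these assumed counts the sketch reports 22 vCPU's on 4 cores, or 1:5.5, in line with the roughly 1:5 quoted above. The second number is the more telling one: three VM's each want the whole blade to themselves whenever they run.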
I suggested a few ways to fix this issue for the BigBadServer web server:
- Using Shares and/or Reservations on the VM. This probably won’t work in our situation as the physical server is too over-subscribed. We might see a slight improvement in BigBadServer (or we might not see any change), but possibly at the extreme expense of the other VM’s sharing the blade.
- Reduce the number of vCPU’s on BigBadServer AND the other multi-vCPU VM’s on the same physical server. This would reduce resource contention and open up a whole bunch of scheduling options for the VMware CPU Scheduler. This is the quickest/cheapest fix, but will not work if the VM’s really do need 4 vCPU’s. A little workload analysis should determine which can be made smaller (the vCenter server graphs/stats should be enough for this). For what it’s worth, by our analysis BigBadServer seems to be happier with 4 vCPU’s, assuming we can run with a low CPU Ready on those 4.
- Move the BigBadServer VM to a physical ESX server with fewer multi-vCPU VM’s so there is less contention.
- Move the BigBadServer VM to a physical ESX server with quad-core pCPU’s (ideally two quad-cores or bigger). This would give a lot more flexibility to the VMware CPU Scheduler and allow it to run quad-vCPU VM’s on the same pCPU for greater efficiency.
- Split BigBadServer into 2 smaller VM’s. The server currently runs a couple of sites. We could split them onto two servers, one for Project1 and one for Project2. This configuration would take some design, testing, and time but could scale out better and give more flexibility and availability in the long run.
I’m not sure which way the customer will go on this one yet, but I feel good having armed them with enough knowledge and options to make an informed decision.
To avoid problems like this in the future, I recommend these rules of thumb:
- Design your hosts for your guests. Taking your Guest VM sizes into account when designing your environment and choosing physical hardware is crucial if you need bigger VM’s.
- Don’t make your VM’s bigger than you have to. It is always easier to add resources than take them away. Hot Add of CPU and Memory in vSphere makes adding incredibly easy.
- Monitor your environment for CPU Ready, Swapping, and other metrics that can indicate an inefficient design.
- Call for help when you can’t figure out what is going on (I’m happy to help!). VMware is super powerful, but some things can be downright backwards when it comes to resource allocation on a fixed set of hardware.
If you are looking for some resources to help explain CPU Scheduling a bit more, I recommend:
- VMware’s Official documentation of CPU Scheduler in vSphere 4.1 – http://www.vmware.com/files/pdf/techpaper/VMW_vSphere41_cpu_schedule_ESX.pdf.
- A nice summary of co-scheduling from VMware’s Performance Blog: http://blogs.vmware.com/performance/2008/06/esx-scheduler-s.html
- Description and stats on Ready Time metrics for VI3: http://www.vmware.com/pdf/esx3_ready_time.pdf
- Understanding Virtual Center Performance Statistics: http://communities.vmware.com/docs/DOC-5230.pdf
(Updated 8/25/2010 to include a few additional reference links and corrected summation divided by time slice to get accurate values)