**Update** January 2013: You’ve probably found your way here by doing a search for ‘High CPU Ready in VMware’ or something to that effect. Welcome, I’m glad you’re here. Give this article a read as it still has relevant content, then check out this newer post on CPU Ready, with some handy quick reference charts and more detailed info on the condition: https://vmtoday.com/2013/01/cpu-ready-revisted-quick-reference-charts/. Thanks for reading! –Josh
I ran into an issue with a customer today where a VM was performing terribly. From within the guest OS (a Windows 2003 application server running .NET in IIS, which I will call BigBadServer) things appeared sluggish and CPU utilization was high, with a notably large share of that time spent in the kernel. The VM in question had 4 vCPU’s and a good helping of memory.
I don’t have access to the VMware client at this particular site – just some of the guests – so I was flying blind. Gut feeling told me that I was dealing with a resource contention issue. I had the VMstats provider running in the guest (https://vpivot.com/2009/09/17/using-perfmon-for-accurate-esx-performance-counters/), which showed me that there was no ballooning or swapping going on, that the vCPU’s were not limited, and that the CPU share value seemed to be at the default.
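If you want to script that same in-guest check, here is a minimal sketch (Python on the Windows guest) that shells out to typeperf to sample the perfmon counters exposed by the VMware Tools provider. The specific counter paths below are assumptions from my own installs – the object and counter names vary by Tools version, so verify them in Perfmon before relying on them.

```python
import subprocess

# Counter paths exposed by the VMware Tools perfmon provider.
# NOTE: these object/counter names are assumptions - confirm them in
# Perfmon (perfmon.msc) on your own guest, as they vary by Tools version.
COUNTERS = [
    r"\VM Memory\Memory Ballooned in MB",
    r"\VM Memory\Memory Swapped in MB",
    r"\VM Processor(_Total)\% Processor Time",
]

def sample_counters(counters, samples=3):
    """Collect a few samples of each counter via typeperf and print the raw CSV output."""
    cmd = ["typeperf"] + counters + ["-sc", str(samples)]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    print(result.stdout)

if __name__ == "__main__":
    sample_counters(COUNTERS)
```

A nonzero ballooned or swapped value seen from inside the guest is a quick hint that the host is under memory pressure, even when you can’t get to the vSphere Client.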
I strongly suspected that the physical server running VMware ESX was oversubscribed on physical CPU (pCPU) resources. Essentially, the guest VM’s that are sharing the resources of the physical machine are demanding more resources than the machine can handle. To verify this theory, I had the client check the ‘CPU Ready’ metric on BigBadServer and bingo!
CPU Ready is a measure of the amount of time that the guest VM is ready to run against the pCPU, but the VMware CPU Scheduler cannot find time to run the VM because other VM’s are competing for the same resources.
From the stats the customer provided on our phone call, the CPU Ready for any one of the 4 vCPU’s on BigBadServer was on average 3,723ms (min: 1,269ms, max: 8,491ms). (Update 8/25/2010 to clarify summation stat) The summation for the entire VM was around 12,000ms on average and peaked around 35,000ms. The stats came from the real-time performance graph/table in the vSphere Client. The real-time stats in the vSphere Client update every 20 seconds, so the CPU Ready summation value should be divided by 20,000 to get a percentage of CPU Ready for the 20-second time slice. If I take the worst case scenario of 8,491ms per vCPU, this VM spent nearly 43% (8,491/20,000) of the 20-second time slice waiting for CPU resources.
Update: 10/10/2012: VMware published a KB article earlier this year that details the conversion of CPU ready from a summation to percentage. Find the article here: https://kb.vmware.com/kb/2002181.
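To make the conversion concrete, here is a minimal Python sketch of the math – the summation value in milliseconds divided by the sample interval in milliseconds. The 20-second real-time interval is assumed; for the historical rollups use the chart’s actual interval (e.g. 300 seconds for the past-day view), per the KB article above.

```python
def cpu_ready_percent(ready_summation_ms, interval_seconds=20, num_vcpus=1):
    """Convert a CPU Ready summation (ms) into a percentage of the sample interval.

    interval_seconds defaults to the 20-second real-time sample. Divide by
    num_vcpus when the summation covers the whole VM instead of one vCPU.
    """
    interval_ms = interval_seconds * 1000
    return ready_summation_ms / (interval_ms * num_vcpus) * 100

# Worst-case single-vCPU sample from BigBadServer: 8,491ms in a 20s slice
print(round(cpu_ready_percent(8491), 1))                 # ~42.5% ready
# Whole-VM summation peak of ~35,000ms spread across 4 vCPUs
print(round(cpu_ready_percent(35000, num_vcpus=4), 1))   # ~43.8% per vCPU on average
```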
The CPU Ready summation counter (in milliseconds) in the vSphere Client is not always the easiest stat to interpret accurately – to better quantify the problem it might be best to go to the ESX command line and run ESXTOP. CPU Ready over 5% can be a sign of trouble; over 10% and there is a problem. Running ESXTOP in batch mode and then analyzing the output using Windows Perfmon or Excel is a good way to get a view over several hours rather than just the real-time stats we were looking at. I wrote a post a while back with more info on ESXTOP batch mode: https://vmtoday.com/2009/09/esxtop-batch-mode-windows-perfmon/
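If you go the batch-mode route, the output is a very wide Perfmon-style CSV, and a short script can flag the offenders faster than scrolling through Excel. The sketch below assumes the data was captured with something like `esxtop -b -d 10 -n 360 > esxtop.csv`, and that the ready columns follow the usual `\\host\Group Cpu(<id>:<vmname>)\% Ready` naming – check your own header row, since the exact column names can vary by ESX version.

```python
import csv

THRESHOLD = 10.0  # % Ready above this is a problem; 5% is already worth watching

def find_ready_offenders(path, threshold=THRESHOLD):
    """Scan an esxtop batch-mode CSV and report the worst '% Ready' sample per VM group."""
    offenders = {}
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Assumed column naming: \\host\Group Cpu(<id>:<vmname>)\% Ready
        ready_cols = [i for i, name in enumerate(header)
                      if "Group Cpu" in name and "% Ready" in name]
        for row in reader:
            for i in ready_cols:
                try:
                    value = float(row[i])
                except (ValueError, IndexError):
                    continue
                if value > threshold:
                    offenders[header[i]] = max(value, offenders.get(header[i], 0.0))
    return offenders

if __name__ == "__main__":
    for column, worst in sorted(find_ready_offenders("esxtop.csv").items()):
        print(f"{column}: worst sample {worst:.1f}% ready")
```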
To help quantify the problem a bit more, BigBadServer is on an ESX 4.0 server with about 10 other servers. The physical blade has two dual-core CPU’s (AMD Opteron 2218HE’s, which are not hyperthreaded), for a total of four physical cores. The other VM’s on the blade have different vCPU and vMemory configurations: 3 VM’s (including BigBadServer) have 4 vCPU’s, a couple have 2 vCPU’s, and the remainder are configured with 1 vCPU. In ESX 4.x, the VMware console OS actually runs as a hidden VM, pegged to pCPU #1.
I generally recommend a pCPU:vCPU ratio of 1:4 for mid-sized VMware deployments of single-vCPU VM’s. The blade we are running on is at roughly 1:5 with several multi-vCPU VM’s. The multi-vCPU VM’s start to skew the ratio recommendation and require some advanced design decisions. VMware’s scheduler requires that all the vCPU’s on a VM run concurrently (even if the Guest OS is only trying to execute a single thread). Also, the VMware CPU Scheduler prefers to keep all of a VM’s vCPU’s on the same physical socket to preserve CPU cache locality – as workloads are bounced around between pCPU’s, the benefits of CPU cache are lost. This is one of those ‘more-is-less’ situations that you run into in virtualized environments.
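For a rough sense of where this host sits, here’s the back-of-the-envelope math in Python. The exact VM counts are my assumption from the description above (three 4-vCPU VM’s, two 2-vCPU VM’s, and roughly six single-vCPU VM’s out of about eleven guests) – plug in your own inventory.

```python
# Back-of-the-envelope vCPU-to-pCPU oversubscription check.
# The VM counts below are assumptions based on the description in this post.
physical_cores = 2 * 2            # two dual-core Opterons, no hyperthreading

vm_inventory = {                  # vCPUs per VM -> number of VMs of that size
    4: 3,                         # BigBadServer plus two other 4-vCPU guests
    2: 2,                         # "a couple" of 2-vCPU guests
    1: 6,                         # the remaining single-vCPU guests (assumed)
}

total_vcpus = sum(size * count for size, count in vm_inventory.items())
ratio = total_vcpus / physical_cores

print(f"{total_vcpus} vCPUs on {physical_cores} cores -> {ratio:.1f}:1 vCPU:pCPU")
# Roughly 5.5:1 here - already past the ~4:1 I'm comfortable with for
# single-vCPU workloads, and effectively worse once co-scheduling of the
# 4-vCPU VMs is factored in.
```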
What this CPU Scheduler nonsense means in this case is that the 4 vCPU’s on BigBadServer have to wait until all four logical pCPU’s on the box are idle (including the one that runs ESX itself) before the VM can run. If ESX can’t accomplish that – and under this much resource contention it frequently can’t – it starts prioritizing the workloads it can best run. It is much easier to schedule the smaller VM’s, so it tends to run those more frequently, and the larger VM’s tend to suffer more than the smaller ones. We are competing with 2 other VM’s with 4 vCPU’s that use up all of the logical pCPU’s when they need to run, as well as with the smaller VM’s.
I suggested a few ways to fix this issue for the BigBadServer web server:
- Using Shares and/or Reservations on the VM. This probably won’t work in our situation as the physical server is too over-subscribed. We might see a slight improvement in BigBadServer (or we might not see any change), but possibly at the extreme expense of the other VM’s sharing the blade.
- Reduce the number of vCPU’s on BigBadServer AND the other multi-vCPU VM’s on the same physical server. This would reduce resource contention and open up a whole bunch of scheduling options for the VMware CPU Scheduler. This is the quickest/cheapest fix, but will not work if the VM’s really do need 4 vCPU’s. A little workload analysis should determine which can be made smaller (the vCenter Server graphs/stats should be enough for this – see the scripted sketch after this list). For what it’s worth, our analysis suggests BigBadServer is happier with 4 vCPU’s, assuming we can keep CPU Ready low on all 4 of them.
- Move the BigBadServer VM to a physical ESX server with fewer multi-vCPU VM’s so there is less contention.
- Move the BigBadServer VM to a physical ESX server with quad-core pCPU’s (ideally two quad-cores or bigger). This would give a lot more flexibility to the VMware CPU Scheduler and allow it to run quad-vCPU VM’s on the same pCPU for greater efficiency.
- Split BigBadServer into 2 smaller VM’s – the server currently runs a couple of sites. We could split them onto two servers – one for Project1 and one for Project2. This configuration would take some design, testing, and time, but it could scale out better and give more flexibility and availability in the long run.
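For the workload analysis mentioned in the second option above, you can also pull the cpu.ready.summation counter programmatically instead of eyeballing the vSphere Client charts. Here is a minimal sketch using the open-source pyVmomi library – the host, credentials, and VM lookup are placeholders, the 20-second real-time interval is assumed, and newer pyVmomi versions may need an SSL context passed to SmartConnect, so treat this as a starting point rather than a finished tool.

```python
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def vm_ready_percent(si, vm, samples=15, interval=20):
    """Query recent real-time cpu.ready.summation values for a VM and
    convert each sample to a percentage of the 20-second interval."""
    perf = si.RetrieveContent().perfManager
    # Find the counter ID for cpu.ready.summation in the counter catalog.
    ready_id = next(c.key for c in perf.perfCounter
                    if c.groupInfo.key == "cpu"
                    and c.nameInfo.key == "ready"
                    and str(c.rollupType) == "summation")
    # instance="" returns the whole-VM aggregate; use "*" for per-vCPU values.
    metric = vim.PerformanceManager.MetricId(counterId=ready_id, instance="")
    spec = vim.PerformanceManager.QuerySpec(entity=vm, metricId=[metric],
                                            intervalId=interval, maxSample=samples)
    results = perf.QueryPerf(querySpec=[spec])
    values = results[0].value[0].value if results else []
    return [v / (interval * 1000.0) * 100 for v in values]

# Placeholder connection details - replace with your own vCenter or ESX host.
si = SmartConnect(host="vcenter.example.com", user="readonly", pwd="secret")
try:
    # Assumes the guest's DNS name matches; FindByInventoryPath is an alternative.
    vm = si.content.searchIndex.FindByDnsName(None, "BigBadServer", True)
    for pct in vm_ready_percent(si, vm):
        print(f"{pct:.1f}% ready")
finally:
    Disconnect(si)
```

Trend that over a few days per VM and it becomes much clearer which multi-vCPU guests actually need their extra vCPU’s and which can be shrunk.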
I’m not sure which way the customer will go on this one yet, but I feel good having armed them with enough knowledge and options to make an informed decision.
To avoid problems like this in the future, I recommend these rules of thumb:
- Design your hosts for your guests. Taking your Guest VM sizes into account when designing your environment and choosing physical hardware is crucial if you need bigger VM’s.
- Don’t make your VM’s bigger than you have to. It is always easier to add resources than to take them away, and Hot Add of CPU and Memory in vSphere makes adding incredibly easy.
- Monitor your environment for CPU Ready, Swapping, and other metrics that can indicate an inefficient design.
- Call for help when you can’t figure out what is going on (I’m happy to help!). VMware is super powerful, but some things can be downright backwards when it comes to resource allocation on a fixed set of hardware.
If you are looking for some resources to help explain CPU Scheduling a bit more, I recommend:
- VMware’s Official documentation of CPU Scheduler in vSphere 4.1 – https://www.vmware.com/files/pdf/techpaper/VMW_vSphere41_cpu_schedule_ESX.pdf.
- A nice summary of co-scheduling from VMware’s Performance Blog: https://blogs.vmware.com/performance/2008/06/esx-scheduler-s.html
- Description and stats on Ready Time metrics for VI3: https://www.vmware.com/pdf/esx3_ready_time.pdf
- Understanding Virtual Center Performance Statistics: https://communities.vmware.com/docs/DOC-5230.pdf
(Updated 8/25/2010 to include a few additional reference links and to correct the explanation of dividing the summation by the time slice to get accurate values)
J Shelton says
I believe that Ready Time is a summation stat and that it is the total Ready Time for the VM…not per vCPU. So, that’d be around 930ms per vCPU…still not great, but a ton better than 4 seconds.
Also…hard to believe or not…but Ready Time can be driven up by factors other than oversubscription. Power savings, NUMA scheduling, and other factors driven by the co-scheduler could raise ready times. You’re right on the money though that resxtop or esxtop is my preferred method for troubleshooting these types of issues.
Personally…I’m not going to lie, but in my experience in busy environments I think the scheduler starts getting too busy with anything above a 2.5:1 ratio of vCPUs to physical cores.
Joshua Townsend says
James – Thanks for taking the time to comment – I really appreciate it!
You are correct that the Ready Time displayed in vCenter is a summation stat. You can view total CPU Ready for the entire VM as well as for individual processors. Each vCPU averaged 4 seconds. At one point all vCPU’s spiked to 8,000+ milliseconds Ready Time, so the total VM summation reading was north of 32,000 milliseconds! Talk about painful. I wish it was 932ms!
You make great points on the other things that can drive up CPU ready! Thanks for contributing.
The vSphere 4.0 and 4.1 CPU Scheduler have continued to increase efficiency, allowing a bit more flexibility/density. I would not go beyond a 5:1 ratio for any but the lightest workloads. I like a 2.5:1 or a 3:1 for a balance of performance and density.
Joshua Townsend says
I have added a few reference pieces to my list and corrected my explanation of the CPU Ready time summation counter to better explain what the stat means when viewed in the vSphere Client and how to convert to CPU Ready Percentage for a more telling measure. Did I miss anything else or get it wrong? Let me know and I will update the post…
Josh
Chris says
The relaxed co-scheduling feature means that it is not always necessary for the same number of physical cores as vCPUs to be idle and ready for execution at precisely the same instant…
Joshua Townsend says
That’s right, Chris. Relaxed co-scheduling was introduced in ESX 3.5 and has been improved in updates to 3.5, as well as in 4.0 and 4.1. Having the same number of idle physical cores as vCPU’s is no longer a strict requirement, but it is still preferable up to a point. It’s always good to have a basic understanding of the way the scheduler works and to design around it depending on your particular environment’s needs.
Ken says
If CPU Ready time is high, should I also expect to see a high Processor Queue Length in the guest OS? I am trying to determine methods for looking at a guest OS’s counters to determine if there is a greater performance problem within ESX. I am doing this for our support staff, who work with customers that have installed our apps inside a VMware guest OS and are complaining about performance. I’m hoping to be able to point them to specific counters within the guest OS that actually translate to something meaningful. I realize there are now 2 new counters installed in the guest OS with VMware Tools, VM Processor and VM Memory, but I want to dig deeper if possible.
Mike says
You may want to correct the quite hilarious misspelling of “flying bling.” Good article–I’ve been trying to dial in on modeling subscription rates from performance stats and the 10% CPU ready is a good event trigger.
Joshua Townsend says
Thanks for the comment, Mike. Wish I could blame auto-correct for the typo….
Onder says
Hey,
you forgot one thing. The ESX 4.0 scheduler does not work as effectively with VMs that have a large number of vCPUs because it does not place all the vCPUs on the same NUMA node – the vCPUs are spread out evenly across pCPUs. So the high ready time values are also caused by, for example, vCPU#4 waiting to get memory pages from vCPU#1, which is running on a completely different pCPU.
This NUMA node placement is fixed in the ESX 4.1 scheduler.
Tom says
I hate having to be pedantic about stuff like this, but it appears in this article a lot…
vCPUs is the plural of vCPU
(and not vCPU’s, which is the possessive form, or perhaps short for “vCPU is”)
This applies to all abbreviations – there is no apostrophe in the plural.
(sorry, but it was doing my head in reading through an otherwise excellent and informative article)
Joshua Townsend says
Thanks, Tom. Great timing – I was actually thinking about the same thing last night with APIs (APIs vs. API’s), LUNs, and VMs after I published my last article. The APIs, LUNs, and VMs don’t own anything – hopefully 😉