<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
><channel><title>VMtoday &#187; esxtop</title> <atom:link href="http://vmtoday.com/tag/esxtop/feed/" rel="self" type="application/rss+xml" /><link>http://vmtoday.com</link> <description>VMware News, Views, &#38; How-To&#039;s from vExpert Josh Townsend</description> <lastBuildDate>Wed, 08 Feb 2012 20:33:54 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>High CPU Ready, Poor Performance</title><link>http://vmtoday.com/2010/08/high-cpu-ready-poor-performance/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=high-cpu-ready-poor-performance</link> <comments>http://vmtoday.com/2010/08/high-cpu-ready-poor-performance/#comments</comments> <pubDate>Wed, 25 Aug 2010 19:52:07 +0000</pubDate> <dc:creator>Joshua Townsend</dc:creator> <category><![CDATA[Issues & Troubleshooting]]></category> <category><![CDATA[VMware]]></category> <category><![CDATA[VMware How To]]></category> <category><![CDATA[best practices]]></category> <category><![CDATA[cpu ready]]></category> <category><![CDATA[esxtop]]></category> <category><![CDATA[performance]]></category> <category><![CDATA[troubleshooting]]></category> <category><![CDATA[vsphere]]></category><guid
isPermaLink="false">http://vmtoday.com/?p=566</guid> <description><![CDATA[I ran into an issue with a customer today where a VM was performing terribly.  From within the guest OS (a Windows 2003 application server running .NET in IIS which I will call BigBadServer) things appeared sluggish and CPU time was high.  The amount of time being spent on the kernel was notably high.  The [...]]]></description> <content:encoded><![CDATA[<p></p><p>I ran into an issue with a customer today where a VM was performing terribly.  From within the guest OS (a Windows 2003 application server running .NET in IIS which I will call BigBadServer) things appeared sluggish and CPU time was high.  The amount of time being spent on the kernel was notably high.  The VM in question had 4 vCPU’s and a good helping of memory.</p><p><a
href="http://cloudfront.vmtoday.com/wp-content/uploads/2010/08/highkerneltime.png" rel="lightbox[566]"><img
class="aligncenter size-medium wp-image-589" title="high kernel time" src="http://cloudfront.vmtoday.com/wp-content/uploads/2010/08/highkerneltime-220x300.png" alt="high kernel time in perfmon" width="220" height="300" /></a></p><p>I don’t have access to the VMware client at this particular site – just some of the guests, so I was flying blind.  Gut feeling told me that I was dealing with a resource contention issue.  I had the VMstats provider running in the guest (<a
href="http://vpivot.com/2009/09/17/using-perfmon-for-accurate-esx-performance-counters/">http://vpivot.com/2009/09/17/using-perfmon-for-accurate-esx-performance-counters/</a>), which showed me that there was no ballooning or swapping going on, that the vCPU’s were not limited, and that the CPU share value seemed to be at the default.</p><p>I strongly suspected that the physical server running VMware ESX was oversubscribed on physical CPU (pCPU) resources.  Essentially, the guest VM’s that are sharing the resources of the physical machine are demanding more resources than the machine can handle.  To verify this theory, I had the client check the ‘CPU Ready’ metric on BigBadServer and bingo!</p><p>CPU Ready is a measure of the amount of time that the guest VM is ready to run against the pCPU, but the VMware CPU Scheduler cannot find time to run the VM because other VM’s are competing for the same resources.</p><p>From the stats the customer provided on our phone call, the CPU Ready for any one of the 4 vCPU’s on the BigBadServer was on average 3723ms (min: 1269ms, max: 8491ms).  (Update 8/25/2010 to clarify summation stat) The summation for the entire VM was around 12,000ms on average and peaked around 35,000ms.  The stats came from the real-time performance graph/table in the vSphere client.  The real-time stats in the vSphere Client update every 20 seconds, so the CPU Ready summation value should be divided by 20,000 (the number of milliseconds in the interval) to get the fraction of the 20 second time slice spent in CPU Ready.  If I take the worst case scenario of 8491ms per vCPU, this VM spent roughly 42% (8491/20,000) of the 20 second time slice waiting for CPU resources.</p><p>The CPU Ready summation in milliseconds counter in the vCenter Client is not always the most accurate or easy to interpret stat – to better quantify the problem it might be best to go to the ESX command line and run ESXTOP.  CPU Ready over 5% could be a sign of trouble; over 10% and there is a problem.  
Running ESXTOP in batch mode and then analyzing the output using Windows Perfmon or Excel might be a good way to go on this to get a view over several hours rather than the realtime stats we were looking at.  I wrote a post a while back with more info on ESXTOP batch mode: <a
href="http://vmtoday.com/2009/09/esxtop-batch-mode-windows-perfmon/">http://vmtoday.com/2009/09/esxtop-batch-mode-windows-perfmon/</a></p><p>To help quantify the problem a bit more, the BigBadServer is on an ESX 4.0 server with about 10 other virtual servers.  The physical blade has two dual-core CPU’s (AMD Opteron 2218HE’s which are not hyperthreaded).  The other VM’s on the blade have different vCPU and vMemory configurations.  Three VM’s (including BigBadServer) have 4 vCPU’s.  A couple have 2 vCPU’s, and the remainder are configured with 1 vCPU.  In ESX 4.x, the VMware console OS actually runs as a hidden VM, pegged to pCPU #1.</p><p>I generally recommend a pCPU:vCPU ratio of 1:4 for mid-sized VMware deployments of single vCPU VM’s.  The blade we are running on is a 1:5 with several multi-vCPU VM’s.  The multi-vCPU’s start to skew the ratio recommendation and require some advanced design decisions.  VMware’s scheduler requires that all the vCPU’s on a VM run concurrently (even if the Guest OS is trying to execute a single thread).  Also, the VMware CPU Scheduler prefers to have all the vCPU’s from a VM run on the same pCPU.  As workloads are bounced around between pCPU’s, the benefits of CPU cache are lost.  This is one of those ‘<a
title="Balloon Driver Problems with SQL" href="http://vmtoday.com/2009/09/balloon-driver-problems-with-sql/">more-is-less</a>’ situations that you run into on virtualized environments.</p><p>What this CPU Scheduler nonsense means in this case is that the 4 vCPU’s on BigBadServer have to wait until all logical pCPU’s on the box are idle (including the one that runs ESX itself) before it can run.  If ESX can’t accomplish that (we are experiencing resource contention) it starts prioritizing workloads according to what it can best run.  It is much easier to schedule the smaller VM’s, so it tends to run those on pCPU more frequently.  The larger VM’s tend to suffer a bit more than the smaller ones.  We are competing with 2 other VM’s with 4 vCPU’s that use up all of the logical pCPU’s when they need to run, as well as with the smaller VM’s.</p><p>I suggested a few ways to fix this issue for the BigBadServer web server:</p><ol><li>Using Shares and/or Reservations on the VM.  This probably won’t work in our situation as the physical server is too over-subscribed.  We might see a slight improvement in BigBadServer (or we might not see any change), but possibly at the extreme expense of the other VM’s sharing the blade.</li><li>Reduce the number of vCPU’s on BigBadServer AND the other multi-vCPU VM’s on the same physical server.  This would reduce resource contention and open up a whole bunch of scheduling options for the VMware CPU Scheduler.  This is the quickest/cheapest fix, but will not work if the VM’s really do need 4 vCPU’s.  A little workload analysis should determine which can be made smaller (the vCenter server graphs/stats should be enough for this).  
For what it’s worth, by our analysis BigBadServer seems to be happier with 4 vCPU’s, assuming we can run with a low CPU Ready on those 4.</li><li>Move the BigBadServer VM to a physical ESX server with fewer multi-vCPU VM’s so there is less contention.</li><li>Move the BigBadServer VM to a physical ESX server with quad-core pCPU’s (ideally two quad-cores or bigger).  This would give a lot more flexibility to the VMware CPU Scheduler and allow it to run quad-vCPU VM’s on the same pCPU for greater efficiency.</li><li>Split BigBadServer into 2 smaller VM’s – The server currently runs a couple of sites.  We could split them onto two servers &#8211; one for Project1 and one for Project2.  This configuration would take some design, testing, and time but could scale out better and give more flexibility and availability in the long run.</li></ol><p>I’m not sure which way the customer will go on this one yet, but I feel good having armed them with enough knowledge and options to make an informed decision.</p><p>To avoid problems like this in the future, I recommend these rules of thumb:</p><ul><li>Design your hosts for your guests.  Taking your Guest VM sizes into account when designing your environment and choosing physical hardware is crucial if you need bigger VM’s.</li><li>Don’t make your VM’s bigger than you have to.  It is always easier to add resources than take them away.  Hot Add of CPU and Memory in vSphere makes adding incredibly easy.</li><li>Monitor your environment for CPU Ready, Swapping, and other metrics that can indicate an inefficient design.</li><li>Call for help when you can’t figure out what is going on (I’m happy to help!).  VMware is super powerful, but some things can be downright backwards when it comes to resource allocation on a fixed set of hardware.</li></ul><p>If you are looking for some resources to help explain CPU Scheduling a bit more, I recommend:</p><ul><li>VMware’s Official documentation of CPU Scheduler in vSphere 4.1 &#8211; <a
href="http://www.vmware.com/files/pdf/techpaper/VMW_vSphere41_cpu_schedule_ESX.pdf">http://www.vmware.com/files/pdf/techpaper/VMW_vSphere41_cpu_schedule_ESX.pdf</a>.</li><li>A nice summary of co-scheduling from VMware’s      Performance Blog: <a
href="http://blogs.vmware.com/performance/2008/06/esx-scheduler-s.html">http://blogs.vmware.com/performance/2008/06/esx-scheduler-s.html</a></li><li>Description and stats on Ready Time metrics for VI3: <a
title="VMware Performance Study on Ready Time Observations" href="http://www.vmware.com/pdf/esx3_ready_time.pdf" target="_blank">http://www.vmware.com/pdf/esx3_ready_time.pdf</a></li><li>Understanding Virtual Center Performance Statistics: <a
title="Understanding Virtual Center Performance Statistics" href="http://communities.vmware.com/docs/DOC-5230.pdf" target="_blank">http://communities.vmware.com/docs/DOC-5230.pdf</a></li></ul><p>(Updated 8/25/2010 to include a few additional reference links and corrected summation divided by time slice to get accurate values)</p>
]]></content:encoded> <wfw:commentRss>http://vmtoday.com/2010/08/high-cpu-ready-poor-performance/feed/</wfw:commentRss> <slash:comments>10</slash:comments> </item> <item><title>Storage Basics &#8211; Part VI: Storage Workload Characterization</title><link>http://vmtoday.com/2010/04/storage-basics-part-vi-storage-workload-characterization/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=storage-basics-part-vi-storage-workload-characterization</link> <comments>http://vmtoday.com/2010/04/storage-basics-part-vi-storage-workload-characterization/#comments</comments> <pubDate>Thu, 08 Apr 2010 16:34:44 +0000</pubDate> <dc:creator>Joshua Townsend</dc:creator> <category><![CDATA[Storage]]></category> <category><![CDATA[Storage Basics]]></category> <category><![CDATA[VMware]]></category> <category><![CDATA[esxtop]]></category> <category><![CDATA[I/O]]></category> <category><![CDATA[IOPS]]></category> <category><![CDATA[performance]]></category> <category><![CDATA[SAN]]></category> <category><![CDATA[vscsiStats]]></category><guid
isPermaLink="false">http://vmtoday.com/?p=308</guid> <description><![CDATA[Most of what I covered in Storage Basics Parts 1 through 5 was at a very elementary level.  The math I used to do IOPS calculations, for example, is only true under very certain conditions.  RAID controllers implement caching and other techniques that skew the simple math that I provided.  I mentioned that the type [...]]]></description> <content:encoded><![CDATA[<p></p><p>Most of what I covered in Storage Basics Parts 1 through 5 was at a very elementary level.  The math I used to do <a
title="Storage Basics – Part II: IOPS" href="http://vmtoday.com/2009/12/storage-basics-part-ii-iops/">IOPS calculations</a>, for example, holds only under very specific conditions.  <a
title="Storage Basics – Part III: RAID" href="http://vmtoday.com/2010/01/storage-basics-part-iii-raid/">RAID</a> controllers implement <a
title="Storage Basics – Part V: Controllers, Cache and Coalescing" href="http://vmtoday.com/2010/03/storage-basics-part-v-controllers-cache-and-coalescing/">caching</a> and other techniques that skew the simple math that I provided.  I mentioned that the type of <a
title="Storage Basics - Part IV: Interface" href="http://vmtoday.com/2010/01/storage-basics-part-iv-interface/">interface</a> that you ought to use on your storage array should not be randomly chosen.  In fact, choosing the right array with the appropriate components and characteristics can only be done when you inform your decision with a characterization of the workloads it will be running.</p><p>The character of your storage workload can be broken down into several traits &#8211; random vs. sequential I/O, large vs. small I/O request size, read vs. write ratio, and degree of parallelism.  The traits of your particular workload dictate how it interacts with the components of your storage system and ultimately determine the performance of your environment under a given configuration.  There is an excellent whitepaper available from VMware entitled &#8220;<a
title="Easy and Efficient Disk I/O Workload Characterization in VMware ESX Server" href="http://www.vmware.com/files/pdf/iiswc_2007_distribute.pdf" target="_blank">Easy and Efficient Disk I/O Workload Characterization in VMware ESX Server</a>&#8221; that is authoritative on this subject.  If you want to get down and dirty with the topic, it&#8217;s a good read.  I&#8217;m aiming for something a bit less academic.  With that said, let&#8217;s break down workload characterization a bit so as to better understand how it will impact your real-world systems.</p><p><strong>Random vs. Sequential Access</strong></p><p>In <a
title="Storage Basics – Part II: IOPS" href="http://vmtoday.com/2009/12/storage-basics-part-ii-iops/">Part II</a> of this series we looked at the formula for calculating IOPS capabilities for a single disk.  That formula goes something like this:</p><blockquote><p>IOPS = 1000/(Seek Latency + Rotational Latency)</p></blockquote><p>You&#8217;ll recall that we divide into 1000 to remove milliseconds from the equation, leaving (Seek Latency + Rotational Latency) as the important part of the equation.  Rotational latency is based on the spindle speed of the disk &#8211; 7.2k, 10k, or 15k RPM for standard server or SAN disks.  If we consider<a
title="Cheetah® 15K.7 Hard Drive Technical Specifications" href="http://www.seagate.com/www/en-us/products/servers/cheetah/cheetah_15k.7/#tTabContentSpecifications" target="_blank"> the same Seagate Cheetah 15k drive from Part II</a>, we see that rotational latency is 2.0ms.  The only way to change rotational latency is to buy faster (or slower) disks.  This essentially leaves seek latency as the only variable that we can &#8220;adjust&#8221;.  You&#8217;ll also recall that seek latency was the larger of the latencies (3.4ms for read seeks, and 3.9ms for write seeks) and counts more against IOPS capability than does rotational latency.  Seeking is the most expensive operation in terms of performance.</p><p>It is next to impossible to adjust seek latency on a disk because it is determined by the speed of the servos that move the heads across the platter.  We can, however, send workloads with different degrees of randomness to the platter.  The more sequential a workload is, the less time will be spent in seek operations.  A high degree of sequentiality ultimately leads to faster disk response and higher throughput rates.  Sequential workloads may be candidates for slower disks or RAID levels.  Conversely, workloads that are highly randomized ought to be placed on fast spindles in fast RAID configurations.</p><p>You&#8217;ll notice that I said it was next to impossible to adjust seek latency on a disk.  While not common, some storage administrators employ a method known as &#8216;short stroking&#8217; when configuring storage.  Short stroking uses less than the full capacity of the disk by placing data at the beginning of the disk where access is faster, and not placing data at the end of the disk where seek times are greater.  
This results in a smaller area on the disk platter for heads to travel over, effectively reducing seek time at the expense of capacity.</p><p>While not applicable to all workloads, storage arrays, or file systems, fragmentation can cause higher degrees of randomness leading to degraded performance.  This is the prime reason some vendors recommend that you regularly defragment your file system.  It should be noted that a VMware VMFS file system is resilient against the forces of fragmentation.  Whereas a Windows NTFS partition may hold hundreds, thousands or tens of thousands of files of different sizes, accessed randomly throughout the system&#8217;s cycle of operations, a VMFS datastore typically holds no more than a couple hundred files.  Additionally, most of the files on a VMFS datastore are created contiguously if you are using thick-provisioned virtual disks (VMDK).  Thin-provisioned VMDK&#8217;s are slightly more susceptible to fragmentation, but do not typically suffer a high enough degree of fragmentation to register a performance impact.  See this VMware whitepaper for more on VMFS fragmentation: <a
title="Performance Study of VMware vStorage Thin Provisioning" href="http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf" target="_blank">Performance Study of VMware vStorage Thin Provisioning</a>.</p><p>Examples of sequential workloads include backup-to-disk operations and the writing of SQL transaction log files.  Random workloads may include collective reads from Exchange Information Stores or OLTP database access.  Workloads are often a mix of random and sequential access, as is the case with most VMware vSphere implementations.  The degree to which they are random or sequential dictates the type of tuning you should perform to obtain the best possible performance for your environment.</p><p><strong>I/O Request Size</strong></p><p>I/O request size is another important factor in workload characterization.  Generally speaking, larger reads/writes are more efficient than smaller I/O to a certain point.  The use of larger I/O requests (64KB instead of 2KB, for example) can result in faster throughput and reduced processor time.  Most workloads do not allow you to adjust your I/O request size.  However, knowing your I/O request size can help with appropriate configuration of certain parameters such as array stripe size and file system cluster size.  Check with your storage vendor for more information as it pertains to your specific configuration.</p><p>If you are in a Windows shop, you can use perfmon counters such as Avg. Disk Bytes/Read to determine average I/O size.  If you are running a VMware-virtualized workload, you can take advantage of a great tool &#8211; vscsiStats &#8211; to identify your I/O request size.  More on vscsiStats later in this article.</p><p><strong>Read vs. Write</strong></p><p>Every workload will display a differing amount of read and write activity.  Sometimes a specific workload, say Microsoft Exchange, can be broken down into sub-workloads for logging (write-heavy) and reading the database (read-heavy).  
Understanding the read-to-write ratio may help with designing the underlying storage system.  For example, a write-heavy workload may perform better on a RAID10 LUN than a RAID5 array due to the write penalty associated with RAID5.  The ratio of read:write may also dictate caching strategies.  The read:write ratio, when combined with a degree of randomness measure, can be quite useful in architecting your storage strategy for a given application or workload.</p><p><strong>Parallelism/Outstanding I/O&#8217;s</strong></p><p>Some workloads are capable of performing multi-threaded I/O.  These types of workloads can place a higher amount of stress on the storage system and should be understood when designing storage, both in terms of IOPS and throughput.  Multipathing may help with multi-threaded I/O workloads.  A typical VMware vSphere environment is a good example of a workload capable of queuing up outstanding I/O.</p><p><strong>Measuring the Characteristics of Your Workload</strong></p><p>So how do we actually characterize storage workloads?  Start with the application vendor &#8211; many have published studies that can shed light on specific storage workloads in a standard implementation.  If you are interested in measuring your own for planning/architecture reasons, or performance troubleshooting reasons, read on&#8230;.  There are several tools to measure storage characteristics, depending on your operating system and storage environment.  Standard OS performance counters, such as Windows Performance Monitor (perfmon) can reveal some of the characteristics.  Array based tools such as NaviAnalyzer on EMC gear can also reveal statistics on the storage end of the equation.</p><p>One of the most exciting tools for storage workload characterization comes from VMware in the form of <em><strong>vscsiStats</strong></em>.  vscsiStats is a tool that has been included in VMware ESX server since version 3.5.  
Because all I/O commands pass through the Virtual Machine Monitor (VMM), the hypervisor can inspect and report on the I/O characteristics of a particular workload, down to a unique VM running on an ESX host.  There is a ton of great information on using vscsiStats, so I won&#8217;t re-hash it all here.  I recommend starting with <a
title="Using vscsiStats for Storage Performance Analysis" href="http://communities.vmware.com/docs/DOC-10095" target="_blank">Using vscsiStats for Storage Performance Analysis</a> as it contains an overview and usage instructions.  If you want to dig a bit deeper into vscsiStats, read both <a
title="Storage Workload Characterization and Consolidation in Virtualized Environments" href="http://communities.vmware.com/docs/DOC-10104" target="_blank">Storage Workload Characterization and Consolidation in Virtualized Environments</a> and <a
title="vscsiStats: Fast and Easy Disk Workload Characterization on VMware ESX Server" href="http://communities.vmware.com/docs/DOC-10084" target="_blank">vscsiStats: Fast and Easy Disk Workload Characterization on VMware ESX Server</a>.</p><p>vscsiStats can generate an enormous amount of data which is best viewed as a histogram.  If you&#8217;re a glutton for punishment, the data can be reviewed manually on the COS.  To extract vscsiStats output data, use the -c option to export to a .csv file.  From there you can analyze the data and create histograms using Excel.  Paul Dunn has a nifty Excel macro for analyzing and reporting on vscsiStats output <a
title="New vscsiStats Excel Macro" href="http://dunnsept.wordpress.com/2010/03/11/new-vscsistats-excel-macro/">here</a>.  Gabrie van Zanten has more detailed instructions for using Paul&#8217;s macro <a
title="Converting vscsiStats data into Excel charts" href="http://www.gabesvirtualworld.com/converting-vscsistats-data-into-excel-charts/">here</a>.  Here are a couple histogram examples that I just generated from a test VM.</p><p><a
href="http://cloudfront.vmtoday.com/wp-content/uploads/2010/04/IO-lengths.png" rel="lightbox[308]"><img
class="alignnone size-medium wp-image-500" title="IO lengths" src="http://cloudfront.vmtoday.com/wp-content/uploads/2010/04/IO-lengths-300x218.png" alt="IO Lengths Histogram" width="300" height="218" /></a> <a
href="http://cloudfront.vmtoday.com/wp-content/uploads/2010/04/IODistance.png" rel="lightbox[308]"><img
class="alignnone size-medium wp-image-501" title="IODistance" src="http://cloudfront.vmtoday.com/wp-content/uploads/2010/04/IODistance-300x218.png" alt="IO Distance Between Commands" width="300" height="218" /></a></p><p>vscsiStats is only included with ESX, not ESXi.  However, Scott Drummond was kind enough to post a download of vscsiStats for ESXi on his Virtual Pivot blog: <a
href="http://vpivot.com/2009/10/21/vscsistats-for-esxi/">http://vpivot.com/2009/10/21/vscsistats-for-esxi/</a>.  Using vscsiStats on ESXi requires dropping into Tech Support Mode (unsupported) and enabling ESXi for scp to transfer the binary to the ESXi server.</p><p>VMware <strong><em>esxtop</em></strong> can display some information but is limited in scope and does not currently support NFS.  A <a
title="Script to display NFS Stats per-VMDK" href="http://communities.vmware.com/thread/246837" target="_blank">community-supported Python script</a> called nfstop can parse vscsiStats data and display esxtop-like data per VM on screen.</p><p><strong>Experiment</strong></p><p>If you are interested in generating workloads with various characteristics, check out <a
title="Iometer.org" href="http://www.iometer.org/" target="_blank">Iometer</a> and <a
title="Bonnie++" href="http://www.coker.com.au/bonnie++/" target="_blank">Bonnie++</a>.  These tools will allow you to generate I/O that you can monitor with the tools I covered in this article.</p><p><strong>Put it to Use</strong></p><p>If you are provisioning a new workload or expanding an existing one, invest some time in understanding your storage workload characteristics and convey those characteristics to your storage team.  A request for storage that includes the workload characteristics I discussed here, as well as expected IOPS requirements, will go much further in ensuring performance for your applications &#8211; physical or virtual &#8211; than simply asking for a certain capacity of disk.</p><p><strong><br
/> </strong></p>]]></content:encoded> <wfw:commentRss>http://vmtoday.com/2010/04/storage-basics-part-vi-storage-workload-characterization/feed/</wfw:commentRss> <slash:comments>6</slash:comments> </item> <item><title>The Skinny on ESXTOP</title><link>http://vmtoday.com/2009/09/the-skinny-on-esxtop/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-skinny-on-esxtop</link> <comments>http://vmtoday.com/2009/09/the-skinny-on-esxtop/#comments</comments> <pubDate>Thu, 17 Sep 2009 22:39:01 +0000</pubDate> <dc:creator>Joshua Townsend</dc:creator> <category><![CDATA[Issues & Troubleshooting]]></category> <category><![CDATA[VMware]]></category> <category><![CDATA[VMware How To]]></category> <category><![CDATA[analysis]]></category> <category><![CDATA[analyze]]></category> <category><![CDATA[batch mode]]></category> <category><![CDATA[cpu]]></category> <category><![CDATA[disk]]></category> <category><![CDATA[ESX]]></category> <category><![CDATA[esxi]]></category> <category><![CDATA[esxtop]]></category> <category><![CDATA[memory]]></category> <category><![CDATA[network]]></category> <category><![CDATA[performances]]></category> <category><![CDATA[rCLI]]></category> <category><![CDATA[resxtop]]></category> <category><![CDATA[statistics]]></category> <category><![CDATA[vCLI]]></category> <category><![CDATA[vMA]]></category> <category><![CDATA[vsphere]]></category><guid
isPermaLink="false">http://vmtoday.com/?p=244</guid> <description><![CDATA[A reader named Mark contacted me today and asked if there was a way to reduce the size of the batch output from an ESXTOP run.  And he asks for good reason: Depending on the number of VM&#8217;s on your host, the delay between ESXTOP samplings and the number of samples you collect, using the [...]]]></description> <content:encoded><![CDATA[<p></p><p>A reader named Mark contacted me today and asked if there was a way to reduce the size of the batch output from an ESXTOP run.  And he asks for good reason: Depending on the number of VM&#8217;s on your host, the delay between ESXTOP samplings and the number of samples you collect, using the All Stats option (-a) can yield a massive file in a short period of time.  If written to a partition on your ESX Service Console you run the risk of filling the partition, and forget about actually being able to analyze the data in PERFMON or Excel.  For example, on an ESX host running ~15 VM&#8217;s I produced 100MB worth of CSV using the -a switch, sampling every 15 seconds, for just under 2 hours.  ESXTOP uses 5-second intervals by default; I used <span
style="color: #993300;">-d 15</span> to change the sampling delay.  Had I gone with the default my output would have been bigger.</p><p>To reduce the size of your output, you can change your sampling delay to something larger, say 30-seconds.  I suppose you could also capture statistics when the host is not busy so you get fewer characters in the results, but that&#8217;s just being goofy. <img
src='http://cloudfront.vmtoday.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /></p><p>A better way to reduce your ESXTOP output size is to selectively include only the statistics you are interested in, which is really what Mark was asking.  After all, all statistics from ESXTOP can be too many statistics, and chances are you already know what stats you are interested in.  Here&#8217;s how you can narrow down the collected stats for easier analysis and smaller output:</p><ol><li>Enter ESXTOP in interactive mode on the Service Console by simply typing <span
style="color: #993300;">esxtop</span> at the # prompt</li><li>Switch to a component you are NOT interested in capturing statistics on by pressing the corresponding menu option (<span
style="color: #993300;">c</span>: ESX cpu, <span
style="color: #993300;">m</span>: ESX memory, <span
style="color: #993300;">d</span>: ESX disk adapter, <span
style="color: #993300;">u</span>: ESX disk device, <span
style="color: #993300;">v</span>: ESX disk VM).</li><li>Press <span
style="color: #993300;">f</span> when viewing the component you do not want to capture.  A list of fields will be displayed.  You can toggle the fields on and off by pressing the letter corresponding to each field.  An * indicates that the field is on.  You want to turn off all of the fields you don&#8217;t want to collect.</li><li>Repeat steps 2 &amp; 3 for the remaining components, leaving only what you want to capture.</li><li>Switch to the component you want to capture in batch mode and repeat step #3, except you will now enable what you want to capture.</li><li>Press <span
style="color: #993300;">W</span> (capital W &#8211; case sensitive) to write out the ESXTOP configuration file.  You can accept the default or create new configuration files.  You may want to create a CPU-only config file, memory-only, and so forth.</li><li>Press <span
style="color: #993300;">CTRL+C</span> to stop ESXTOP.</li><li>Now, invoke ESXTOP in batch mode, calling the new or updated configuration file you created in step #6 with the -c switch.  Here&#8217;s an example: # <span
style="color: #993300;">esxtop -b -d 30 -n 480 -c .esxtopcpustats &gt; /tmp/esxtop_cpu_stats.csv</span> where .esxtopcpustats is an ESXTOP config file with only CPU stats.  -d sets your capture interval to 30 seconds, and -n sets the number of samples to 480 (or 4 hours with a delay of 30 seconds).</li></ol><p>Once your capture is complete you can replay the sampling in ESXTOP using replay mode (-R), or you can copy the .csv to a Windows system and use PERFMON or Excel to analyze the stats.  If using PERFMON or Excel you will notice that the system summary information displayed at the top of an interactive ESXTOP session is included in the output (console memory, console cpu, etc.).  As far as I know, there is no way to disable this, nor would you want to, as it includes the time stamp necessary to interpret your data.</p><p>It is possible to use the <a
title="vSphere CLI" href="http://communities.vmware.com/community/vmtn/vsphere/automationtools/vsphere_cli" target="_blank">vSphere CLI</a> or the <a
title="vSphere Management Assistant vMA" href="http://www.vmware.com/support/developer/vima/" target="_blank">vSphere Management Assistant (vMA)</a> to run RESXTOP, a version of ESXTOP designed for remote administration of ESXi or ESX.  Note, however, that RESXTOP from the vSphere CLI only works from a Linux client.  Using either of these tools will help you automate ESXTOP statistics collection from multiple hosts using customized configuration files.</p><div
class="shr-publisher-244"></div>]]></content:encoded> <wfw:commentRss>http://vmtoday.com/2009/09/the-skinny-on-esxtop/feed/</wfw:commentRss> <slash:comments>6</slash:comments> </item> <item><title>ESXTOP Batch Mode &amp; Windows Perfmon</title><link>http://vmtoday.com/2009/09/esxtop-batch-mode-windows-perfmon/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=esxtop-batch-mode-windows-perfmon</link> <comments>http://vmtoday.com/2009/09/esxtop-batch-mode-windows-perfmon/#comments</comments> <pubDate>Thu, 10 Sep 2009 14:24:22 +0000</pubDate> <dc:creator>Joshua Townsend</dc:creator> <category><![CDATA[Issues & Troubleshooting]]></category> <category><![CDATA[VMware]]></category> <category><![CDATA[VMware How To]]></category> <category><![CDATA[ESX]]></category> <category><![CDATA[esxtop]]></category> <category><![CDATA[I/O]]></category> <category><![CDATA[perfmon]]></category> <category><![CDATA[performance]]></category> <category><![CDATA[sizing]]></category> <category><![CDATA[statistics]]></category> <category><![CDATA[Storage]]></category><guid
isPermaLink="false">http://vmtoday.com/?p=192</guid> <description><![CDATA[I needed to grab some stats from my ESX hosts for off-line analysis so I fired up my trusty ESXTOP intent on using batch mode to capture a .csv formatted output.  I started to manually select the counters I was interested in while working in ESXTOP interactive mode (you can save your selected counters to [...]]]></description> <content:encoded><![CDATA[<p></p><p>I needed to grab some stats from my ESX hosts for off-line analysis, so I fired up my trusty ESXTOP, intent on using batch mode to capture .csv-formatted output.  I started to manually select the counters I was interested in while working in ESXTOP interactive mode (you can save your selected counters to the esxtop configuration file with the &#8216;W&#8217; command) and thought that there must be a better way.  I found that better way in the VMware Performance Community: <a
title="http://communities.vmware.com/docs/DOC-3930" href="http://communities.vmware.com/docs/DOC-3930">http://communities.vmware.com/docs/DOC-3930</a>.  There is now a -a switch that can be used to include ALL performance counters.  I&#8217;m sold.</p><p>I wanted detailed information, so I decided on a 15-second capture interval to run for a 2-hour window.  Here&#8217;s the command I used:</p><blockquote><p>esxtop -a -b -d 15 -n 480 &gt; /tmp/esxtopout.csv</p></blockquote><p>where -a is for ALL, -b is for batch mode, -d is for delay, and -n is for the number of iterations ((60/15)*60*2 = 480, or four samples per minute for 120 minutes).  I wrote out the results to a .csv in /tmp.  The resulting CSV weighed in at a whopping 100MB after 2 hours.</p><p>The CSV can be analyzed in Excel (pivot tables work well for this) or in Windows Perfmon.  I opened the log in Perfmon as I was after basic Min/Average/Max counters and Perfmon makes those easy to see.  When adding the CSV log to Perfmon, you are prompted to select counters.  I added all instances of Commands/sec, Reads/sec, and Writes/sec from Physical Disk (I was gathering some IOPS counts for a new storage proposal). I got a bit more than I bargained for: a mostly unresponsive Perfmon window and the ugliest darn graph I&#8217;ve ever seen.</p><p><a
href="http://cloudfront.vmtoday.com/wp-content/uploads/2009/09/image.png" rel="lightbox[192]"><img
style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" src="http://cloudfront.vmtoday.com/wp-content/uploads/2009/09/image_thumb.png" border="0" alt="image" width="420" height="313" /></a></p><p>Switching from the graph view to the report view lets you easily view and remove specific counters that you are not interested in.  Alternatively, open the Properties of the data set, switch to the Data tab, and bulk-select the counters that you want to remove.  I was not interested in vmhba1:x, specific VM&#8217;s or worlds, so I killed all of those, leaving just the base iSCSI device (vmhba32 in my case).</p><p>After some cleanup the graph looked a bit better and, more importantly, I was able to easily read my Min/Average/Max stats:</p><p><a
href="http://cloudfront.vmtoday.com/wp-content/uploads/2009/09/image1.png" rel="lightbox[192]"><img
style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" src="http://cloudfront.vmtoday.com/wp-content/uploads/2009/09/image_thumb1.png" border="0" alt="image" width="416" height="327" /></a></p><p>Here are the takeaways:</p><ul><li><span
style="color: #35383d;">ESXTOP is a powerful utility for performance monitoring</span></li><li><span
style="color: #35383d;">All stats (-a) can result in a huge file &#8211; use it wisely in batch mode; otherwise, use interactive mode to select your counters and write them to a user-defined configuration file.  Invoke the config file with the -c option when running in batch mode.</span></li><li><span
style="color: #35383d;">Consider using vscsiStats for more granular reporting.</span></li><li><span
style="color: #35383d;">ESXTOP physical disk stats do not include NFS volumes.</span></li></ul><p>Do you use other tools or methods to collect basic disk IO counters for storage sizing purposes?  If so, leave a comment describing your approach!</p><div
class="shr-publisher-192"></div>]]></content:encoded> <wfw:commentRss>http://vmtoday.com/2009/09/esxtop-batch-mode-windows-perfmon/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> </channel> </rss>
