In Part I of this series, I discussed the importance of storage performance in a virtual environment (really any environment, virtual or not, where you want acceptable performance) and introduced some of the basic measures of a storage environment. In Part II, we will look more closely at what may be the most important storage design consideration in VMware server-consolidation environments, many SQL environments, and VDI environments, to name a few: IOPS.
If we stick with a single-disk-centric approach as we did in Part I, IOPS is quite simply a measure of how many read and write commands a disk can complete in a second. IOPS is an important measure of performance in a shared storage environment (such as VMware) and in high-transaction-rate workloads like SQL. Because hard drives are forced to abide by the laws of physics, the IOPS capabilities of a disk are consistent and predictable given a specific configuration. The formula for calculating IOPS for a given disk is pretty straightforward, with both latencies expressed in milliseconds (please show your work):
IOPS = 1000/(Seek Latency + Rotational Latency)
Exact latencies vary by disk type, quality, number of platters, etc. You can look up the tech specs for most drives on the market. As an example, I have randomly chosen the technical specifications of the Seagate Cheetah 15K.7 SAS drive. This particular drive has the following performance characteristics:
– Average (rotational) latency: 2.0msec
– Average read seek (latency): 3.4msec
– Average write seek (latency): 3.9msec
Using the read latency number, the math works out like this:
IOPS = 1000/(3.4 + 2.0) = 1000/5.4 = 185 maximum read IOPS
The maximum write IOPS will be a bit less (1000/(3.9 + 2.0) = ~169 IOPS) because of the higher write seek latency. Writing is more ‘expensive’ than reading and therefore slower.
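For anyone who would rather let a script show the work, here is a minimal Python sketch of the formula above. The function and constant names are my own for illustration; the latency values are the ones listed for the example drive.

```python
def max_iops(seek_ms: float, rotational_ms: float) -> float:
    """Theoretical single-disk IOPS: 1000 ms per second divided by ms per I/O."""
    return 1000.0 / (seek_ms + rotational_ms)


# Latencies from the example drive specifications listed above (milliseconds).
ROTATIONAL_MS = 2.0
READ_SEEK_MS = 3.4
WRITE_SEEK_MS = 3.9

read_iops = max_iops(READ_SEEK_MS, ROTATIONAL_MS)    # 1000 / 5.4
write_iops = max_iops(WRITE_SEEK_MS, ROTATIONAL_MS)  # 1000 / 5.9

print(f"read:  {read_iops:.0f} IOPS")   # read:  185 IOPS
print(f"write: {write_iops:.0f} IOPS")  # write: 169 IOPS
```

These are theoretical ceilings for one disk doing fully random I/O, not numbers you should expect to measure on a live system.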
Fortunately, there are some widely accepted ‘working’ numbers, so you do not have to use this formula for each and every disk you might consider using. Because rotational latency is based on the rotational speed, we can use the published Rotations Per Minute (RPM) rating of the drive to guess-timate the IOPS capabilities. Typical spindle speeds (measured in RPM) and their equivalent IOPS are in the table below.
SSD: 2500 – 6000 IOPS
While not a traditional spinning disk, I have also included Solid State Disks (SSDs) for reference, as SSDs are starting to see increased market adoption. I have seen a wide range of sizing IOPS for SSDs depending on the technology and type (SLC, MLC, etc.). Check out https://en.wikipedia.org/wiki/Solid-state_drive for an introduction, and ask your vendors for more in-depth technical information.
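To see why spindle speed is a reasonable proxy for IOPS on spinning disks, recall that average rotational latency is the time the platter takes to make half a revolution. The sketch below works that out for common spindle speeds; the function name is my own, and the result covers rotational latency only, so seek time still has to come from the drive's spec sheet.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: time for half a revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


for rpm in (7200, 10_000, 15_000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

The 15k RPM result works out to 2.00 ms, which lines up with the average rotational latency quoted for the example drive earlier.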
If you are brand-new to this (and you are still reading, congrats!), you can see how many IOPS your Windows computer is asking for by opening Performance Monitor and looking at the ‘Disk Transfers/sec’ counter under Physical Disk. This is a sum of the ‘Disk Reads/sec’ and ‘Disk Writes/sec’ counters as you can see in the screenshot below:
If you are after some stats for your VMware ESX environment, check out esxtop and look for CMDS/s in the output. I published a couple articles on using esxtop here and here. The numbers from PerfMon and esxtop get you pretty close but can be skewed by a few things we’ll discuss in later posts.
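If your tool of choice hands you cumulative operation counts rather than per-second rates, the relationship between reads, writes, and total transfers per second can be sketched as below. This is an illustration only: the `DiskSample` type and the sample numbers are made up, and it does not call PerfMon or esxtop.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    """Hypothetical snapshot of cumulative I/O counts at a point in time."""
    timestamp_s: float
    reads: int
    writes: int


def transfers_per_sec(earlier: DiskSample, later: DiskSample) -> dict:
    """Turn two cumulative snapshots into per-second read, write, and total rates."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    reads = (later.reads - earlier.reads) / elapsed
    writes = (later.writes - earlier.writes) / elapsed
    return {
        "reads_per_sec": reads,
        "writes_per_sec": writes,
        "transfers_per_sec": reads + writes,  # total IOPS is the sum of the two
    }


# Made-up counters taken 10 seconds apart.
first = DiskSample(timestamp_s=0.0, reads=10_000, writes=4_000)
second = DiskSample(timestamp_s=10.0, reads=11_200, writes=4_500)

rates = transfers_per_sec(first, second)
print(rates)  # 120 reads/sec + 50 writes/sec = 170 transfers/sec
```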
Now that was fun and all, but let’s get real: Single-disk configurations are uncommon in servers. As such, we’ll part ways with our Simple Jack single disk approach to storage and begin to look at more real-world multi-disk enterprise-class storage configurations. A discussion of IOPS in a multi-disk array is a great way to start. From a very elementary perspective, you can combine multiple hard drives together to aggregate their performance capabilities. For example, two 15k RPM disks working together to serve a workload could provide a theoretical 360 IOPS (180 + 180). This also scales out so ten 15k RPM disks could provide 1800 IOPS, and 100 15k RPM disks could provide 18,000 IOPS.
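The aggregation math is simple multiplication, as this small sketch shows. It uses the 180 IOPS working number for a 15k RPM disk from the examples above, and it deliberately ignores RAID and cache, which change the picture considerably.

```python
def aggregate_iops(disk_count: int, iops_per_disk: int) -> int:
    """Theoretical combined IOPS of identical disks, ignoring RAID and cache."""
    return disk_count * iops_per_disk


IOPS_15K = 180  # working number for a 15k RPM disk used in this post

for disks in (2, 10, 100):
    print(f"{disks:>3} disks: {aggregate_iops(disks, IOPS_15K):,} IOPS")
```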
Designing your environment so that your storage can deliver sufficient IOPS to the requesting workload is of utmost importance. If you are working on a storage design, arm yourself with data from perfmon, top, iostat, esxtop, and vscsiStats. I typically gather at least 24 hours of performance data from systems under normal conditions (a few days to a week may be good if you have varying business cycles) and take the 95th percentile as a starting point. So from a very simple approach, if your data and calculations show a 1800 IOPS demand at the 95th percentile, you ought to have at least ten 15k RPM disks (or twenty-three 7.2k RPM SATA disks) to achieve performance goals. It’s amazing how some simple data and a pretty little Excel spreadsheet can help you understand and justify the right hardware for the job.
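If you would rather script it than spreadsheet it, the sizing approach above might look something like this rough sketch. The sample values are made up, the percentile uses the simple nearest-rank method (your spreadsheet may interpolate differently), and the 80 IOPS figure for a 7.2k RPM disk is an assumption inferred from the twenty-three-disk example, not a published spec.

```python
import math


def percentile_nearest_rank(samples, pct: int):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def disks_needed(demand_iops: float, iops_per_disk: float) -> int:
    """Round up: a fraction of a disk still means buying a whole disk."""
    return math.ceil(demand_iops / iops_per_disk)


# Made-up IOPS samples standing in for collected performance data.
samples = [900, 950, 1000, 1100, 1200, 1250, 1300, 1350, 1400, 1450,
           1500, 1550, 1600, 1650, 1700, 1720, 1750, 1780, 1800, 2600]

demand = percentile_nearest_rank(samples, 95)
print(f"95th percentile demand: {demand} IOPS")
print(f"15k RPM disks needed:  {disks_needed(demand, 180)}")
print(f"7.2k RPM disks needed: {disks_needed(demand, 80)}")  # 80 IOPS is an assumption
```

Note how the single 2600 IOPS spike does not drive the design; that is the point of sizing to the 95th percentile rather than the peak.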
Now before you go and start filling out that PO form for a nice new storage system based on these numbers, there are a few more things we ought to discuss. RAID, cache, and advanced storage technologies will skew these numbers and need to be understood. Stay tuned to future articles in this series for more on those topics.
Finally, there has been a bunch of activity in the VMware ecosystem of vendors, bloggers, and twittering-type-folks around storage performance. As this here post sat in my drafts folder, Duncan Epping posted this gem of an article that pretty much included all of the content of this article, as well as future ones in my series: https://www.yellow-bricks.com/2009/12/23/iops/. Do yourself a favor and read his post and the comments from his readers – both are filled with a ton of great information, including some vendor-specific implementations.
I was led to Duncan’s article by a post by Chad Sakac on his blog: https://virtualgeek.typepad.com/virtual_geek/2009/12/whats-what-in-vmware-view-and-vdi-land.html. This is also a great read that covers some of the same information with a focus on VMware View/VDI and is also worth a few minutes of your time. Also check out https://vpivot.com/2009/09/18/storage-is-the-problem/ for a rubber-meets-the-road post from Scott Drummonds on the importance of storage performance vis-a-vis IOPS in a VMware-virtualized SQL environment.
- Storage Basics – Part I: An Introduction
- Storage Basics – Part II: IOPS
- Storage Basics – Part III: RAID
- Storage Basics – Part IV: Interface
- Storage Basics – Part V: Controllers, Cache and Coalescing
- Storage Basics – Part VI: Storage Workload Characterization
- Storage Basics – Part VII: Storage Alignment
- Storage Basics – Part VIII: The Difference in Consumer vs. Enterprise Class Disks and Storage Arrays; or ‘Why is the SAN you are proposing so darn expensive?’
- Storage Basics – Part IX: Alternate IOPS Formula
You missed something regarding block size and behavior, which will significantly impact the IOPS from time to time. Just my 2 cents.
Joshua Townsend says
Craig – good observation. Block size/workload characterization comes in the next installment as I move from the basics into real-world scenarios. Thanks for reading! -Josh
Josh – As a SMB VMware customer, this series really helped fill in some gaps in my understanding of storage.
The Yellow Bricks article was helpful too. Please keep up the great work!
Thanks Josh for an excellent series of posts, they have been most informative. Do I understand correctly that this ideal physical limit to IOPS applies when all I/O operations take place only on single sectors of the hard drive?
Any I/O greater than 512 bytes will mean additional seek time to complete that operation, as the heads will need to be repositioned. One would then also consider the transfer rate limitations of the storage unit
Must say these storage basics are very useful, and Duncan has his own way of simplifying things. Thanks for the post.
Carsten anker says
You say look at perfmon to see how many IOPS Windows is asking for.
Isn’t that number in fact what it gets, BASED on what the underlying storage will provide?