Archive for the ‘Storage’ Category
I am finishing up an installation of an EMC Clariion CX4 SAN. One of the final steps of the installation is to configure PowerPath/VE on the ESXi hosts. PowerPath/VE is EMC’s multipathing extension module for VMware (and Hyper-V), designed to replace the Native Multipathing Plugin (NMP) for increased I/O performance and failover management. To simplify and automate the installation of PowerPath/VE, I decided to use VMware Update Manager (VUM) to push the extension to the ESXi 4.x hosts in the environment.
The process of setting up an additional VUM patch repository to host PowerPath/VE (and other third-party extensions such as the Cisco Nexus 1000v) is pretty straightforward. Third-party extensions are supported in VUM beginning with vSphere 4.0 Update 1. Chad Sakac has posted a great video guide on YouTube that covers the setup:
I opted to use the Tomcat installation on the environment’s vCenter server to host the PowerPath/VE repository. To accomplish this, I simply created a new directory in the Tomcat root directory. The default path for the root directory on a vSphere vCenter Server is “C:\Program Files\VMware\Infrastructure\tomcat\webapps” (or C:\Program Files (x86)\VMware\Infrastructure\tomcat\webapps on a 64-bit installation).
I created a directory named ‘depot’ and within that directory created a PowerPathVE folder. I extracted the contents of the VUM folder from the PowerPath .zip file that I downloaded from http://powerlink.emc.com. A screenshot of the directory is below:
After creating the directory for the patch repository, I simply added an Extension Repository to VMware Update Manager as Chad shows in his video. I would like to call out one caveat: because vCenter may not listen on the standard HTTP/HTTPS ports, I used https://vcenter.domain.local:8443/depot/PowerPathVE/index.xml as the path to the source.
Once PowerPath was added to an Extension Baseline in VUM, I simply had to scan my hosts for updates and remediate. Installation of PowerPath/VE requires the host to be in Maintenance Mode and concludes with a reboot. Pretty simple.
Then all you have to do is fight through an overly-complex licensing setup (seriously, a 112 page PDF on how to install licenses???), a bit of configuration, and you are multi-pathing with the best of them. If you are interested in learning more about PowerPath/VE, start with this whitepaper: EMC PowerPath/VE for VMware vSphere Best Practices Planning. For a bit of real-world insight into the performance increase you might see with PowerPath/VE, check out this blog post from Eric Sloof: Massive I/O power increase using EMC PowerPath/VE.
We all know that virtualization allows us to do more with less. Fewer servers and space-saving storage (talk about an oxymoron) help us put some green in the datacenter and back in the budget. But with tight budgets demanding greater efficiency, virtualization pushing per-U-space utilization higher, and increasingly rack-dense equipment, proper planning of your physical plant remains an essential part of IT. I argue that right-sizing your power, cooling, and floor space is more critical now than it has ever been, and knowing how to do it is a darn good skill for a virtualization engineer to possess.
So along those lines… I was just doing some site-prep work for a new Clariion installation and noticed that the EMC Power Calculator has been updated. It is now a pretty slick little web app that can be found on the PowerLink site (login required) here: https://powerlink.emc.com/nsepn/webapps/powercalculator/Main.aspx.
While I am at it, here are some links to other power consumption calculators. Let me know if you have others and I will update this post:
- Dell: http://www.dell.com/calc
- IBM: http://www-03.ibm.com/systems/bladecenter/resources/powerconfig/index.html
- NetApp: Storage Efficiency Calculator here - http://www.secalc.com – it doesn’t calculate your consumption, just what you might save over a competitor’s offering.
- HP: http://h30099.www3.hp.com/configurator/powercalcs.asp
- Sun: http://www.sun.com/solutions/eco_innovation/powercalculators.jsp
- Hitachi/HDS: http://www.byhitachi.com/se/go/weight-and-power-calculator/
- APC: http://www.apc.com/prod_docs/results.cfm?DocType=Trade-Off%20Tool&Query_Type=10 and http://www.apcc.com/products/runtime_for_extendedruntime.cfm?upsfamily=165
- Emerson: Efficiency Calculator: http://www.emerson.com/edc/Calculator/default.aspx
- VMware ROI Calculator: http://vmware.com/go/calculator
- This site has a bunch of links to other calculators and resources: http://thegreenandvirtualdatacenter.com/calculator.html
There’s some fun and timely chatter happening right now on Twitter around power consumption and sizing – join in by following me at http://twitter.com/joshuatownsend/!
In Part I of this series, I discussed the importance of storage performance in a virtual environment (really any environment, virtual or not, where you want acceptable performance), and introduced some of the basic measures of a storage environment. In Part II, we will look more closely at what may be the most important storage design consideration in VMware server-consolidation environments, many SQL environments, and VDI environments, to name a few: IOPS.
If we stick with a single-disk-centric approach as we did in Part I, IOPS is quite simply a measure of how many read and write commands a disk can complete in a second. IOPS is an important measure of performance in a shared storage environment (such as VMware) and in high-transaction-rate workloads like SQL. Because hard drives are forced to abide by the laws of physics, the IOPS capabilities of a disk are consistent and predictable given a specific configuration. The formula for calculating IOPS for a given disk, with both latencies expressed in milliseconds, is pretty straightforward (please show your work):
IOPS = 1000/(Seek Latency + Rotational Latency)
Exact latencies vary by disk type, quality, number of platters, etc. You can look up the tech specs for most drives on the market. As an example, I have randomly chosen the technical specifications of the Seagate Cheetah 15K.7 SAS drive. This particular drive has the following performance characteristics:
- Average (rotational) latency: 2.0msec
- Average read seek (latency): 3.4msec
- Average write seek (latency): 3.9msec
Using the read latency number, the math works out like this:
IOPS = 1000 / (2.0 + 3.4) = 185 maximum read IOPS
The maximum write IOPS will be a bit less (1000 / (2.0 + 3.9), or about 169 IOPS) because of the higher write seek latency. Writing is more ‘expensive’ than reading and therefore slower.
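For anyone who would rather not do the arithmetic by hand, here is a minimal Python sketch of the same formula. The function name disk_iops is my own invention for illustration; the latency figures are the Cheetah numbers quoted above.

```python
def disk_iops(seek_ms: float, rotational_ms: float) -> float:
    """Theoretical maximum IOPS for a single disk.

    One I/O costs (seek + rotational) milliseconds on average,
    so 1000 ms divided by that cost gives I/Os per second.
    """
    return 1000.0 / (seek_ms + rotational_ms)


# Latencies from the drive spec quoted in the post (milliseconds).
read_iops = disk_iops(seek_ms=3.4, rotational_ms=2.0)
write_iops = disk_iops(seek_ms=3.9, rotational_ms=2.0)

print(f"read: {read_iops:.0f} IOPS, write: {write_iops:.0f} IOPS")
# prints "read: 185 IOPS, write: 169 IOPS"
```

Swap in the seek and rotational latencies from any other drive's spec sheet to get its theoretical ceiling.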
Fortunately, there are some widely accepted ‘working’ numbers, so you do not have to use this formula for each and every disk you might consider using. Because rotational latency is based on the rotational speed, we can use the published Rotations Per Minute (RPM) rating of the drive to guess-timate the IOPS capabilities. Typical spindle speeds (measured in RPM) and their equivalent IOPS are in the table below.
RPM       IOPS
7,200     80
10,000    130
15,000    180
SSD       2,500 – 6,000
While not a traditional spinning disk, I have also included Solid State Disks (SSDs) for reference as SSDs are starting to see increased market adoption. I have seen a wide range of sizing IOPS for SSDs depending on the technology and type (SLC, MLC, etc.). Check out http://en.wikipedia.org/wiki/Solid-state_drive for an introduction, and ask your vendors for more in-depth technical information.
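The reason RPM works as a shortcut for the spinning disks is that average rotational latency is simply the time for half a revolution. Here is a rough Python sketch of that relationship; the function name is hypothetical, and seek time is deliberately left out, so it only explains the rotational half of the formula.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


for rpm in (7200, 10000, 15000):
    print(f"{rpm} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

At 15,000 RPM this works out to 2.0 ms, which matches the average latency on the Cheetah spec sheet used earlier.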
If you are brand-new to this (and you are still reading, congrats!), you can see how many IOPS your Windows computer is asking for by opening Performance Monitor and looking at the ‘Disk Transfers/sec’ counter under Physical Disk. This is a sum of the ‘Disk Reads/sec’ and ‘Disk Writes/sec’ counters as you can see in the screenshot below:
If you are after some stats for your VMware ESX environment, check out esxtop and look for CMDS/s in the output. I published a couple of articles on using esxtop here and here. The numbers from PerfMon and esxtop get you pretty close but can be skewed by a few things we’ll discuss in later posts.
Now that was fun and all, but let’s get real: single-disk configurations are uncommon in servers. As such, we’ll part ways with our Simple Jack single-disk approach to storage and begin to look at more real-world multi-disk enterprise-class storage configurations. A discussion of IOPS in a multi-disk array is a great way to start. From a very elementary perspective, you can combine multiple hard drives together to aggregate their performance capabilities. For example, two 15k RPM disks working together to serve a workload could provide a theoretical 360 IOPS (180 + 180). This also scales out, so ten 15k RPM disks could provide 1,800 IOPS, and 100 15k RPM disks could provide 18,000 IOPS.
Designing your environment so that your storage can deliver sufficient IOPS to the requesting workload is of utmost importance. If you are working on a storage design, arm yourself with data from perfmon, top, iostat, esxtop, and vscsiStats. I typically gather at least 24 hours of performance data from systems under normal conditions (a few days to a week may be good if you have varying business cycles) and take the 95th percentile as a starting point. So from a very simple approach, if your data and calculations show an 1,800 IOPS demand at the 95th percentile, you ought to have at least ten 15k RPM disks (1,800 / 180) or twenty-three 7.2k RPM SATA disks (1,800 / 80 = 22.5, rounded up) to achieve performance goals. It’s amazing how some simple data and a pretty little Excel spreadsheet can help you understand and justify the right hardware for the job.
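If Excel is not your thing, the same back-of-the-napkin sizing can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions: the function names are mine, the percentile uses the simple nearest-rank method (your spreadsheet may interpolate differently), and the per-disk figures are the rule-of-thumb numbers from the table above. RAID and cache are ignored here.

```python
import math

# Rule-of-thumb per-disk IOPS from the table earlier in this post.
RULE_OF_THUMB_IOPS = {"7.2k": 80, "10k": 130, "15k": 180}


def percentile_95(samples):
    """Nearest-rank 95th percentile of a list of IOPS samples."""
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def disks_needed(demand_iops, per_disk_iops):
    """Smallest whole number of disks whose combined IOPS meets demand."""
    return math.ceil(demand_iops / per_disk_iops)


# The 1,800 IOPS demand used as the example in this post.
print(disks_needed(1800, RULE_OF_THUMB_IOPS["15k"]))   # prints 10
print(disks_needed(1800, RULE_OF_THUMB_IOPS["7.2k"]))  # prints 23
```

In practice you would feed percentile_95 the ‘Disk Transfers/sec’ or CMDS/s samples you collected and pass the result to disks_needed.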
Now, before you go and start filling out that PO form for a nice new storage system based on these numbers, there are a few more things we ought to discuss. RAID, cache, and advanced storage technologies will skew these numbers and need to be understood. Stay tuned for future articles in this series that cover those topics and more.
Finally, there has been a bunch of activity in the VMware ecosystem of vendors, bloggers, and twittering-type-folks around storage performance. As this here post sat in my drafts folder, Duncan Epping posted this gem of an article that pretty much included all of the content of this article, as well as future ones in my series: http://www.yellow-bricks.com/2009/12/23/iops/. Do yourself a favor and read his post and the comments from his readers – both are filled with a ton of great information, including some vendor-specific implementations.
I was led to Duncan’s article by a post by Chad Sakac on his blog: http://virtualgeek.typepad.com/virtual_geek/2009/12/whats-what-in-vmware-view-and-vdi-land.html. This is also a great read that covers some of the same information with a focus on VMware View/VDI and is also worth a few minutes of your time. Also check out http://vpivot.com/2009/09/18/storage-is-the-problem/ for a rubber-meets-the-road post from Scott Drummonds on the importance of storage performance vis-a-vis IOPS in a VMware-virtualized SQL environment.
I am increasingly finding that both my SMB and Enterprise customers are uneducated on the fundamentals of storage sizing and performance. As a result, storage is often overlooked as a performance bottleneck despite it being a vital component to consider in a virtualization implementation. Storage will only increase in importance as hosts are getting bigger, data volumes increase, and more workloads are virtualized. For some reason, most people can grasp the importance of CPU and memory performance constraints but storage performance is often overlooked and can be hard to explain to business users or executives.
Case in point: I have recently been called into some environments that were not performing well. These environments happened to be running Microsoft SQL, but could just as well have been running any application or collection of virtual machines. Fingers were being pointed in all directions: at applications, at the virtualization layer, at a lack of memory, and DBAs were insisting that there were too few CPUs. The situation was getting political and emotional when I walked into it. A few minutes with Windows Perfmon was all I needed to identify storage performance as the root cause of the firestorm that had been ignited. Using a bit of data, I was able to turn the discussion from an emotional fight to a simple problem of physics and mathematics (and a bit of simple math could have avoided the problem in the first place).
I have seen this play out a few too many times and so decided to write up this multi-part series on the basics of storage with a focus on storage performance. That said, a little math and physics is where we will start as we look at the basic building block of a storage environment: a hard disk drive. Wikipedia defines a hard disk drive as “a non-volatile storage device that stores digitally encoded data on rapidly rotating platters with magnetic surfaces.” Your computer, server, or VMware cluster uses hard drives to read and write data. Wikipedia also covers the history and atomic structure of a hard drive pretty well. For our purposes, the takeaway is that hard drives are physical objects, and as such, follow the laws of physics (duh) in the following measurable ways:
1.) Capacity, which is measured in bits or bytes and multiples thereof (MB, GB, TB, PB). This is how much data will fit on your disk, from simple text files to virtual disks, and everything in between. For example, if you have a 500GB SQL database, you darn well better have a hard drive that has a capacity of at least 500GB. This is a pretty simple concept, so I’ll leave it there for now.
2.) Performance, which is measured in a couple ways:
- at the disk itself in Input/Output Operations Per Second (IOPS) – a measure of how many read and write commands a disk can complete in a second
- interface throughput, measured in MBps or Gbps – a measure of the peak rate that a volume of data can be read from or written to disk
- latency – the amount of time between when you ask a disk (or storage system if you want to read ahead) to do something and when it can actually do it, very closely related to IOPS as you’ll read in a forthcoming article in this series.
Each disk, array, and storage system has its own fixed set of measurements given a specific configuration. Knowing the physical capabilities of your storage system as measured in the above ways, along with your systems’ storage requirements, will go a long way towards a successful design and implementation of your storage environment. The remaining parts of this series will take a look at these performance characteristics a bit more in-depth and explain what happens as you introduce factors like RAID, cache, data reduction techniques such as snapshots and deduplication, and varying workloads.
Please keep in mind that while I have designed and implemented a variety of DAS, NAS, and SAN technologies from a host of vendors including Dell, EMC, IBM, and NetApp, I am by no means a storage expert. The information I will provide is generalized, over-simplified, and does not consider varying approaches from different storage vendors. Nonetheless, I hope you find this useful information if you are designing a solution, troubleshooting a performance issue or preparing to make a storage purchase.
I have been pulling my hair out with a small VI3 implementation running against an IBM DS3300 iSCSI array. Performance, for lack of a better term, sucked. Granted, the DS3300 is not an enterprise-level workhorse of a storage system, but it fit the budget. Read performance was decent from the array, but write performance was terrible, maxing out at 10Mbps throughput with insanely high latencies on long writes when the system was under load. This led to some long P2V operations, poor guest performance, and some questions from the project sponsors on why I couldn’t make the environment sing.
The system was configured with a single controller with dual GigE NICs. The controller had 512MB of battery-backed cache (there is also a 1GB cache upgrade option available). I wrote off some of the poor performance to a single controller with a less-than-optimal amount of cache; blamed the SAS controller to SATA disk command translation overhead; cringed at the 6-disk RAID5 configuration; and engaged in some self-doubt. I convinced the powers that be that we were IO constrained and got some funds to fill out the 3U chassis to a full 12 SATA disks, and reconfigured the array as a RAID10. Performance gains were almost unnoticeable with these changes. In addition, I did some basic troubleshooting of the network environment, verifying multiple paths to the storage, setting Flow Control on the switches to receive only, and double-checking my iSCSI initiator settings. Note: The DS3300 is only supported with the ESX software initiator. I found documentation on the DS3300 to be lacking, but did discover that the Dell MD3000i is based on the same LSI Engenio array. Some Googling on the Dell solution led me to the ‘SMcli’ command line interface for both arrays. The commands are slightly different for the Dell and IBM. The links to the IBM CLI documentation were broken, so I had to do a bit of trial and error to get the commands right. I used the Dell documentation as a starting point. (Rant: Seriously, IBM? Can you make your documentation any harder to get through? Is it a Redbook, an engineering whitepaper, a support document, or a case study? Why can I only find these with complex Google searches and not on your own product pages? And why can’t you name your documents intelligently instead of with some random string of characters?)
Moving on… I received an automated alert from the DS3300 about an incomplete battery learn cycle. Using the IBM Storage Manager GUI, I generated a ‘Storage Subsystem Profile’ from the Support tab to check the battery status. In the profile I discovered that while write cache was enabled, it had a status of “Enabled (Suspended)”. Ah ha! Now I’ve got some decent Google material that led me to this: http://communities.vmware.com/thread/195838. Hot damn I love the VMware Community Forums!
It turns out that in a single-controller configuration the setting for cache mirroring remains enabled by default. Because there is no second controller to mirror to, the array suspends write caching. This is probably a safety measure: without a second controller, data in cache is at risk should the only controller fail. I weighed my options and decided that the poor performance I was experiencing outweighed the HA concerns, so I disabled cache mirroring, which lets write caching resume, using this command:
c:\program files\ibm_ds4000\client>smcli -n <ARRAYNAME> -c "set allLogicalDrives mirrorEnabled=false;"
And then followed with this for good measure:
c:\program files\ibm_ds4000\client>smcli -n <ARRAYNAME> -p <arraypassword> -c "set allLogicalDrives writeCacheEnabled=true;"
The results were immediately noticeable:
The screen shot is from Veeam Monitor Free Edition, taken during 4 concurrent V2V operations from Hyper-V to VMware. With the write cache fully functional, disk usage peaked at 54MBps, latency dropped to about 6ms, and my blood pressure dropped a few notches.
While poking around the CLI I also found that you can dump performance stats from the array (performance is otherwise hard to find on the thing) using this command:
C:\Program Files\IBM_DS4000\client>smcli -n <ARRAYNAME> -c "set session performanceMonitorInterval=5 performanceMonitorIterations=120;save storageSubsystem performanceStats file=\"c:\\ds3300perfstats.csv\";"
This will give you a 10 minute record of performance from the array which you can analyze using Excel. The Dell Enterprise Center TechCenter Wiki has a great write-up on how to efficiently analyze the data from this command here: http://www.delltechcenter.com/page/MD3000i+Performance+Monitoring, complete with a YouTube video that walks you through the process:
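If you want a longer or shorter capture, the two numbers to change are the interval (in seconds) and the iteration count. Below is a small Python sketch that assembles the script string for the -c option and works out how long the capture will run. The helper names are hypothetical, it only uses the script parameters shown in the command above, and it does not run SMcli or handle the extra quote escaping your shell needs.

```python
def perf_capture_script(interval_s: int, iterations: int, csv_path: str) -> str:
    """Build the script text passed to SMcli's -c option.

    Backslashes in the output path are doubled, as in the command above.
    Shell-level quoting of the whole string is left to the caller.
    """
    escaped_path = csv_path.replace("\\", "\\\\")
    return (
        f"set session performanceMonitorInterval={interval_s} "
        f"performanceMonitorIterations={iterations};"
        f'save storageSubsystem performanceStats file="{escaped_path}";'
    )


def capture_duration_minutes(interval_s: int, iterations: int) -> float:
    """Total capture length: one sample every interval_s seconds."""
    return interval_s * iterations / 60


print(capture_duration_minutes(5, 120))  # prints 10.0
print(perf_capture_script(5, 120, r"c:\ds3300perfstats.csv"))
```

With the values from the post (5 seconds, 120 iterations) this confirms the 10 minute window; doubling the iterations would give you 20 minutes at the same granularity.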
I am beginning to think that the DS3300 (and MD3000i) may actually be a viable starter solution for SMBs starting out on a virtualization project. But I would recommend the cache upgrade, a second controller, SAS disks instead of SATA to eliminate the SAS-to-SATA translation overhead, and a larger number of faster disks instead of fewer slower disks so you can drive throughput and IOPS to a higher level.
Have any of you deployed the DS3300 or MD3000i (or the generic LSI solution)? Do you have any performance tuning tips for these arrays? If so, share in the comments!





