IBM DS3300 iSCSI Write Performance Solved

I have been pulling my hair out with a small VI3 implementation running against an IBM DS3300 iSCSI array.  Performance, for lack of a better term, sucked.  Granted, the DS3300 is not an enterprise-level workhorse of a storage system, but it fit the budget.  Read performance from the array was decent, but write performance was terrible, maxing out at 10Mbps throughput with insanely high latencies on long writes when the system was under load.  This led to some long P2V operations, poor guest performance, and some questions from the project sponsors about why I couldn’t make the environment sing.

The system was configured with a single controller with dual GigE NICs.  The controller had 512MB of battery-backed cache (there is also a 1GB cache upgrade option available).  I wrote off some of the poor performance to a single controller with a less-than-optimal amount of cache; blamed the SAS controller to SATA disk command translation overhead; cringed at the 6 disk RAID5 configuration; and engaged in some self-doubt.  I convinced the powers that be that we were IO constrained and got some funds to fill out the 3U chassis to a full 12 SATA disks, and reconfigured the array as a RAID10.  Performance gains were almost unnoticeable with these changes.

In addition, I did some basic troubleshooting of the network environment: verifying multiple paths to the storage, setting Flow Control on the switches to receive only, and double-checking my iSCSI initiator settings.  Note: The DS3300 is only supported with the ESX software initiator.

I found documentation on the DS3300 to be lacking, but did discover that the Dell MD3000i is based on the same LSI Engenio array.  Some Googling on the Dell solution led to the ‘SMcli’ command line interface for both arrays.  The commands are slightly different for the Dell and IBM.  The links to the IBM CLI documentation were broken, so I had to do a bit of trial and error to get the commands right.  I used the Dell documentation as a starting point.  (Rant: Seriously, IBM?  Can you make your documentation any harder to get through – is it a Redbook, is it an Engineering Whitepaper, is it a support document, is it a case study – and why can I only find these with complex Google searches, not on your own product pages, and why can’t you name your documents intelligently, not with some random string of characters?)

Update: The IBM System Storage DS3000, DS4000, and DS5000 Command Line Interface and Script Commands Programming Guide is here: DS3k4k5kCLIreference, SMCLI

Moving on… I received an automated alert from the DS3300 about an incomplete battery learn cycle.  Using the IBM Storage Manager GUI I generated a ‘Storage Subsystem Profile’ from the Support tab to check the battery status.  In the profile I discovered that while write cache was enabled, it had a status of “Enabled (Suspended)”.  Ah ha!  Now I’ve got some decent Google material that led me to this: http://communities.vmware.com/thread/195838.  Hot damn I love the VMware Community Forums!
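If you would rather not read the whole profile by eye, the saved profile text can be scanned for the suspended status. What follows is a minimal Python sketch, not an IBM tool. The function name `find_suspended_write_cache` is made up for illustration, and it assumes you have already saved the profile to a text file and that the status sits on a line mentioning both "Write cache" and "suspended"; the exact wording may differ between firmware versions.

```python
def find_suspended_write_cache(profile_text):
    """Return the profile lines that report a suspended write cache.

    Assumption (not from IBM documentation): the status appears on a line
    that mentions both "write cache" and "suspended", for example
    "Write cache: Enabled (currently suspended)".
    """
    hits = []
    for line in profile_text.splitlines():
        lowered = line.lower()
        if "write cache" in lowered and "suspended" in lowered:
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    # Hypothetical excerpt of a saved profile, for demonstration only.
    sample = (
        "Read cache: Enabled\n"
        "Write cache: Enabled (currently suspended)\n"
    )
    for hit in find_suspended_write_cache(sample):
        print("WARNING:", hit)
```

An empty result only means no matching line was found, so it is worth confirming against the GUI at least once that your profile uses the wording the sketch expects.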

It turns out that in a single-controller configuration the setting for cache mirroring remains enabled by default.  Because there is no 2nd controller to mirror to, the array suspends write caching.  This is probably a safety thing – loss of high availability on the controllers puts data in cache at risk should the only controller fail.  I weighed my options and decided that the poor performance I was experiencing outweighed my HA concerns, so I disabled cache mirroring, which lets the array resume write caching, using this command:

C:\Program Files\IBM_DS4000\client>smcli -n <ARRAYNAME> -c "set allLogicalDrives mirrorEnabled=false;"

And then followed with this for good measure:

C:\Program Files\IBM_DS4000\client>smcli -n <ARRAYNAME> -p <arraypassword> -c "set allLogicalDrives writeCacheEnabled=true;"
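If you need to repeat this on more than one array, the two calls are easy to script. Here is a minimal Python sketch that only builds the argument lists and prints them; it does not run anything. The helper name `build_smcli_args`, the array name `MYARRAY`, and the password `secret` are placeholders I made up. The only SMcli options it uses are the ones shown in this post (`-n`, `-p`, `-c`).

```python
def build_smcli_args(array_name, script, password=None, smcli="smcli"):
    """Build the argument list for one SMcli call (does not execute it)."""
    args = [smcli, "-n", array_name]
    if password is not None:
        # -p is only added when a password is supplied.
        args += ["-p", password]
    args += ["-c", script]
    return args


# The two script commands from this post, in the order I ran them.
COMMANDS = [
    "set allLogicalDrives mirrorEnabled=false;",
    "set allLogicalDrives writeCacheEnabled=true;",
]

if __name__ == "__main__":
    for script in COMMANDS:
        # Printed for review only. Hand the list to subprocess.run() yourself
        # once you have accepted the single-controller risk.
        print(build_smcli_args("MYARRAY", script, password="secret"))
```

Passing the arguments as a list avoids the nested-quote problems you hit when typing the same command at a Windows prompt.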

The results were immediately noticeable:

DS3300 Performance Improvement when Write Cache is Enabled

The screen shot is from Veeam Monitor Free Edition, taken during 4 concurrent V2V operations from Hyper-V to VMware.  With the write cache fully functional, disk usage peaked at 54MBps, latency dropped to about 6ms, and my blood pressure dropped a few notches.

While poking around the CLI I also found that you can dump performance stats from the array (performance is otherwise hard to find on the thing) using this command:

C:\Program Files\IBM_DS4000\client>smcli -n <ARRAYNAME> -c "set session performanceMonitorInterval=5 performanceMonitorIterations=120;save storageSubsystem performanceStats file=\"c:\ds3300perfstats.csv\";"

This will give you a 10 minute record of performance from the array (120 samples at 5 second intervals), which you can analyze using Excel.  The Dell Enterprise Center TechCenter Wiki has a great write-up on how to efficiently analyze the data from this command here: http://www.delltechcenter.com/page/MD3000i+Performance+Monitoring, complete with a YouTube video that walks you through the process.
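The length of the capture is just the monitor interval multiplied by the number of iterations, so you can size it to whatever window you want to observe. Below is a small Python sketch of that arithmetic. The function name `perf_capture` is hypothetical, and the script string simply mirrors the command above; I have not verified it against other firmware levels.

```python
def perf_capture(interval_s, iterations, csv_path):
    """Return the SMcli script string and the capture length in minutes."""
    script = (
        f"set session performanceMonitorInterval={interval_s} "
        f"performanceMonitorIterations={iterations};"
        f'save storageSubsystem performanceStats file="{csv_path}";'
    )
    # Total duration = seconds per sample * number of samples.
    minutes = interval_s * iterations / 60
    return script, minutes


if __name__ == "__main__":
    script, minutes = perf_capture(5, 120, r"c:\ds3300perfstats.csv")
    print(script)
    print(f"Capture length: {minutes} minutes")
```

For example, keeping the 5 second interval and raising the iterations to 720 would stretch the capture to an hour.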

I am beginning to think that the DS3300 (and MD3000i) may actually be a viable starter solution for SMBs starting out on a virtualization project.  But I would recommend the cache upgrade, a 2nd controller, SAS disks instead of SATA to eliminate the SAS-to-SATA translation overhead, and more, faster disks instead of fewer, slower disks so you can drive throughput and IOPS to a higher level.

Have any of you deployed the DS3300 or MD3000i (or the generic LSI solution)?  Do you have any performance tuning tips for these arrays?  If so, share in the comments!

Comments

  1. Vladimir says:

    Hi Joshua! You wrote a perfect article! It has very much helped me.

  2. Switchgott says:

    Hi,
    thanks for your great article,
    but what about performance problem with dual controller?

    • @Switchgott – There are several areas for you to consider in your troubleshooting:

      1.) Have you reached the max performance of your unit and workload? That is, with your current disk type, disk count, and RAID configuration, have you reached max load with your application profile (random read, sequential write, etc.)? If so, consider changing disk type, adding disks, or changing RAID type to better match your requirements. Use the SMCLI to capture performance stats and compare what you see to industry standard published numbers for IOPS under specific workloads.

      2.) Write cache could still be disabled on your dual-controller unit. Use the SMCLI to determine if write caching is suspended – this could happen with an incomplete battery learn cycle, for example.

      3.) Do you have a configuration error? Perhaps Jumbo Frames are enabled on the array, but not through the rest of the architecture (network switches, servers, etc.). Poor quality switches, oversubscribed switches, incorrect flow control settings, poor quality iSCSI initiators, multi-pathing errors, etc. could all cause problems on your system.

      I hope this helps – feel free to post back a reply if you have more questions and I’ll do my best to help.

      Josh

  3. Mark Breaux says:

    Do you set flow control on the switch, vSphere 4, and the IBM DS3300?

    Thanks,
    Mark

    • Mark – I’m working from memory here, but as I recall, best practice is to set flow control to Rx only on the switch. The DS3300 should detect that the switch is receiving and will auto-set to Tx (and I think Rx but I couldn’t find the documentation on this tonight). Flow control on ESX should be auto-negotiated by default, so it too should also transmit. Hope this helps. Feel free to post back with additional questions or any answers that you dig up in your own research.

      Josh

  4. You’re gambling with your data integrity if you enable write cache on a single controller model this way. If any component fails on the single controller, you’ve told the OS that the data has been committed, but it’s only in cache. That’s OK if the cache can be moved to a replacement controller without disconnecting the battery, since the replacement controller can then commit it to the disks, but the DS3300 cache can’t be transported with the battery attached, AFAIK.

    I agree about IBM’s documentation, you can only find the info if you call it a FastT rather than DS3300.

    • Right you are, Andy! There certainly is risk in over-riding the default setting in a single-controller configuration. I weighed the risk of corruption against performance and determined that in my specific use case the risk was acceptable. I encourage all readers to weigh the risk in their environment before making the changes I highlighted.

      If the risk is unacceptable, you should add a 2nd controller (DS3300 Controller Upgrade 1726 HC3 Feature code: 4925) or accept slower-than expected writes. If you are upgrading, you may also consider upgrading the 512MB cache in each controller to 1GB (DS3000 1GB Cache Memory Upgrade Feature Code: 4838 [$929.00 each]). All available parts/upgrades for the DS3300 can be found here: http://www-01.ibm.com/cgi-bin/common/ssi/ssialias?infotype=an&subtype=ca&htmlfid=897/ENUS107-517&appname=isource&language=enus

  6. As an ex-IBMer and current IBM Business Partner I would have to agree with you on the documentation aspect of finding things within IBM.com. Thought I might share this link with you which I often share with my customers and others which can make it a bit easier to find what you are looking for within IBM.

    http://www.ibmquicklinks.com/

    The IBM website is just sooo huge and the search functionality very hit or miss, but this is a good aggregation of main links.
    Also, I’d recommend contacting a skilled IBM Business Partner or two when you run into issues like this, as they are usually more eager to consult customers like yourselves and help them through issues which they’ve more than likely run into multiple times in the past and may have an easy answer to based on experience. We often do this for no fee to show our value add and hopefully earn your business in the future.
    While trolling the net to try and find answer may be fun at times, it’s probably not the most efficient use of your time. Let me do the trolling for you, if I don’t have the answer already : )

  7. Firstly this is a great article! I have followed your article and executed the commands via the SMCli. I ran SQLIO tests before and after and actually notice quite a difference. My question is that I have only 1 controller (512Mb cache) on my DS3300 and I lose that controller, i.e it fails, how can the cache be written to the disk if the controller is unavailable? I am finding it hard to see how having the setting at the default is that much of a risk? I have redundancy at the RAID level which is fine but surely just purchasing the one controller exposes you to some risk anyhow? From what you have written here, because there is no 2nd controller to mirror to, the array suspends write caching, well that’s obvious?

    I’m failing to see why you would want to enable mirrorenabled when you have no second controller?

    Cheers

    • Jerry –

      Good questions! RAID will only protect the data that is already on disk, not data that is in the cache waiting to be written to disk. The risk comes in when you are writing some changes to disk (say to a SQL DB) and only some of the blocks are written to disk while the remaining writes are still in cache. If the single controller dies, the data in cache does not get flushed to disk and the SQL DB is inconsistent/corrupt. The controllers in the DS3300 have battery-backed cache, so a power loss to the array should not be an issue, as the contents of the cache will be held for as long as the battery can support it, and written to disk once power is restored and disks are spinning. The big risk is a catastrophic failure of the controller, but in that case you probably have bigger issues to worry about (like rebuilding your RAID sets) and/or a DR situation.

      In a test/dev environment, I personally see no problem with disabling cache mirroring to enable write caching. In a production environment, I would think twice before accepting the risk, make sure I have good monitoring, and argue long and hard for a second controller for proper redundancy.

  9. Leandro Cruz (Brazil) says:

    Josh,

    This simply saved my life! When I checked my MD3000i settings I realized that some LUNs had the status below:

    Read cache: Enabled
    Write cache: Enabled (currently suspended)

    I had to set writeCacheEnabled to FALSE before setting to TRUE. The performance changed dramatically on the fly. I had a heavy Oracle write operation going on and svctm (iostat -xm 2) dropped from 20ms to near 0ms. Bandwidth went from 79Mbps to an avg of 150Mbps. The overall application performance is now at least 10 times better.

    Cheers,

    Leandro Cruz

  11. Hi Joshua,
    Very nice post.
    What is your recommendation for a dual controller setup?
    I have dual 512MB controllers and I’m getting ~30MBps on average with high latency.
    I have 12 disks in raid 6 and ~30 VMs.

    Thanks

  13. Pradeep Goonetillake says:

    Thanks a lot for this article. One of my customers had a performance issue when deleting snapshots (using Veeam) on an IBM DS3300. I was pulling my hair out to see why this was happening. He had a single-controller unit. I fixed the problem by disabling the write suspension.

    Thanks

    Pradeep

  15. This article was very helpful in solving exactly the same problem on my single-controller DS3300; write speed went from 7 Mbps to 60 Mbps.
    Even IBM official support used this article to figure out the issue.

    I would like to add a couple of notes for future readers.
    The write cache can only be enabled if the battery is in full health.
    I experienced the same battery problem as Joshua (incomplete battery learn cycle error). Only after I replaced the battery and waited 24 hours for it to charge
    was I able to run the script successfully.

    First run:
    set allLogicalDrives mirrorCacheEnabled=false;

    Second run:
    set logicalDrive ["lun_name"] writeCacheEnabled=true;

    After the script runs, the settings should look like this:

    Read cache: Enabled
    Write cache: Enabled
    Write cache without batteries: Disabled
    Write cache with mirroring: Disabled
    Flush write cache after (in seconds): 10.00
    Dynamic cache read prefetch: Enabled

  17. Hi Joshua, I’m just wondering: with your config, are you running different subnets for each controller, and have you got jumbo frames enabled? I have been trying to find the best configuration and have managed some pretty high numbers, but I’m always on the lookout for how everyone else has set the units up.

    Thanks
    Ant

  19. Just wanted to say thanks for the Article – seems some of the ‘tuning’ that you suggest certainly helps in the small VMware environment I run. 2 Dell PowerEdge 1950 G3s connected through SAS to MD3000.

    For a small environment – it seems that SAS is easier to deal with, not another network to setup, no extra switches and the headaches that go with tuning the iSCSI settings.

    • Thanks for the comment, Andrew. You are correct that direct attached is easier, but it does not scale well. Once you get a few ESX/ESXi hosts and want to start doing vMotion, DRS, HA, etc. you have to get into SAN or NAS storage.

  21. New to the IBM DS3200. How do I see the settings on my unit? What is the command to show the settings, which I would input in the script window?

  23. Hi,

    having the same issue as described on a single-controller DS3200 box. Except for the SAS controller, it should be exactly the same hardware.
    However, in my case, all of the performance “tweaking” has no effect whatsoever. As soon as there’s a SATA disk (officially supported IBM disks, of course) inside the unit, write performance drops below 10 MBytes/sec. The unit has been at IBM for about 2 weeks now, and all they could come up with is: “It seems that the controller of the DS3200 is not able to handle SATA disks correctly, and there’s nothing to do about it. With SAS disks and 7200 rpm NL SAS disks it’s working fine.” Guess which disks we have… so beware, everybody.

  25. Thanks a lot!!! It worked perfect for me!

  27. Thankyou for this Joshua, it really helped me!

  29. What is the maximum read/write speed of an IBM DS3300 over an iSCSI connection to the host?

    • There are too many variables to give a fixed answer. It really depends on your workload (read vs. write, random vs. sequential, small vs. big), your multipathing setup, your use of cache, the type of disks you have, the number of disks you have, etc.

  31. Thanks much for this article Josh. It’s been a big help in monitoring and performance tuning our ds3300. I’m having a hard time finding the command line interface. Do you remember, was it a separate download, or packaged with something else? Do you happen to have the url for the download?

  33. Hi Josh, thanks for a great article. Do you happen to have the url where you downloaded the DS Command Line Interface? Is it a separate download or packaged with something else? I can’t seem to find it anywhere.

  35. Thank you a lot, Joshua!

    Your article helped me to solve this issue with a single-controller DS3512 as well. I wonder why IBM still doesn’t have this in their documentation.

    • Thanks, Dimitry. Glad this article is still helping people years after I wrote it. Would be nice if IBM, Dell and others would 1.) stop selling the arrays with single controllers, and 2.) document what happens when you go with a single controller.

  37. With the data from SMcli.exe and Logparse.dll, you can put together basic monitoring.

  39. Thanks, great article. I have an md3000i with the write cache in the suspended state after an incomplete battery learn cycle was reported on one of the controllers, though the batteries are now “optimal”, and just wanted to check if it’s safe to run these commands in a live environment.

    Your response would be much appreciated.

  40. Hi Joshua,

    Your article is still helping people! I actually already had all of the write-cache enabled, but it’s nice to have a copy of the commands in case a battery dies and disables the write cache.

    Do you know if there is a command to check the battery cache settings? I’m using Icinga (Nagios) to monitor my DS3300’s (http://exchange.nagios.org/directory/Plugins/Hardware/Storage-Systems/SAN-and-NAS/check_IBM_performance/details) and if there was a way to script checking if the write cache was enabled or not, WOW then I’d know right away if the cache got disabled and wouldn’t have to check for what’s causing poor performance.

    Thanks again for the post on this!

    Darhl

    • Darhl – glad it is still helpful. Good thought on using Icinga to automate this – I don’t have a DS3300 any more, so I’m afraid I can’t try it. If you come up with a solution feel free to share in the comments and maybe someone else can benefit!

      – Josh

Trackbacks

  1. […] of my most popular posts to date had been IBM DS3300 Write Performance Problem Solved.  I am pleased to have upgrade my internal environment to an EMC Clariion CX4 array, but still […]

  3. […] My last post described a problem I experienced with VMware HA after upgrading to vSphere 4.1.  Here is my experience with a similar issue after applying the ESXi410-201010401-SG patch to one of my test/dev ESXi clusters.  The patch, released on November 15th and weighing in at a hefty 212MB, fixes a number of issues from Likewise authentication on ESXi hosts to allowing configurable NOOP timout and interval values for faster failover of certain iSCSI arrays (like the DS3300 or MD3000i). […]

  4. […] most popular post to date has been: IBM DS3300 iSCSI Write Performance Solved. I’m glad this has been useful for so many, but I hope that you don’t just apply the […]

  5. […] resulting in better performance as measured in both IOPS and throughput.  This graph from my article on troubleshooting write performance on an IBM DS3300 iSCSI array shows how throughput increased and latency decreased when enabling write cache.  The extent to […]
