Storage Basics – Part I: An Introduction

I am increasingly finding that both my SMB and Enterprise customers are unfamiliar with the fundamentals of storage sizing and performance.  As a result, storage is often overlooked as a performance bottleneck, even though it is a vital component of any virtualization implementation.  Storage will only grow in importance as hosts get bigger, data volumes increase, and more workloads are virtualized.  For some reason, most people can grasp CPU and memory performance constraints, but storage performance is often overlooked and can be hard to explain to business users or executives.

Case in point: I was recently called into some environments that were not performing well.  These environments happened to be running Microsoft SQL Server, but could just as well have been running any application or collection of virtual machines.  Fingers were being pointed in all directions: at applications, at the virtualization layer, at a lack of memory, and the DBAs were insisting that there were too few CPUs.  The situation was getting political and emotional when I walked into it.  A few minutes with Windows Perfmon was all I needed to identify storage performance as the root cause of the firestorm that had been ignited.  Using a bit of data, I was able to turn the discussion from an emotional fight into a simple problem of physics and mathematics (and a bit of simple math could have avoided the problem in the first place).

I have seen this play out a few too many times, so I decided to write up this multi-part series on the basics of storage with a focus on storage performance.  A little math and physics is where we will start as we look at the basic building block of a storage environment: a hard disk drive.  Wikipedia defines a hard disk drive as “a non-volatile storage device that stores digitally encoded data on rapidly rotating platters with magnetic surfaces.” Your computer, server, or VMware cluster uses hard drives to read and write data.  Wikipedia also covers the history and physical structure of a hard drive pretty well.  For our purposes, the takeaway is that hard drives are physical objects and, as such, follow the laws of physics (duh) in the following measurable ways:

1.) Capacity, which is measured in bits or bytes and multiples thereof (MB, GB, TB, PB).  This is how much data will fit on your disk, from simple text files to virtual disks, and everything in between.  For example, if you have a 500GB SQL database, you darn well better have a hard drive with a capacity of at least 500GB.  This is a pretty simple concept, so I’ll leave it there for now.

2.) Performance, which is measured in a few ways:

- at the disk itself, in Input/Output Operations Per Second (IOPS): a measure of how many read and write commands a disk can complete in a second

- interface throughput, measured in MBps or Gbps: the peak rate at which data can be read from or written to disk

- latency: the amount of time between when you ask a disk (or a storage system, if you want to read ahead) to do something and when it actually does it.  Latency is very closely related to IOPS, as you’ll read in a forthcoming article in this series.
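To make the relationship between these three measurements a bit more concrete, here is a minimal Python sketch.  It leans on two general rules of thumb rather than anything vendor-specific: throughput is roughly IOPS multiplied by the size of each I/O, and (by Little's Law) sustained IOPS is roughly the number of outstanding I/Os divided by the average latency.  The function names and the sample numbers are hypothetical, picked only for illustration.

```python
def throughput_mbps(iops: float, io_size_kb: float) -> float:
    """Approximate throughput in MB/s for an IOPS rate and I/O size (1 MB = 1000 KB)."""
    return iops * io_size_kb / 1000.0


def iops_from_latency(avg_latency_ms: float, outstanding_ios: int = 1) -> float:
    """Approximate sustained IOPS via Little's Law: outstanding I/Os / average latency."""
    if avg_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return outstanding_ios * 1000.0 / avg_latency_ms


if __name__ == "__main__":
    # Hypothetical workload: 150 IOPS at 8 KB per I/O.
    print(f"{throughput_mbps(150, 8):.1f} MB/s")
    # Hypothetical disk answering one request at a time in 5 ms on average.
    print(f"{iops_from_latency(5.0):.0f} IOPS")
```

Note how small the throughput figure is for a small-block random workload: a disk can be completely saturated on IOPS while its interface throughput looks nearly idle, which is one reason storage bottlenecks get missed.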

Each disk, array, and storage system has its own fixed set of measurements given a specific configuration.  Knowing the physical capabilities of your storage system, as measured in the above ways, along with your systems’ storage requirements will go a long way towards a successful design and implementation of your storage environment.  The remaining parts of this series will look at these performance characteristics in more depth and explain what happens as you introduce factors like RAID, cache, features such as snapshots and deduplication, and varying workloads.
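As a rough illustration of the "simple math" involved, the sketch below estimates what a single spinning disk can deliver and how many disks a workload might need.  The average rotational latency (half a revolution) follows directly from the spindle speed.  The average seek times and the 2000 IOPS target are assumptions made up for this example, and the calculation deliberately ignores RAID, cache, and capacity, so treat the result as a starting point rather than a design.

```python
import math


def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: the time for half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def estimated_disk_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough random-I/O estimate for one disk: 1 / (seek time + rotational latency)."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


def disks_needed(required_iops: float, per_disk_iops: float) -> int:
    """Minimum disk count to meet an IOPS target, ignoring RAID, cache, and capacity."""
    return math.ceil(required_iops / per_disk_iops)


if __name__ == "__main__":
    # Assumed average seek times in ms; check the drive's data sheet for real values.
    for rpm, seek_ms in ((7200, 8.5), (15000, 3.5)):
        iops = estimated_disk_iops(rpm, seek_ms)
        count = disks_needed(2000, iops)
        print(f"{rpm} RPM: ~{iops:.0f} IOPS per disk, "
              f"{count} disks for a hypothetical 2000 IOPS target")
```

Comparing this kind of estimate against measured requirements (from Perfmon or a similar tool) is the sort of check that can take the emotion out of a performance argument.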

Please keep in mind that while I have designed and implemented a variety of DAS, NAS, and SAN technologies from a host of vendors including Dell, EMC, IBM, and NetApp, I am by no means a storage expert.  The information I will provide is generalized, over-simplified, and does not consider varying approaches from different storage vendors.  Nonetheless, I hope you find this information useful if you are designing a solution, troubleshooting a performance issue, or preparing to make a storage purchase.

Comments

  1. Sadly I know of several outsourcers/consultants around here that would argue that you don’t need fast disks or large vol/aggr (depending on vendor) to get good performance. I argued with my now ex-boss about why 20 7200 RPM disks were worse than 20 15k or even 10k disks.. and he still said ahh they won’t see any issues.. some people are just dumb..
    Now had the SAN solution been for file storage I would have said no problem with using 7200 RPM disks.. But this was for VMs, which included Exchange, SQL and Oracle! Still my boss insisted it wouldn’t matter..

    • John – Amazing what a little data will do! I’ve had the same arguments, even when vendors come in and use their own tools to gather performance stats. Managers argue that the vendors skew their results to sell more disk. That’s why I have worked hard on understanding who, what, where, when and how to capture and analyze performance statistics on my own (and write about it here). It’s hard to argue with the raw data….

      15k disks are not always the answer, but don’t go buying a solution without asking questions and preparing with some good data!

Trackbacks

  1. […] This post was mentioned on Twitter by tscalzott, joshuatownsend. joshuatownsend said: New VMtoday.com post: Storage Basics – Part I: An Introduction http://cli.gs/E568B #vmware […]

  2. […] This post was Twitted by PlanetV12n […]

  3. […] Part I of this series, I discussed the important of storage performance in a virtual environment (really any environment, […]

  4. […] on storage basics.  I’ve had some good feedback from folks in the SMB space saying that the first couple posts in this series have been beneficial, so we’ll be sticking with some basic […]

  5. IOPS « Virtual Persuasion says:

    […] http://vmtoday.com/2009/12/storage-basics-part-i-intro/ Categories: Uncategorized Comments (0) Trackbacks (0) Leave a comment Trackback […]

  6. […] parts I, II, and III of the Storage Basics series we looked at the basic building blocks of modern storage […]

  7. […] for Deployment (VMware) vSphere Storage: Features and Enhancements (Professional VMware) Storage Basics – Part I: An Introduction (VM Today) Storage Basics – Part II: IOPS (VM Today) Storage Basics – Part III: RAID (VM Today) […]

  8. VCAP5-DCA study notes: Objective 1.1 « vKnowledge.net says:

    […] http://vmtoday.com/2009/12/storage-basics-part-i-intro/ […]

  9. VMWARE TECHNICAL BEST PRACTICE DOCUMENT « Logeshkumar Marudhamuthu says:

    […] for Deployment (VMware) vSphere Storage: Features and Enhancements (Professional VMware) Storage Basics – Part I: An Introduction (VM Today) Storage Basics – Part II: IOPS (VM Today) Storage Basics – Part III: RAID […]

  10. […] Storage Basics – Part I: An Introduction […]
