In Parts I, II, and III of the Storage Basics series we looked at the basic building blocks of modern storage systems: hard disk drives. Specifically, we looked at the performance characteristics of disks in terms of IOPS and the impact of combining disks into RAID sets to improve performance and resiliency. Today we will take a quick look at another piece of the puzzle that impacts storage performance: the interface. The interface, for lack of a better term, can describe several things in a storage conversation, so let me break it down for you (remember, we’re keeping it simple here).
At the most basic level (assume a direct-attached setup), ‘interface’ can be used to describe the physical connection required to attach a hard drive to a system (motherboard/controller/array). The ‘interface’ extends beyond the disk itself and includes the controller, cabling, and disk electronics necessary to facilitate communication between the processing unit and the storage device. Perhaps a better term for this would be ‘intra-connect’, as this is all relative to the storage bus. Common interfaces include IDE, SATA, SCSI, SAS, and FC. Before data reaches the disk platter (where it is bound by IOPS), it must pass through the interface. The standards bodies that define these interfaces go beyond the simple physical form factor; they also define the speed and capabilities of the interface, and this is where we find another measure of storage performance: throughput. The speed of an interface is its maximum sustained throughput (transfer rate), usually quoted in Gbps or MBps.
Here are the interface speeds for the most common storage interfaces:
| Interface | Speed |
| --- | --- |
| IDE | 100MBps or 133MBps |
| SATA | 1.5Gbps, 3.0Gbps, or 6.0Gbps |
| SCSI | 160MBps (Ultra-160) or 320MBps (Ultra-320) |
| SAS | 1.5Gbps, 3.0Gbps, or 6.0Gbps |
| FC | 1Gb, 2Gb, 4Gb, 8Gb, or 16Gb (roughly 100MBps, 200MBps, 400MBps, 800MBps, and 1600MBps of throughput per direction, respectively) |
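To compare the Gbps and MBps figures above on equal footing, here is a quick Python sketch. It is my own rough illustration under one assumption: the serial interfaces listed (SATA, SAS, and FC up to 8Gb) use 8b/10b encoding, so usable throughput is about 80% of the raw line rate.

```python
# Back-of-the-envelope conversion from interface line rate to usable throughput.
# Assumption: SATA, SAS, and FC speeds up to 8Gb use 8b/10b encoding, so only
# 8 of every 10 bits on the wire carry data. (16Gb FC switches to 64b/66b
# encoding, so this factor does not apply to it.)

def usable_mbps(line_rate_gbps: float, efficiency: float = 8 / 10) -> float:
    """Approximate usable throughput in MBps for a given line rate in Gbps."""
    data_bits_per_sec = line_rate_gbps * 1_000_000_000 * efficiency
    return data_bits_per_sec / 8 / 1_000_000   # bits -> bytes -> megabytes

# The FC "2Gb/4Gb/8Gb" labels actually signal at 2.125/4.25/8.5 Gbaud.
for name, gbps in [("SATA/SAS 1.5Gbps", 1.5), ("SATA/SAS 3Gbps", 3.0),
                   ("SATA/SAS 6Gbps", 6.0), ("2Gb FC", 2.125),
                   ("4Gb FC", 4.25), ("8Gb FC", 8.5)]:
    print(f"{name}: ~{usable_mbps(gbps):.0f} MBps per direction")
```

Real-world numbers will land lower once protocol overhead and the disks themselves enter the picture, but this is where the marketing figures come from.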
If we take these speeds at face value, 320MBps SCSI and 2Gb FC are not too different. Dig a bit deeper, though, and you will find that simple speed ratings are not the end of the story. For example, FC throughput can be affected by the length and type of cable (Fibre Channel can run over twisted-pair copper in addition to fiber optic cables). Topology also matters: on the SCSI side, serial-attached (SAS) topologies are more efficient than a shared parallel bus, and on the FC side, arbitrated loops incur a penalty compared to a switched fabric. The specification for each interface type also defines capabilities such as the protocol that can be used, the number of devices allowed on a bus, and the command set used to communicate with a storage system. For example, SATA native command queuing (NCQ) can offer a performance increase over parallel ATA’s tagged command queuing, all other factors held constant. Along the same lines, you may see some performance impact when connecting a SATA drive to a SAS backplane, as the backplane must translate SAS commands to SATA.
If we move away from the direct-connect model and into the kind of shared storage environment you might use with VMware, ‘interface’ takes on additional meaning. You still have the bus ‘interface’ that connects your disks to a backplane; modern arrays typically use SAS or FC backplanes. If you have multiple disk enclosures, you also have an interface that connects each disk shelf to the controller/head/storage processor, or to an adjacent tray of disks. For example, EMC CLARiiON arrays use copper Fibre Channel cables in a switched fabric to connect disk enclosures to the back-end of the storage processors.
If we move to the front-end of the storage system, ‘interface’ describes the medium and protocol used by initiating systems (servers) when connecting to the target SAN. Typical front-end media on a SAN are Fibre Channel (FC) and Ethernet. Front-end FC interfaces come in the standard 2Gb, 4Gb, or 8Gb speeds, while Ethernet runs at 1Gbps or 10Gbps. Many storage arrays offer multiple front-end ports, which can be aggregated for increased bandwidth or targeted by connecting systems through multipathing software for increased concurrency and failover.
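As a rough illustration of why multiple front-end ports and multipathing matter, here is a small sizing sketch. The per-port throughput values and the 70% utilization ceiling are assumptions I picked for the example, not vendor sizing guidance.

```python
import math

# Rough front-end sizing illustration: how many ports/paths would a given peak
# workload need on each interface? Per-port figures and the 70% utilization
# ceiling are assumptions for this example only.
USABLE_MBPS_PER_PORT = {
    "1GbE iSCSI/NFS": 100,
    "10GbE iSCSI/NFS": 1000,
    "4Gb FC": 400,
    "8Gb FC": 800,
}

def ports_needed(peak_mbps: float, port_mbps: float, headroom: float = 0.7) -> int:
    """Ports required if each port is only driven to `headroom` of its capacity."""
    return math.ceil(peak_mbps / (port_mbps * headroom))

peak = 250  # hypothetical workload peaking at 250 MBps
for interface, mbps in USABLE_MBPS_PER_PORT.items():
    print(f"{interface}: {ports_needed(peak, mbps)} port(s) for a {peak} MBps peak")
```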
Various protocols can run over these media. VMware currently supports Fibre Channel Protocol (FCP) over FC, and iSCSI and NFS over Ethernet. FCP and iSCSI are block-based protocols that encapsulate SCSI commands; NFS is a file-based (NAS) protocol. Fibre Channel over Ethernet (FCoE) is also available on several storage arrays, carrying FCP frames across Ethernet.
Determining which interface to use on both the front-end and back-end of your storage environment requires an understanding of your workload and your desired performance levels. A post on workload characterization is coming later in this series, so I won’t go too deep now. I will, however, provide a few rules of thumb. First, capture performance statistics: using Windows Perfmon, look at Physical Disk | Disk Read Bytes/sec and Disk Write Bytes/sec, or check the performance stats in your vSphere Client if you are already virtualized (there is a small script sketch after the list below).
- If you require low latency, use Fibre Channel.
- If your throughput is regularly over 60MBps, consider Fibre Channel-connected hosts.
- iSCSI or NFS are often a good fit for general VMware deployments.
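If you want to put real numbers behind those rules of thumb, here is a minimal Python sketch that chews through a Perfmon log exported to CSV. The file name and the counter-matching logic are assumptions; adjust them to whatever your capture actually contains.

```python
import csv

LOG_FILE = "perfmon_disk.csv"   # hypothetical file name; export your Perfmon capture to CSV first
samples_mbps = []

with open(LOG_FILE, newline="") as f:
    reader = csv.DictReader(f)
    # Grab every PhysicalDisk read/write bytes counter present in the log.
    byte_cols = [c for c in (reader.fieldnames or [])
                 if "Disk Read Bytes/sec" in c or "Disk Write Bytes/sec" in c]
    for row in reader:
        total = 0.0
        for col in byte_cols:
            try:
                total += float(row[col])
            except (TypeError, ValueError):   # Perfmon leaves blanks in some rows
                continue
        samples_mbps.append(total / 1_000_000)   # bytes/sec -> MBps

if samples_mbps:
    peak = max(samples_mbps)
    avg = sum(samples_mbps) / len(samples_mbps)
    busy = sum(1 for s in samples_mbps if s > 60) / len(samples_mbps)
    print(f"Average: {avg:.1f} MBps, peak: {peak:.1f} MBps")
    print(f"{busy:.0%} of samples exceed the 60 MBps rule-of-thumb threshold")
```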
There is a ton of guidance and performance data available when it comes to choosing the right interconnect for a VMware deployment, and a ton of variables that impact performance. Start with this whitepaper from VMware: https://www.vmware.com/resources/techresources/10034. For follow-up reading, check out Duncan Epping’s post with a link to a NetApp comparison of FC, iSCSI, and NFS: https://www.yellow-bricks.com/2010/01/07/fc-vs-nfs-vs-iscsi/. If you are going through a SAN purchase process, ask your vendor to assist you in collecting statistics for proper sizing of your environment. Storage vendors (and their resellers) have a few cool tools for collecting and analyzing statistics; don’t be afraid to ask how they use those tools to recommend a configuration for you.
I’ve kept this series fairly simple. Next up in this series is a look at cache, controllers and coalescing. With the next post we’ll start to get a bit more complex and more specific to VMware and Tier 1 workloads, both virtual and physical. Thanks for reading!
Keep Reading:
- Storage Basics – Part I: An Introduction
- Storage Basics – Part II: IOPS
- Storage Basics – Part III: RAID
- Storage Basics – Part IV: Interface
- Storage Basics – Part V: Controllers, Cache and Coalescing
- Storage Basics – Part VI: Storage Workload Characterization
- Storage Basics – Part VII: Storage Alignment
- Storage Basics – Part VIII: The Difference in Consumer vs. Enterprise Class Disks and Storage Arrays; or ‘Why is the SAN you are proposing so darn expensive?’
- Storage Basics – Part IX: Alternate IOPS Formula