Technology News

latest updates from easySERVICE™

The Impact of RAID on today’s Storage Architecture

Understanding data patterns and disk types is crucial when designing storage for specific applications, but the RAID level must also be considered. The storage concept of “parity penalty” (more generally, the RAID write penalty) refers to the performance cost of protecting data via RAID. This penalty exists only on writes, so it is important to understand whether the environment is write intensive or read intensive. Fortunately, most environments are read intensive. These are the approximate write penalties by RAID level, expressed as overhead relative to reads:

• RAID 0: ~0% overhead vs reads

• RAID 1+0: ~50% overhead vs reads

• RAID 5: ~75% overhead vs reads

• RAID 6: ~85% overhead vs reads

The penalty depends on how each RAID level writes a block of data: mirroring writes every block twice, while parity RAID must generate and write parity for each stripe of data, which incurs overhead. These figures are fully visible only in random write scenarios. In a sequential write environment, the RAID controller cache helps mitigate the performance impact.
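The percentages above can be related to the commonly cited back-end write penalties: 1 I/O per write for RAID 0, 2 for RAID 1+0, 4 for RAID 5 and 6 for RAID 6. The following is a minimal Python sketch under that assumption. The penalty values are not stated in this article, and the helper names `write_overhead` and `effective_iops` are hypothetical.

```python
# Commonly cited back-end I/Os per front-end write.
# Assumed values; the article itself only gives percentages.
WRITE_PENALTY = {"RAID 0": 1, "RAID 1+0": 2, "RAID 5": 4, "RAID 6": 6}


def write_overhead(level):
    """Fraction of back-end write capacity spent on protection: 1 - 1/penalty."""
    return 1 - 1 / WRITE_PENALTY[level]


def effective_iops(raw_iops, read_fraction, level):
    """Front-end IOPS an array can sustain for a given read/write mix.

    Each read costs one back-end I/O; each write costs WRITE_PENALTY[level].
    """
    if not 0 <= read_fraction <= 1:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1 - read_fraction
    cost_per_io = read_fraction + write_fraction * WRITE_PENALTY[level]
    return raw_iops / cost_per_io


# Example: 1,000 raw back-end IOPS with a 70% read / 30% write mix.
for level in WRITE_PENALTY:
    overhead = write_overhead(level)
    iops = effective_iops(1000, 0.7, level)
    print(f"{level}: overhead {overhead:.0%}, effective IOPS {iops:.0f}")
```

With a penalty of 6, RAID 6 works out to roughly 83%, which is in line with the rounded ~85% figure listed above. The model ignores controller cache, so it is closest to the random write case.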

With these write overhead costs in mind, consider some best practices. SSDs are designed for random workloads, so they should typically be configured in RAID 1+0 to maximize performance (unless an environment is 100% read). SAS drives are also aimed at performance, so RAID 1+0 or RAID 5 should be used.

SATA drives are aimed at capacity and throughput, and because of their large capacities they should be configured in RAID 6. RAID 6 also provides additional protection and peace of mind during rebuilds, which matters for backup applications where SATA drives are preferred. Note: RAID 5 may be considered when using drives of 2TB or smaller. RAID 1+0 may also be considered in very high-scale virtual infrastructures of 2,000+ virtual machines.
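These rules of thumb can be written down as a small lookup. The sketch below is only an illustration: the function name `recommend_raid` and its parameters are hypothetical, the two notes (2TB or smaller, 2,000+ virtual machines) are read here as applying to SATA, and the 100% read case is not modeled.

```python
def recommend_raid(drive_type, capacity_tb=None, vm_count=0):
    """Return candidate RAID levels for a drive type, preferred level first.

    Hypothetical helper that encodes the guidance in this article.
    Assumption: the 2TB and 2,000+ VM notes are applied to SATA only.
    """
    drive = drive_type.upper()
    if drive == "SSD":
        return ["RAID 1+0"]
    if drive == "SAS":
        return ["RAID 1+0", "RAID 5"]
    if drive == "SATA":
        options = ["RAID 6"]
        # RAID 5 may be considered for drives of 2TB or smaller.
        if capacity_tb is not None and capacity_tb <= 2:
            options.append("RAID 5")
        # RAID 1+0 may be considered for very large virtual infrastructures.
        if vm_count >= 2000:
            options.append("RAID 1+0")
        return options
    raise ValueError(f"unknown drive type: {drive_type!r}")


print(recommend_raid("SSD"))
print(recommend_raid("SATA", capacity_tb=4))
print(recommend_raid("SATA", capacity_tb=2, vm_count=2500))
```

Returning a list rather than a single level keeps the "may be considered" wording of the guidance: the first entry is the default and the rest are alternatives to weigh.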

A best practice when sizing individual RAID arrays within data storage systems is to keep each array between 5 and 15 disks. Performance will suffer in RAID arrays larger than 15 disks as stripes grow too large, while arrays with fewer than 5 disks will not have enough spindles to provide good performance. Finally, the layout of volumes/LUNs across multiple RAID sets should be considered. Deploying a simple structure of one LUN per RAID set will reduce the risk of disk-based contention. However, this is not always possible with a RAID set of between 5 and 15 disks.
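As an illustration of the 5 to 15 disk guideline, the following Python sketch splits a pool of disks into the fewest arrays that stay within that range. The function name `plan_arrays` and the choice to balance array sizes evenly are assumptions made for this example, not something the guideline itself prescribes.

```python
import math

MIN_DISKS = 5   # fewer disks: not enough spindles
MAX_DISKS = 15  # more disks: stripes grow too large


def plan_arrays(total_disks):
    """Split total_disks into the fewest arrays of 5 to 15 disks each.

    Arrays are sized as evenly as possible (an assumption of this sketch).
    """
    if total_disks < MIN_DISKS:
        raise ValueError(f"need at least {MIN_DISKS} disks for one array")
    count = math.ceil(total_disks / MAX_DISKS)
    base, extra = divmod(total_disks, count)
    # 'extra' arrays get one additional disk so that all disks are used.
    return [base + 1] * extra + [base] * (count - extra)


print(plan_arrays(12))
print(plan_arrays(16))
print(plan_arrays(40))
```

For example, 16 disks become two arrays of 8 rather than one array of 15 and a leftover disk.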

When a one-to-one ratio is not possible, isolating LUNs that host similar applications and data patterns on the same RAID set will buy back some performance. Consider the following scenario: a LUN used for backup and a LUN used for SQL are hosted on the same RAID set. The two applications have differing data patterns and contend for disk resources, requiring the disks to perform additional seeks for the scattered, random data. Again, the end result is reduced performance.
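The grouping idea can be sketched in a few lines of Python. The LUN names, the pattern labels ("sequential", "random") and the function `group_luns_by_pattern` are all hypothetical and serve only to illustrate placing LUNs with similar data patterns together.

```python
from collections import defaultdict


def group_luns_by_pattern(luns):
    """Group (lun_name, pattern) pairs so each pattern maps to its LUNs.

    Each resulting group is a candidate for sharing one RAID set.
    """
    groups = defaultdict(list)
    for name, pattern in luns:
        groups[pattern].append(name)
    return dict(groups)


# Hypothetical inventory: one backup LUN and two SQL LUNs.
luns = [
    ("backup01", "sequential"),
    ("sql01", "random"),
    ("sql02", "random"),
]
for pattern, names in group_luns_by_pattern(luns).items():
    print(f"{pattern}: {', '.join(names)}")
```

Here the backup LUN ends up in its own group, so it would not share spindles with the SQL LUNs.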


This entry was posted on April 3, 2014 in Big Data, Data Storage, Server.