A client was confused about RAID today, so I thought the least I could do was post some definitions. It’s difficult to explain without a real case, but the definitions at least show how many drives you need to provide redundancy, or some version of it.
RAID 0 (striping)
RAID 0 uses the read/write capabilities of two or more hard drives working in unison to maximize the storage performance of a computer system. Data in a RAID 0 volume is arranged into blocks that are interleaved among the disks so that reads and writes can be performed in parallel (see below diagram). This technique, known as “striping”, is the fastest of all of the RAID levels, especially for reading and writing large sequential files. Real world tasks where RAID 0 can be of particular benefit include loading large files into image editing applications, saving large movie files in a video editing application, or creating CD or DVD images with a CD/DVD authoring package.
The hard drives in a RAID 0 volume are combined to form one volume which appears as a single virtual drive to the operating system. For example, two 400 GB hard drives in a RAID 0 array will appear as a single 800 GB hard drive to the operating system.
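To make the interleaving concrete, here is a minimal sketch of how logical blocks could be spread across the disks of a RAID 0 volume. The function name `locate_block` and the simple round-robin layout are assumptions for illustration; real controllers work in larger, configurable strip sizes.

```python
from typing import Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Hypothetical round-robin layout: block 0 goes to disk 0, block 1 to
    disk 1, and so on, wrapping back to disk 0 after the last disk.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    if logical_block < 0:
        raise ValueError("block numbers start at 0")
    return logical_block % num_disks, logical_block // num_disks


if __name__ == "__main__":
    # Consecutive blocks land on alternating disks, so a large sequential
    # read or write keeps both disks busy at the same time.
    for block in range(6):
        disk, offset = locate_block(block, num_disks=2)
        print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because neighbouring blocks sit on different disks, losing either disk leaves holes throughout every large file, which is why a single failure takes out the whole volume.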
No redundancy information is stored in a RAID 0 volume. This means that if one hard drive fails, all data in the volume is lost, not just the data on the failed drive. The “0” in the level name reflects this: zero redundancy. RAID 0 is not recommended for use in servers or other environments where data redundancy is a primary goal.
RAID 1 (mirroring)
A RAID 1 array contains two hard drives where the data between the two is mirrored in real time. Because all of the data is duplicated, the operating system treats the usable space of a RAID 1 array as the maximum size of one hard drive in the array. For example, two 400 GB hard drives in a RAID 1 array will appear as a single 400 GB hard drive to the operating system.
The primary benefit of RAID 1 mirroring is that it provides good data reliability in the case of a single disk failure. When one disk drive fails, all data is immediately available on the other without any impact to the data integrity. In the case of a disk failure, the computer system will remain fully operational to ensure maximum productivity.
The performance of a RAID 1 array is greater than that of a single drive because data can be read from multiple disks – the original and the mirror – simultaneously. Disk writes do not realize the same benefit because data must first be written to one drive, then mirrored to the other.
RAID 5 (striping with parity)
A RAID 5 array contains three or more hard drives where the data is divided into manageable blocks called strips. Parity is a mathematical method for recreating data that was lost from a single drive, which increases fault-tolerance. The data and parity are striped across all the hard drives in the array. The parity is striped in a rotating sequence to reduce bottlenecks associated with the parity calculations.
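The parity used by RAID 5 is normally a bytewise XOR of the data strips in a stripe, which is what makes a lost strip recoverable from the survivors. The sketch below shows the idea on tiny byte strings. The helper names and the particular rotation order in `parity_disk` are assumptions for illustration; actual controllers use their own layouts.

```python
from functools import reduce


def xor_strips(strips):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def parity_disk(stripe_index: int, num_disks: int) -> int:
    """Pick the disk holding parity for a stripe (hypothetical rotation).

    The parity strip moves to a different disk on each stripe so that no
    single disk handles all of the parity traffic.
    """
    return (num_disks - 1 - stripe_index) % num_disks


# Two data strips from one stripe of a three-disk array.
data_strips = [b"RAID", b"five"]
parity = xor_strips(data_strips)

# Simulate losing the disk that held the first strip: XOR-ing the surviving
# strip with the parity strip recreates the missing data.
rebuilt = xor_strips([data_strips[1], parity])

if __name__ == "__main__":
    print("rebuilt strip:", rebuilt)
    for stripe in range(4):
        print(f"stripe {stripe}: parity on disk {parity_disk(stripe, 3)}")
```

The same XOR works for any number of data strips, but it can only recover one missing strip per stripe, which matches the single-drive fault tolerance of RAID 5.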
The capacity of a RAID 5 array is the size of the smallest drive multiplied by one less than the number of drives in the array. The equivalent of only a single hard drive is used to store the parity information, allowing for fault-tolerance with less than the 50% capacity reduction of RAID 1. For example, three 400 GB hard drives in a RAID 5 array will appear as a single 800 GB hard drive to the operating system.
The primary benefits of RAID 5 include capacity and data protection. Because parity takes up the equivalent of only one drive, the usable share of the total drive capacity is (n - 1)/n for an array of n drives: about 67% with three drives and 75% with four. Further, any single drive can fail and it is possible to rebuild the data after replacing the failed hard drive with a new drive. However, the extra work of calculating the missing data will degrade the performance of the RAID 5 volume while the volume is being rebuilt.
The read performance of a RAID 5 array is greater than that of a single drive because data can be read from multiple disks simultaneously. Disk writes do not realize the same benefit because parity must be recalculated and written along with the data on every write.
RAID 10
A RAID 10 array uses four hard drives to create a combination of RAID levels 0 and 1 by forming a RAID 0 array from two RAID 1 arrays.
Since all of the data on the RAID 0 array is duplicated, the capacity of a RAID 10 array is the size of the RAID 0 array. For example, four 400 GB hard drives in a RAID 10 array will appear as a single 800 GB hard drive to the operating system.
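The capacity rules for the four levels can be collected into one small calculator. This is a sketch under the drive counts described in this post (two drives for RAID 1, four for RAID 10); the function name `usable_capacity_gb` is made up for illustration, and sizing every level by the smallest drive is an assumption for arrays with mixed drive sizes.

```python
def usable_capacity_gb(level: str, drive_sizes_gb: list) -> int:
    """Return the capacity the operating system sees, in GB."""
    if not drive_sizes_gb:
        raise ValueError("at least one drive is required")
    n = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)  # assumption: larger drives are truncated

    if level == "0":
        if n < 2:
            raise ValueError("RAID 0 needs two or more drives")
        return n * smallest            # no redundancy, all space usable
    if level == "1":
        if n != 2:
            raise ValueError("RAID 1 as described here uses two drives")
        return smallest                # one drive's worth, the other is the mirror
    if level == "5":
        if n < 3:
            raise ValueError("RAID 5 needs three or more drives")
        return (n - 1) * smallest      # one drive's worth goes to parity
    if level == "10":
        if n != 4:
            raise ValueError("RAID 10 as described here uses four drives")
        return 2 * smallest            # two mirrored pairs, striped together
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    for level, count in [("0", 2), ("1", 2), ("5", 3), ("10", 4)]:
        size = usable_capacity_gb(level, [400] * count)
        print(f"RAID {level} with {count} x 400 GB drives: {size} GB")
```

Running it with 400 GB drives reproduces the worked examples given for each level above.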
The primary benefit of RAID 10 is that it combines the benefits of RAID 0 performance and RAID 1 fault-tolerance. It provides good data reliability in the case of a single drive failure. When one hard drive fails, all data is immediately available from the other half of the mirror without any impact to the data integrity. In the case of a disk failure, the computer system will remain fully operational to ensure maximum productivity. Data fault-tolerance can be restored by replacing the failed drive.
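A short sketch can show why a single failure is always survivable in this layout, and what happens with a second one. It assumes the four drives are numbered 0 to 3 and paired as (0, 1) and (2, 3); both the numbering and the `survives` helper are hypothetical.

```python
from itertools import combinations

# Hypothetical drive numbering: each tuple is one RAID 1 mirror pair, and
# the two pairs are striped together as RAID 0.
MIRROR_PAIRS = [(0, 1), (2, 3)]


def survives(failed_drives) -> bool:
    """The array keeps its data while every mirror pair has a working drive."""
    failed = set(failed_drives)
    return all(not (a in failed and b in failed) for a, b in MIRROR_PAIRS)


# Every single-drive failure is survivable.
single_failures = [survives([drive]) for drive in range(4)]

# A second failure is only fatal if it hits the other half of the same mirror.
double_failures = {pair: survives(pair) for pair in combinations(range(4), 2)}

if __name__ == "__main__":
    print("all single failures survivable:", all(single_failures))
    for pair, ok in double_failures.items():
        print(f"drives {pair} failed -> {'data intact' if ok else 'data lost'}")
```

The array can sometimes ride out a second failure, but that is luck rather than a guarantee, so the failed drive should still be replaced promptly.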
The performance of a RAID 10 array is greater than that of a single drive since data can be read from multiple disks simultaneously. Compared to a two-disk RAID 0, RAID 10 read performance is higher because data can be read from either half of each mirror, but write performance is slightly lower because every write must be completed on both halves of a mirror.