Thursday, June 7, 2012

RAID Level Configuration

What would you do when a hard drive fails?

When I was asked this question, the first thing that came to mind was to replace the failed drive. But is the database / system still up? Before replacing a hard drive, or even when first taking over a server, it is important to know its RAID configuration.

RAID stands for Redundant Array of Independent Disks. It is a storage technology that combines at least two hard drives into one logical unit to increase I/O performance, improve data reliability, or both. Three main concepts are used to reach this goal:

- Mirroring
- Striping
- Error Correction (parity)

Four RAID levels are typically implemented on servers: RAID 0, RAID 1, RAID 5 and RAID 10.

RAID 0 is known as disk striping, without parity or mirroring. Data is split into blocks (stripes) and the blocks are written to different drives. Because multiple drives (a minimum of 2) work simultaneously, I/O performance increases for both read and write operations. However, RAID 0 is not fault tolerant: a single drive failure results in the loss of the entire array.
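
To make the striping idea concrete, here is a minimal sketch in Python. It assumes a simple round-robin layout and uses in-memory lists in place of real drives; the function names `stripe_location` and `stripe` are made up for this illustration, and real controllers may lay blocks out differently.

```python
from __future__ import annotations


def stripe_location(block_index: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block to (drive, row) in a round-robin RAID 0 layout."""
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least 2 drives")
    return block_index % num_drives, block_index // num_drives


def stripe(data: bytes, num_drives: int, block_size: int) -> list[list[bytes]]:
    """Split data into blocks and deal them out across the drives in turn."""
    drives: list[list[bytes]] = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), block_size):
        drive, _row = stripe_location(offset // block_size, num_drives)
        drives[drive].append(data[offset:offset + block_size])
    return drives


layout = stripe(b"ABCDEFGH", num_drives=2, block_size=2)
# layout[0] holds b"AB" and b"EF"; layout[1] holds b"CD" and b"GH"
```

Each drive ends up with only half of the data, which is why both drives can work in parallel and also why losing either one makes the whole array unreadable.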


RAID 1 is known as disk mirroring, without striping or parity. Data written to the primary disk is copied to the mirror disk to create an identical mirror set. Since the secondary disk is a mirror of the primary, the effective storage capacity is half of the total raw capacity. RAID 1 provides fault tolerance and improved read performance, as reads can be served by either drive at the same time. Write performance is slightly slower than, but comparable to, a single disk drive.
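
The mirroring behaviour can be sketched as a toy model. The `MirrorSet` class below is hypothetical and keeps its two "disks" as Python dictionaries; it only illustrates that every write goes to both disks and that a read can be served by whichever disk is still healthy.

```python
class MirrorSet:
    """Toy RAID 1: two in-memory 'disks' that hold identical blocks."""

    def __init__(self) -> None:
        self.disks = [{}, {}]          # block number -> data, one dict per disk
        self.failed = [False, False]   # set an entry to True to simulate a failure

    def write(self, block: int, data: bytes) -> None:
        # Every write is applied to each healthy disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Serve the read from the first healthy disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both disks in the mirror set have failed")


mirror = MirrorSet()
mirror.write(0, b"payroll")
mirror.failed[0] = True        # the primary disk dies
survivor = mirror.read(0)      # still readable from the mirror disk
```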


RAID 5 is known as striping with distributed parity, without mirroring. It stripes data like RAID 0 and adds parity, with both the data and the parity distributed across all the disks. At least 3 disk drives are required. Assuming drives of similar size, the effective storage capacity is one drive less than the total, because one drive's worth of space is used for the distributed parity. RAID 5 provides better read performance than RAID 1 (performance goes up with drive count), but a disk failure will decrease read performance, since the missing data has to be rebuilt from parity. A hardware RAID 5 implementation may also allow hot swap/plug, that is, replacing a failed drive while the server is up. Update - hardware implementations of other RAID levels may also allow hot swap. Check with your vendor, and check whether the operating system supports it.
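
RAID 5 parity is an XOR across the data blocks of a stripe, which is what makes a lost block recoverable. The sketch below shows that for a single stripe on the minimum of three drives (two data blocks plus one parity block). The helper name `xor_blocks` is an assumption for this example, and the rotation of parity from drive to drive is left out because the exact layout varies between implementations.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 3-drive array: two data blocks and their parity block.
data_blocks = [b"\x0f\xf0", b"\x33\x33"]
parity = xor_blocks(data_blocks)

# Suppose the drive holding data_blocks[1] fails.
# XOR of the surviving data block and the parity block gives it back.
rebuilt = xor_blocks([data_blocks[0], parity])
```

The rebuild step has to read every surviving block in the stripe, which is why reads slow down while the array is running with a failed drive.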


RAID 10 (1+0) is known as mirroring with striping. It combines the advantages of RAID 1 and RAID 0: redundancy comes from mirroring drives in pairs, and performance comes from striping across all the mirror sets. A minimum of 4 disk drives is required. RAID 10 provides the highest read and write performance of the redundant levels described here, but requires twice as many disks for the same usable capacity.
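
As a rough sketch of how the two ideas combine, the code below maps a logical block to a mirror pair and checks whether an array survives a given set of failed drives. It assumes an even number of drives paired as (0, 1), (2, 3) and so on, with round-robin striping across the pairs; the function names are hypothetical and real controllers may pair drives differently.

```python
def raid10_location(block_index, num_drives):
    """Return ((drive_a, drive_b), row) for a logical block in a RAID 10 array."""
    if num_drives < 4 or num_drives % 2:
        raise ValueError("this sketch needs an even number of drives, at least 4")
    pairs = num_drives // 2
    pair = block_index % pairs        # striping: blocks rotate across mirror pairs
    row = block_index // pairs
    return (2 * pair, 2 * pair + 1), row   # mirroring: both drives hold the block


def survives(failed_drives, num_drives):
    """The array survives as long as no mirror pair has lost both of its drives."""
    failed = set(failed_drives)
    return all(not ({2 * p, 2 * p + 1} <= failed) for p in range(num_drives // 2))


first = raid10_location(0, 4)     # block 0 lives on drives 0 and 1
second = raid10_location(1, 4)    # block 1 lives on drives 2 and 3
ok = survives({0, 2}, 4)          # one drive lost from each pair
lost = survives({0, 1}, 4)        # both drives of one pair lost
```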


Conclusion
RAID 0 provides the best performance through striping. Reads and writes are faster because the operations are spread across multiple disks and performed in parallel. However, it should only be used when redundancy is not required and possible loss of data is not an issue. RAID 1 provides high performance and redundancy. It is suitable for critical storage with a relatively small capacity requirement (2 disk drives) that needs fault tolerance. RAID 5 provides high read performance with high data protection, although every write also has to update parity. RAID 5 offers the best cost/performance ratio and is ideal for large storage capacity, as its cost increases more slowly than that of a mirrored array. If budget is not a concern, RAID 10 is suitable for large storage capacity with the highest performance.
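
The capacity side of this trade-off can be worked out with a few lines of arithmetic. The function below is an illustrative sketch that assumes identical drives, treats RAID 1 as a two-drive mirror as described above, and ignores controller and filesystem overhead; the name `usable_capacity` is made up for this example.

```python
def usable_capacity(level, num_drives, drive_size_tb):
    """Usable capacity in TB for the four RAID levels discussed in this post."""
    if level == 0:
        if num_drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return num_drives * drive_size_tb            # no redundancy, all space usable
    if level == 1:
        if num_drives != 2:
            raise ValueError("this sketch models RAID 1 as a 2-drive mirror")
        return drive_size_tb                         # half of the raw capacity
    if level == 5:
        if num_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (num_drives - 1) * drive_size_tb      # one drive's worth goes to parity
    if level == 10:
        if num_drives < 4 or num_drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return (num_drives // 2) * drive_size_tb     # every block is stored twice
    raise ValueError("unsupported RAID level")


# Four 2 TB drives under each multi-drive level, plus a 2-drive mirror.
raid0 = usable_capacity(0, 4, 2.0)
raid5 = usable_capacity(5, 4, 2.0)
raid10 = usable_capacity(10, 4, 2.0)
raid1 = usable_capacity(1, 2, 2.0)
```

With four 2 TB drives this gives 8 TB for RAID 0, 6 TB for RAID 5 and 4 TB for RAID 10, which shows why RAID 5 costs less per usable terabyte than a mirrored layout as the array grows.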

Part 2 - RAID Controller

So, with all this hardware redundancy and fault tolerance, do we still need backups? What's next?
