What You Need To Know About RAID

If you’ve ever wondered about RAID, and what it might mean for you, then guest writer Bryan Keller’s article will provide you with the answer – and then some.

What is RAID?

RAID is the acronym for either ‘redundant array of inexpensive disks’ or ‘redundant array of independent disks’.  When the idea was first conceived at UC Berkeley, the former was the term that was coined, but the latter is more commonly used today, intentionally, to dissociate the technology from the word ‘inexpensive’ and the perception that RAID somehow implies a low-cost solution.

Why Use RAID?

RAID is a storage technology that provides increased data reliability through data redundancy.  This is achieved primarily by duplicating data across several storage drives in a configuration referred to as an array of disks.

How Many Different Types of RAID are There?

There are several different types of RAID array configurations, each denoted by a single-digit numeral, 0 through 6 (and various combinations thereof).  These types are commonly referred to as RAID ‘levels’.

What Distinguishes one RAID Level From Another?

The RAID levels differ in how they combine redundancy, spanning, mirroring and striping.

What is Redundancy?

Redundancy is the duplication of data onto more than one physical drive to increase fault tolerance.  If one physical drive in a redundant array fails, no data is lost and there is an opportunity to replace the failed device.  As long as one drive holding a complete copy is functional, data is secure.  However, after a failure, the failed device must be replaced and the array rebuilt onto the new device.  With very large data sets this can take a great deal of time.  If the last remaining good copy were to fail during the rebuild, all data would be lost.

What is Spanning?

Spanning is the configuration of two or more physical drives into one ‘logical’ drive.  The logical drive is treated exactly the same as a physical drive and will appear as just one device.  Spanning is used to increase the amount of storage capacity of an array.  As an example: if three 100 gigabyte hard drives are configured as one spanned array, the result would be one logical drive 300 gigabytes in size.

Spanning alone provides no redundancy or fault tolerance and it is commonly combined with mirroring.
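To make the idea of a ‘logical’ drive concrete, here is a minimal Python sketch of how a position on a spanned volume could be mapped back to a physical drive.  The function name `locate_in_span` and the use of whole gigabytes as offsets are assumptions made for illustration; a real controller works in sectors or blocks.

```python
def locate_in_span(logical_gb, drive_sizes_gb):
    """Map an offset on the logical drive to (drive index, offset on that drive).

    Hypothetical helper: drives are simply laid end to end, as in spanning.
    """
    if logical_gb < 0:
        raise ValueError("offset must be non-negative")
    remaining = logical_gb
    for index, size in enumerate(drive_sizes_gb):
        if remaining < size:
            return index, remaining
        remaining -= size  # skip past this drive and keep looking
    raise ValueError("offset is beyond the end of the spanned volume")


# The article's example: three 100 gigabyte drives spanned together.
drives = [100, 100, 100]
total_capacity = sum(drives)            # 300 gigabytes in one logical drive
position = locate_in_span(250, drives)  # lands 50 GB into the third drive
```

Because each offset lives on exactly one drive, losing any drive loses that part of the volume, which is why spanning on its own offers no fault tolerance.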

What is Mirroring?

Mirroring is the duplication of data onto two or more drives simultaneously to create data redundancy and increase fault tolerance.  A two-drive mirrored array sacrifices half of its raw storage capacity to achieve redundancy.  If two 100 gigabyte drives are mirrored, the result is a single 100 gigabyte mirrored array.

What is Striping?

Striping is a bit more complex.  Striping is used to increase performance.  This increase is achieved by splitting the data into ‘blocks’ and then writing those blocks to, or reading them from, two or more physical drives simultaneously, at the corresponding location on each drive.

In a simplified example, imagine that you are writing 100 megabytes of data out to a striped array.  If you were to take that data and split it into two 50 megabyte chunks and then write both of those chunks simultaneously, one 50 megabyte chunk to drive (a) and the other 50 megabyte chunk to drive (b), you would theoretically halve the time required to perform the process.  That, in essence, is the theory behind striping.
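In practice the data is cut into many small blocks rather than two big chunks, and the blocks are dealt out to the drives in turn.  The Python sketch below illustrates that round-robin idea; the function names and the tiny block size are assumptions chosen for readability, and Python lists stand in for physical drives.

```python
def stripe(data, num_drives, block_size):
    """Deal fixed-size blocks of data out to the drives in round-robin order."""
    drives = [[] for _ in range(num_drives)]
    for start in range(0, len(data), block_size):
        block_number = start // block_size
        drives[block_number % num_drives].append(data[start:start + block_size])
    return drives


def unstripe(drives):
    """Reassemble the original data by reading one block from each drive in turn."""
    pieces = []
    longest = max(len(drive) for drive in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                pieces.append(drive[row])
    return b"".join(pieces)


data = b"ABCDEFGHIJ"
striped = stripe(data, num_drives=2, block_size=2)
# striped[0] holds AB, EF, IJ and striped[1] holds CD, GH
restored = unstripe(striped)
```

Neither list on its own contains usable data, which is the reason a single drive failure is so damaging to a striped array.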

Striping provides a significant increase in performance, but it is also the most dangerous of all the RAID techniques when used alone.  Not only is there no redundancy, but if any one of the drives in a striped array fails, all of the data in the entire array is lost.

RAID Level 0

RAID Level 0 – (2 Drive Minimum – no Fault Tolerance) Block Level Striping without Parity or Mirroring:  Because this type of RAID offers no fault tolerance or redundancy, it is technically not RAID at all.  RAID 0 offers the best performance of all the RAID levels.  Data is broken down into fragments called blocks and is then written to all drives in the ‘array’ simultaneously across what is called a ‘stripe’ (at the corresponding location on each disk).  When data is read, the blocks are fetched from the drives in parallel, thereby increasing bandwidth.  With RAID 0, if any drive fails, all data is lost across the entire ‘array’.  Even with the minimum of two drives, the likelihood of a catastrophic loss is roughly double that of a single drive without any RAID at all.  RAID 0 should never be used alone for critical data.
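The ‘roughly double’ figure can be checked with a little arithmetic.  The sketch below assumes that drives fail independently and that each has the same chance of failing over some period; the 5% figure is a made-up number used purely for illustration, not a measured failure rate.

```python
def raid0_loss_probability(p_drive, num_drives):
    """Chance that at least one drive fails, which in RAID 0 means losing everything.

    Assumes independent failures with the same probability for every drive.
    """
    return 1 - (1 - p_drive) ** num_drives


p = 0.05  # hypothetical chance that a single drive fails in the period
single_drive = raid0_loss_probability(p, 1)
two_drive_raid0 = raid0_loss_probability(p, 2)
four_drive_raid0 = raid0_loss_probability(p, 4)

print(round(single_drive, 4), round(two_drive_raid0, 4), round(four_drive_raid0, 4))
```

With two drives the risk comes out at just under twice that of a single drive, and it keeps climbing as more drives are added to the stripe.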

RAID Level 1

RAID Level 1 – (2 Drive Minimum – Data Redundancy) Mirroring:  In its simplest form RAID 1 simply duplicates data onto two different hard drives simultaneously, thereby providing data redundancy.  Data redundancy means that if either of the two hard drives fails for any reason no data will be lost as there is an exact duplicate or ‘mirrored set’ of the data on the other drive.  Data integrity is maintained as long as either of the two hard drives in the array is functioning.  In the event that one of the drives does fail it is simply swapped out for a new working drive.  The ‘array’ then ‘rebuilds’ itself by duplicating all of the data onto the new drive and recreating the ‘mirrored set’.  Data is, however, vulnerable while a rebuild is in progress.
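The write, fail and rebuild cycle described above can be modelled in a few lines of Python.  This is only a toy sketch: the `Mirror` class and its method names are invented for illustration, Python lists stand in for drives, and a real rebuild copies the data over time rather than in an instant.

```python
class Mirror:
    """Toy two-drive mirror; a failed drive is represented by None."""

    def __init__(self, num_blocks):
        self.drives = [[None] * num_blocks, [None] * num_blocks]

    def write(self, block_number, value):
        # Every write goes to all working drives at once.
        for drive in self.drives:
            if drive is not None:
                drive[block_number] = value

    def read(self, block_number):
        # Any working drive can satisfy a read.
        for drive in self.drives:
            if drive is not None:
                return drive[block_number]
        raise IOError("all drives in the mirror have failed")

    def fail(self, drive_index):
        self.drives[drive_index] = None

    def rebuild(self, drive_index):
        # Copy the surviving drive onto the replacement.
        survivor = next((d for d in self.drives if d is not None), None)
        if survivor is None:
            raise IOError("no surviving drive to rebuild from")
        self.drives[drive_index] = list(survivor)


mirror = Mirror(num_blocks=2)
mirror.write(0, "invoice")
mirror.fail(0)                   # one drive dies
still_readable = mirror.read(0)  # data survives on the other drive
mirror.rebuild(0)                # the replacement drive is brought up to date
```

Between `fail` and `rebuild` the toy mirror has only one copy of the data, which is the window of vulnerability the article mentions.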

RAID Level 5

RAID Level 5 – (3 Drive Minimum – Redundancy Through Parity) Block-Level Striping with Distributed Parity:  RAID 5 combines the increased speed of striping with redundancy through distributed parity.  In RAID 5 the equivalent of one drive’s capacity is always sacrificed to achieve redundancy.  In other words, when there are three 100 gigabyte drives in a RAID 5, the array will be 200 gigabytes in size.  However, because the parity is distributed, the redundancy is spread across the entire array rather than held on one dedicated drive.  Therefore, if any one drive in a RAID 5 fails, data integrity is maintained and an opportunity exists to replace the failed device and rebuild the array.
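Parity in RAID 5 is commonly computed with the XOR operation, which has the handy property that any one missing block can be recomputed from the others.  The Python sketch below shows this for a single stripe across three drives; the helper name `xor_blocks` is an assumption, and the sketch leaves out the rotation of parity from drive to drive that a real array performs.

```python
def xor_blocks(*blocks):
    """XOR equal-length blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# One stripe on a three-drive array: two data blocks plus one parity block.
block_a = b"RAID"
block_b = b"five"
parity = xor_blocks(block_a, block_b)

# Suppose the drive holding block_b fails: rebuild it from what is left.
recovered_b = xor_blocks(block_a, parity)

# The same works if the drive holding block_a fails instead.
recovered_a = xor_blocks(block_b, parity)
```

This also shows why only one failure can be tolerated: with two blocks of the stripe missing there is not enough information left to recompute either of them.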

RAID Level 6

RAID Level 6 – (4 Drive Minimum – Redundancy Through Parity) Block-Level Striping with Double Distributed Parity:  Very similar to RAID 5, RAID 6 builds on the security of RAID 5 by adding a second level of redundancy.  In a RAID 6, up to two drives can fail and no data will be lost.  This makes very large arrays practical, where the time it takes to rebuild the array after a drive failure can be quite lengthy.  In a RAID 5 scenario, data would be vulnerable for far too long while the rebuild is in progress.  RAID 6 addresses this concern by storing a second set of parity, at the cost of one more drive’s worth of capacity.  RAID 6 is the solution that should be used where data is extremely critical or high system availability is important.
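The capacity figures quoted in this article follow a simple pattern, sketched below in Python.  The function name `usable_capacity_gb` is an assumption for illustration, all drives are assumed to be the same size, and the minimum drive counts are the ones stated above for each level.

```python
# Minimum drive counts as quoted in the article for each level.
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4}


def usable_capacity_gb(level, num_drives, drive_size_gb):
    """Usable capacity for equal-sized drives (illustrative helper only)."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == 0:
        return num_drives * drive_size_gb   # striping: nothing is sacrificed
    if level == 1:
        return drive_size_gb                # mirroring: every drive holds the same data
    parity_drives = 1 if level == 5 else 2  # RAID 5 gives up one drive, RAID 6 two
    return (num_drives - parity_drives) * drive_size_gb


raid1_two_drives = usable_capacity_gb(1, 2, 100)    # 100
raid5_three_drives = usable_capacity_gb(5, 3, 100)  # 200
raid6_four_drives = usable_capacity_gb(6, 4, 100)   # 200
```

The comparison makes the trade-off visible: RAID 6 on four drives yields the same space as RAID 5 on three, with the extra drive buying tolerance of a second failure.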

RAID Level 1+0

RAID Level 1+0 – (4 Drive Minimum – Redundancy Through Mirroring) Mirrored Sets in a Striped Set:  This level provides both fault tolerance and increased performance by combining RAID 1 (mirroring) and RAID 0 (striping).  RAID 1+0 can sustain multiple drive failures as long as no mirror loses all of its drives.

RAID Level 0+1

RAID Level 0+1 – (4 Disk Minimum; must be an even number of drives – Redundancy Through Mirroring) Striped Sets Mirrored:  Here, a second striped set is created to mirror the first striped set.  In contrast to 1+0, in RAID 0+1 all the drives in one striped set (one side of the mirror) can fail without data loss, but if drives fail on both sides of the mirror, everything on the entire array is lost.
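The difference between 1+0 and 0+1 is easiest to see by trying every possible pair of failed drives on a four-drive array.  The Python sketch below does exactly that; the drive numbering, the grouping of drives 0 and 1 together and 2 and 3 together, and the function names are all assumptions made for illustration.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    """RAID 1+0 survives as long as no mirror pair has lost all of its drives."""
    return all(not set(pair) <= failed for pair in mirror_pairs)


def raid01_survives(failed, stripe_sets):
    """RAID 0+1 survives only if at least one striped set is completely intact."""
    return any(failed.isdisjoint(stripe) for stripe in stripe_sets)


# Four drives numbered 0 to 3, grouped as (0, 1) and (2, 3) in both layouts.
groups = [(0, 1), (2, 3)]
two_drive_failures = [set(pair) for pair in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(failed, groups) for failed in two_drive_failures)
raid01_ok = sum(raid01_survives(failed, groups) for failed in two_drive_failures)

print(f"RAID 1+0 survives {raid10_ok} of {len(two_drive_failures)} two-drive failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(two_drive_failures)} two-drive failures")
```

Under these assumptions, 1+0 survives four of the six possible two-drive failures while 0+1 survives only two, which is one reason 1+0 is generally the preferred arrangement.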

There are also more combinations possible, but I will stop here.

Guest writer Bryan Keller:

I own a Computer Repair and Data Recovery business in San Antonio, TX, San Antonio Computer Repair. I spent 10 years in database development. I am now also providing Website Development, Hosting, and SEO services. We use the Joomla CMS.

Altogether, I have been involved in computer programming for over 30 years. I was a self-taught programmer back when the ‘Atari 800’ was all the rage! I had an Atari 800 with 16 kilobytes of RAM and a 6502 8-bit processor that ran at 1.7 MHz, no hard drive, and a 5 1/4 inch floppy disk that stored just 180 kilobytes of data. Of course there was no internet, but we had the dial-in bulletin boards that we connected to at 300 baud.

4 responses to “What You Need To Know About RAID”

  1. Bill/Bryan,
    Great article. RAID is interesting, but for me I see little reason to use it. I’ve heard horror stories of people thinking their system was safe when in reality they had a RAID 0 set up, and losing one drive meant they lost it all. Mirroring makes some sense, but I prefer to just do data backups and an occasional cloning just in case. I have a friend who set up 2 SSDs in a RAID 0 and then had a couple of terabytes for data. It’s an amazingly fast machine that he can afford to clone regularly, but he has thousands just in drives.
    I’m curious: are motherboard RAID controllers good enough, or should a separate card be used? (In case I get the bug)
    Thanks as always.
    Mark

    • Hi Mark,

      I’ve heard that story, although I don’t know anyone personally that it’s happened to. Like you, I have no need, at least at this point, for RAID. Cloning does the job adequately for me.

      I’ll leave it to Bryan to cover the controller issue.

      Best,

      Bill

  2. mark,

    Great question. Typically the onboard RAID is relatively weak and many times only supports a few of the RAID levels (0, 1, 1+0), hardly ever 5 or 6.

    I have used RAID on and off over the years for my personal system and it can be an expensive and sometimes frustrating way of securing data.

    With SATA at least the drives have come down drastically in price. I was paying $600 for 75 GIG SCSI hard drives not that long ago!

    For the average user RAID is probably a little over-kill. For a business or enterprise system however RAID might just be the best solution going.

    bryan
