Understanding RAID and Erasure Coding

A deep dive into two popular data protection schemes: what they are, their similarities and differences, and their advantages and disadvantages.

Securing data and ensuring its availability is a vital consideration in any organization. Data has become a key resource in the modern world: it is at the center of everything from better decision making to improved service delivery. Thus, there is a need for data protection and recovery measures that ensure the availability and integrity of data. Erasure coding and RAID are two popular data protection and recovery techniques.

Goal

The article will tackle the following:

  • The different RAID methods available in the market and which to use to fulfill your data recovery needs.
  • Erasure coding schemes.
  • The best method for a given hardware, architecture, performance needs, and available storage.

The technical terms in the article:

| Technical term | Definition |
| --- | --- |
| RAID | Redundant array of independent disks. |
| Parity | Redundant information formed by adding bits to a data block so that the number of 1 bits is even or odd. The added bits act as a checksum on the data, allowing a system to detect errors and determine what data was lost during data transmission or disk failure. |
| Data backup | A copy of your system's data that you can use to recover files. |
| Mirroring | Copying data from one disk onto one or more other disks. |
| Striping | Splitting logically linear data across multiple drives, which facilitates concurrent access to the data. |
| Block | The logical space on each drive where data is stored. |
| Encoding/decoding | Encoding converts a sequence of characters into a specific format to improve transmission or storage. Decoding converts an encoded format back to its original form. |
| Encoding/decoding complexity | The amount of time and resources required to encode and decode a message. The level of complexity is affected by the size of the message, the level of redundancy in the message, and the type of encoding and decoding algorithms used. |

RAID levels

The five major RAID levels are discussed below:

a. RAID 0

This level focuses on meeting performance needs rather than fault tolerance. It is commonly known as striping, since it splits data across at least two drives.

In a RAID 0 configuration, data is written in parallel to all disks in the array.
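The idea of spreading data across drives can be sketched in a few lines of Python. This is only an illustration: the function names `stripe` and `unstripe`, the 4-byte block size, and the use of in-memory lists in place of real drives are all assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int, block_size: int = 4) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin across disks."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading the blocks back in round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJKLMNOP", num_disks=2)
# disks[0] holds blocks 0 and 2, disks[1] holds blocks 1 and 3
```

Because consecutive blocks sit on different drives, a controller can service them at the same time. The sketch also shows why there is no fault tolerance: each block exists on exactly one disk.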

The main advantage of this level is its performance. Throughput increases as you add disks to the RAID stack, and a dedicated RAID controller boosts performance further.

The RAID controller allows a system to concurrently read and write data stored in different drives. The RAID 0 level is most suitable for people who need high data processing speeds.

For instance:

  • Gamers prefer the method as it offers an advantage of a few milliseconds in latency over their peers.
  • Multimedia companies prefer the method due to its concurrency ability when reading and writing data.
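  • Users who want the most capacity prefer the method because RAID 0 has no redundancy overhead, so the full capacity of every drive is available for data. However, it does not offer redundancy: if one drive fails, the data in the array is lost. It is a performance alternative rather than a data backup method.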

b. RAID 1

RAID 1 disregards performance and favors fault tolerance. The method is sometimes called the mirroring technique since data from one disk is copied to another disk. Thus, users can seamlessly access data from the alternative disk in case of crashes.

Users of a RAID 1 system can only use 50% of available storage on the system due to mirroring. The system offers a way to easily back up and recover data since the data does not have to be rebuilt from scratch.
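A minimal sketch of mirroring follows. The class name `MirroredPair` is hypothetical, and plain dictionaries stand in for the two drives; the point is only that every write lands on both copies, so a read can fall back to the surviving one.

```python
class MirroredPair:
    """Toy RAID 1 volume: every write goes to both disks (dicts stand in for drives)."""

    def __init__(self):
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block_no: int, data: bytes) -> None:
        # The same block is stored twice, which is why usable capacity is 50%.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block_no] = data

    def fail(self, disk_no: int) -> None:
        """Simulate a drive crash: its contents are gone."""
        self.failed[disk_no] = True
        self.disks[disk_no].clear()

    def read(self, block_no: int) -> bytes:
        # Serve the read from the first healthy mirror.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block_no]
        raise IOError("both mirrors have failed")


volume = MirroredPair()
volume.write(0, b"ledger entry")
volume.fail(0)
data = volume.read(0)  # still readable from the second disk
```

No reconstruction step is needed after the failure, which matches the point above that data does not have to be rebuilt from scratch.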

RAID 1 is preferred for its redundancy by users who put data safety first, such as operators of accounting systems.

c. RAID 5

This is the most widely used RAID level. It balances data protection and performance. However, it costs more overall, since it requires at least three drives to implement.

RAID 5 protects data with parity blocks that are distributed across all drives in the array. It also stripes data across the drives to improve performance. In a RAID 5 implementation of four drives, the capacity of three drives is available for data, while the equivalent of one drive holds parity.
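The parity idea can be sketched with XOR: the parity block is the XOR of the data blocks in a stripe, so any single missing block equals the XOR of all the surviving blocks. The helper names `xor_blocks` and `rebuild_missing` and the 4-byte blocks are assumptions of this sketch, not part of any RAID implementation.

```python
from functools import reduce
from typing import Optional


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe: list[Optional[bytes]]) -> bytes:
    """Recover the single missing block (marked None) from the surviving ones."""
    if stripe.count(None) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    return xor_blocks([block for block in stripe if block is not None])


data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second data block.
stripe_after_failure = [b"AAAA", None, b"CCCC", parity]
recovered = rebuild_missing(stripe_after_failure)  # b"BBBB"
```

Rebuilding touches every surviving block in the stripe, which is why reconstruction is more expensive than a normal read.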

RAID 5 is preferred by people looking to build well-performing and fault-tolerant systems. RAID 5 can tolerate the failure of one entire disk.

The method is also favorable for people who need medium-performance and high-storage systems.

A downside of the level is that data reconstruction impacts performance. Rebuilding a failed drive from parity requires reading the surviving drives and significant computational power.

d. RAID 6

This level stores two independent parity blocks per stripe, using the equivalent of two drives for parity, to improve data security. It is otherwise similar to RAID 5.

By design, a RAID 6 system can continue to operate even after two drives fail, although such an occurrence is rare in practice.

e. RAID 10

A RAID 10 system is a combination of RAID 1 and RAID 0. The system combines the advantages of mirroring (RAID 1) and striping (RAID 0) to produce a security-focused, high-performance system.

In a RAID 10 implementation, at least four drives are required to handle the striping and mirroring functionalities of the system.

The drives under a RAID 10 configuration are grouped into mirrored pairs, and data is striped across the pairs. With four drives, each block is written to both drives of one pair, and consecutive blocks alternate between the two pairs.
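One common way to picture a RAID 10 layout is as mirrored pairs with blocks striped across the pairs. The sketch below maps logical block numbers to drives under that assumption; the function name `raid10_placement` and the pairing of drives (0, 1), (2, 3), and so on are illustrative choices, and real controllers may arrange drives differently.

```python
def raid10_placement(num_blocks: int, num_drives: int = 4) -> dict[int, list[int]]:
    """Map each drive number to the logical block numbers it stores."""
    if num_drives < 4 or num_drives % 2:
        raise ValueError("this sketch expects an even number of drives, at least four")
    num_pairs = num_drives // 2
    layout = {drive: [] for drive in range(num_drives)}
    for block in range(num_blocks):
        pair = block % num_pairs             # striping: rotate across mirrored pairs
        layout[2 * pair].append(block)       # first drive of the pair
        layout[2 * pair + 1].append(block)   # mirror copy on the second drive
    return layout


layout = raid10_placement(num_blocks=4)
# {0: [0, 2], 1: [0, 2], 2: [1, 3], 3: [1, 3]}
```

Every block appears on exactly two drives, so the array survives the loss of one drive in each pair, and only half of the raw capacity holds unique data.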

An upside of the level is that it offers higher performance through simultaneous read and write operations. The level also offers better security and reduces the chances of data loss.

A downside of the level is that it costs more to set up and maintain. Furthermore, it only offers half of the combined disk storage.
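The capacity trade-offs of the five levels can be compared with a small calculator. This is a sketch that assumes all drives are the same size and uses the standard minimum drive counts; the function name `usable_capacity` and the level labels are choices made for the example.

```python
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity(level: str, num_drives: int, drive_size_tb: float) -> float:
    """Usable capacity in TB for an array of equally sized drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "10" and num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    data_drives = {
        "0": num_drives,        # no redundancy
        "1": 1,                 # every drive holds the same copy
        "5": num_drives - 1,    # one drive's worth of parity
        "6": num_drives - 2,    # two drives' worth of parity
        "10": num_drives // 2,  # half the drives are mirror copies
    }[level]
    return data_drives * drive_size_tb


for level in ("0", "5", "6", "10"):
    print(f"RAID {level}: {usable_capacity(level, 4, 2.0)} TB usable of 8.0 TB raw")
```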

Since it offers both the advantages of RAID 1 and 0, the system is preferred in the following areas:

  • Low to medium workload environments.
  • Entry-level servers, blade servers, and external storage systems.
  • Gaming, music, and video editing.

Erasure coding schemes

Erasure coding is an efficient method for data recovery and error correction. The different implementations of the method are known as schemes. All erasure coding schemes add redundant data to a message to reconstruct the information and identify errors.

Before selecting an erasure coding scheme, you should consider the following:

  • How many nodes do you intend to use with the scheme?
  • What failure rate do you anticipate?
  • How long do you intend to spend waiting for a node to rebuild?

Below are some of the erasure coding schemes:

a. Reed-Solomon erasure coding

This is perhaps the most popular scheme in erasure coding. Reed-Solomon is a preferred option for protecting data against transmission errors.

The scheme encodes data using a polynomial before sending the data over a channel. The receiver can then use the polynomial to decode the data, correcting any errors that may have occurred during transmission.
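The polynomial idea can be sketched as follows. The data symbols are treated as values of a polynomial at x = 0, 1, ..., k-1, and the redundant symbols are further evaluations of the same polynomial. Any k surviving symbols determine the polynomial, so lost symbols can be recomputed. This sketch makes simplifying assumptions: it works over the small prime field of integers modulo 257 for readability (production codes typically use the field GF(2^8)), it only handles erasures whose positions are known, and the names `encode`, `recover`, and `interpolate_at` are hypothetical.

```python
P = 257  # small prime field, chosen for readability (an assumption of this sketch)


def interpolate_at(points: list[tuple[int, int]], x: int) -> int:
    """Evaluate at x the unique polynomial of degree < len(points) through the points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: list[int], num_parity: int) -> list[int]:
    """Return the data followed by num_parity extra evaluations of the data polynomial."""
    k = len(data)
    points = list(enumerate(data))
    return data + [interpolate_at(points, x) for x in range(k, k + num_parity)]


def recover(shares: dict[int, int], k: int) -> list[int]:
    """Rebuild the k data symbols from any k surviving {position: value} shares."""
    if len(shares) < k:
        raise ValueError("not enough surviving symbols")
    points = list(shares.items())[:k]
    return [interpolate_at(points, x) for x in range(k)]


data = [72, 101, 108, 108]           # four data symbols, each in the range 0..256
codeword = encode(data, num_parity=2)

# Symbols at positions 1 and 3 are lost; the other four are enough.
survivors = {pos: codeword[pos] for pos in (0, 2, 4, 5)}
restored = recover(survivors, k=4)   # equals data
```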

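The Reed-Solomon technique is storage efficient and has manageable encoding/decoding complexity. However, the number of errors it can identify and correct is limited by the number of parity symbols it adds.

Most Reed-Solomon systems can handle only a handful of failures, for example 4 or 5, before data is lost. This is because they use a relatively small number of parity symbols: a code with m parity symbols can recover at most m lost symbols whose positions are known, or correct at most ⌊m/2⌋ errors at unknown positions.

Decoding relies on established algorithms and results such as the Berlekamp-Massey algorithm, the Euclidean algorithm, and the Chinese Remainder Theorem.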

Reed-Solomon-based systems are best for:

  • Cloud storage vendors.
  • Data transmission and recovery experts.

b. XOR erasure coding

The scheme is computationally cheaper than Reed-Solomon codes, since encoding and recovery need only XOR operations. An XOR erasure coding system detects and corrects errors by using parity bits and parity checking.

Parity checking uses generated parity bits to check for errors during and after data transmission. The process begins by XORing (exclusive-ORing) the data bits to generate the parity. The error detection ability of XOR erasure codes makes them effective during data recovery.
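A minimal sketch of parity checking is shown below, assuming a single even-parity bit over the whole message. The function names `parity_bit` and `passes_parity_check` are hypothetical.

```python
def parity_bit(data: bytes) -> int:
    """Even-parity bit: 1 if the data contains an odd number of 1 bits, else 0."""
    bit = 0
    for byte in data:
        bit ^= bin(byte).count("1") & 1  # XOR in the parity of each byte
    return bit


def passes_parity_check(data: bytes, stored_parity: int) -> bool:
    """Recompute the parity and compare it with the one stored alongside the data."""
    return parity_bit(data) == stored_parity


original = b"RAID"
stored = parity_bit(original)

# Flip one bit in the first byte to simulate a transmission error.
corrupted = bytes([original[0] ^ 0b00000001]) + original[1:]

ok_before = passes_parity_check(original, stored)    # True
ok_after = passes_parity_check(corrupted, stored)    # False
```

A single parity bit detects any odd number of flipped bits but misses an even number of flips, which is one reason practical schemes keep parity over many smaller groups of bits.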

The scheme is applied in areas such as barcode and QR code readers.

c. Product matrix erasure coding

This scheme uses Reed-Solomon and XOR codes to form a data recovery system that is efficient in data transmission, error detection, and correction.

The product matrix erasure coding scheme uses product coding to prevent data loss. Product coding encodes data by multiplying a matrix by a vector of data symbols. This multiplication creates data that is resistant to errors.
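The matrix-times-vector step can be illustrated with a generic linear code. This is not the specific product-matrix construction, only a sketch of the encoding idea: the generator matrix `G`, the modulus 257, and the function name `encode` are assumptions chosen for the example.

```python
def encode(generator: list[list[int]], data: list[int], modulus: int = 257) -> list[int]:
    """Multiply the generator matrix by the data vector, modulo a prime."""
    if any(len(row) != len(data) for row in generator):
        raise ValueError("each matrix row needs one entry per data symbol")
    return [sum(g * d for g, d in zip(row, data)) % modulus for row in generator]


# Hypothetical 5x3 generator: an identity block keeps the data readable,
# and two extra rows add redundant linear combinations of the data.
G = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [1, 2, 3],
]

codeword = encode(G, [10, 20, 30])  # [10, 20, 30, 60, 140]

# If the second symbol is lost, the row [1, 1, 1] lets us solve for it.
recovered = (codeword[3] - codeword[0] - codeword[2]) % 257  # 20
```

Each redundant symbol is a known linear combination of the data, so a lost symbol can be found by solving the corresponding equations.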

Other factors to consider when selecting a data recovery option

a. Hardware Type

In this context, hardware type refers to the specific drive technology. For example, erasure coding is generally more efficient than RAID when using SSD storage.

Erasure coding encodes data into smaller chunks and then distributes those chunks across different storage devices, which helps improve data durability and availability.

On the other hand, RAID relies on mirroring or parity to protect data, which can be less space-efficient, particularly with mirroring.

b. Architecture

Erasure coding is a newer technology that is preferred in cloud and hyper-converged systems. It is more efficient than RAID when it comes to managing storage overhead costs. It can also provide better protection against data loss.

Erasure coding breaks data into smaller pieces and encodes them using a scheme. The encoded data is then stored across different storage devices or nodes, such as those in a hyper-converged system. If data is lost, it can be reconstructed from the remaining encoded pieces.

c. Available storage

All erasure coding schemes are characterized as storage efficient. To implement an erasure coding system, you need at least two drives. However, this isn't the case for every RAID level.

A RAID 10 configuration requires a minimum of four drives to implement. Thus, a RAID configuration can be costly and less storage efficient. Nonetheless, RAID is still a popular option due to its computational efficiency, whereas erasure coding requires considerable computation to perform well.
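The storage efficiency comparison can be made concrete by computing how much raw capacity each scheme consumes per unit of user data. The function name `storage_overhead` and the 4+2 erasure coding layout are assumptions picked for the example.

```python
def storage_overhead(data_units: int, redundancy_units: int) -> float:
    """Raw capacity consumed per unit of user data."""
    if data_units < 1 or redundancy_units < 0:
        raise ValueError("need at least one data unit and no negative redundancy")
    return (data_units + redundancy_units) / data_units


mirroring = storage_overhead(1, 1)        # 2.0: every block is stored twice
erasure_4_plus_2 = storage_overhead(4, 2)  # 1.5: four data chunks, two coded chunks
```

Under these assumptions, mirroring needs twice the raw capacity and tolerates one lost copy, while the 4+2 layout needs 1.5 times the raw capacity and tolerates two lost chunks, at the price of more computation.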

Conclusion

In this article, we discussed erasure coding and RAID implementations. There are noticeable differences that set these two methods apart.

For instance, RAID configurations are preferred in systems with limited computational resources. In contrast, erasure coding is preferred if you have ample computational resources, due to its space efficiency and data security.

Happy reading!