Monday, August 13, 2012

RAID and High-Availability Linux Solutions

I've been doing a lot of research lately on Linux RAID solutions for custom/home-built NAS systems, and on high availability for networked services. As usual, Linux has some of the best solutions available. In my research, I was able to test several solutions and technologies: Btrfs, mdadm with LVM, GlusterFS, DRBD and Heartbeat. I won't be explaining how to get up and running with these systems, but I will leave some of the references that I used to get going. In the end, I learned a lot about RAID in general and found mdadm with LVM to be the best solution for me. Btrfs was excellent in my tests but is still somewhat experimental and currently lacks RAID 5 support :(. Let's briefly get acquainted with these individual software systems.

Btrfs is an advanced filesystem that lets you aggregate disks and partitions into usable RAID arrays that can be quickly and easily mounted. It detects and corrects errors using checksums, and it supports snapshots and subvolumes for volume/storage management (as with LVM). As of today, Btrfs is still in development and not considered fully stable. Here are some references to get you up and running. Link1, Link2.
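To make the checksum idea a bit more concrete, here is a minimal Python sketch of checksum-based detection and repair on a mirrored pair. It is only a toy model, not how Btrfs is implemented: the function names (write_block, read_with_repair) are made up for illustration, each copy carries its own checksum for simplicity, and zlib.crc32 stands in for whatever checksum the filesystem really uses.

```python
import zlib


def write_block(data: bytes) -> dict:
    """Store a block together with a checksum of its contents (toy model)."""
    return {"data": bytearray(data), "checksum": zlib.crc32(data)}


def is_intact(copy: dict) -> bool:
    """A copy is intact when its data still matches the stored checksum."""
    return zlib.crc32(bytes(copy["data"])) == copy["checksum"]


def read_with_repair(copies: list) -> bytes:
    """Return data from the first intact copy and overwrite any corrupt copies."""
    good = next((bytes(c["data"]) for c in copies if is_intact(c)), None)
    if good is None:
        raise IOError("every copy failed its checksum; nothing to repair from")
    for c in copies:
        if not is_intact(c):
            c["data"] = bytearray(good)  # heal the bad copy from the good one
    return good


# Two mirrored copies of the same block (hypothetical data).
mirror = [write_block(b"important data"), write_block(b"important data")]
mirror[0]["data"][0] ^= 0xFF  # simulate silent corruption on the first disk
recovered = read_with_repair(mirror)
```

After the read, both copies hold the original bytes again. The point is that a plain mirror cannot tell which copy is wrong, while a checksum can.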

DRBD is a mirroring technology that syncs disks or partitions between two distinct machines. This is useful for setting up a backup/secondary server that should hold the same data as the primary. It is essentially RAID 1 over IP. Here are some resources. Link1, Link2
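The "RAID 1 over IP" idea can be sketched in a few lines of Python. This is a rough model under my own assumptions, not DRBD's code or API: MirroredDevice is a hypothetical class, the "remote" list stands in for the second machine's disk, and the write is modeled as fully synchronous (real DRBD setups can also be configured to replicate asynchronously).

```python
class MirroredDevice:
    """Toy block device that mirrors every write to a second machine."""

    def __init__(self, num_blocks: int, block_size: int = 4):
        self.block_size = block_size
        self.local = [bytes(block_size)] * num_blocks   # disk on machine A
        self.remote = [bytes(block_size)] * num_blocks  # disk on machine B

    def write(self, block_no: int, data: bytes) -> bool:
        """Write one block to both machines before acknowledging."""
        if len(data) != self.block_size:
            raise ValueError("writes must be exactly one block in size")
        self.local[block_no] = data
        self.remote[block_no] = data  # stands in for sending the block over the network
        return True  # acknowledged only after both copies are updated


dev = MirroredDevice(num_blocks=8)
dev.write(3, b"data")
```

Note that the device only sees numbered blocks, never files. That is what "block level" means here.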

Heartbeat is a high-availability solution that allows for system fail-over. What this means is that Heartbeat will provide access to the resources on machine A and, in the event that machine A goes down, it will notify machine B to wake up and take over the duties of machine A. This transfer of resources is seamless in most cases, and there may be only 2-6 seconds of downtime depending on how things are configured. Here are some resources. Link1, Link2
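The fail-over logic boils down to a timeout on missed heartbeats. Below is a small Python simulation of that idea. It uses a simulated clock rather than real networking, and the StandbyNode class and simulate function are hypothetical names for illustration. The keepalive and deadtime parameters are named after the timing settings in Heartbeat's configuration, but the values are arbitrary assumptions.

```python
class StandbyNode:
    """Machine B: watches for heartbeats from machine A."""

    def __init__(self, deadtime: float):
        self.deadtime = deadtime  # silence longer than this means A is dead
        self.last_seen = 0.0
        self.active = False

    def on_heartbeat(self, now: float) -> None:
        self.last_seen = now

    def tick(self, now: float) -> None:
        """Take over the resources once A has been silent for too long."""
        if not self.active and now - self.last_seen > self.deadtime:
            self.active = True


def simulate(fail_at: float, deadtime: float = 5.0,
             keepalive: float = 1.0, duration: float = 20.0):
    """Return the time at which B takes over, or None if it never does."""
    standby = StandbyNode(deadtime)
    t = 0.0
    while t <= duration:
        if t < fail_at:  # machine A is still alive and sending heartbeats
            standby.on_heartbeat(t)
        standby.tick(t)
        if standby.active:
            return t
        t += keepalive
    return None


takeover_time = simulate(fail_at=3.0)
```

With these assumed values, machine A dies at t=3 and machine B takes over at t=8. A shorter deadtime gives faster fail-over, at the risk of false alarms on a slow network.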

GlusterFS is a rather unique filesystem that enables a network of machines to combine their storage resources (disks, partitions, folders) and add RAID-like capabilities on top of these storage devices. This is similar to DRBD; however, I find it to be more flexible. While DRBD operates at a lower level (the block level), GlusterFS works at the filesystem level and operates on files. You can mirror folders to multiple peer machines (RAID 1 over the network), stripe data (RAID 0 over the network), combine the storage capacities of all peers and more. Resources: Link1, Link2
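Here is a rough Python sketch of what "operating on files" can look like for two of the modes above: mirroring a file to every peer, and combining capacity by placing each whole file on one peer. The function names and peer names are made up, and the MD5-based placement is only a stand-in assumption, not GlusterFS's actual hashing scheme. Striping is left out to keep it short.

```python
import hashlib


def distribute(filename: str, peers: list) -> list:
    """Combine capacity: each whole file lives on exactly one peer."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(peers)
    return [peers[index]]


def replicate(filename: str, peers: list) -> list:
    """Mirror (RAID 1 over the network): every peer gets a full copy."""
    return list(peers)


peers = ["server1", "server2", "server3"]  # hypothetical peer names
files = ["movie.mkv", "notes.txt", "backup.tar"]
placement = {name: distribute(name, peers) for name in files}
mirrored = {name: replicate(name, peers) for name in files}
```

Because placement is decided per file name, any peer can work out where a file lives without asking a central index, and losing one peer in the mirrored layout loses no files.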

Mdadm is the traditional Linux software RAID solution. It's been tried and tested and is still the go-to solution among Linux sysadmins. It supports many RAID levels, including RAID 0, RAID 1, RAID 5, RAID 10 and more. It is common to use LVM for volume management on top of an mdadm array. LVM allows for control and management of the storage pool, which makes it easy to grow or shrink volumes (which are like partitions on a hard drive), take snapshots, etc. Although I loved using Btrfs, as it works great as a RAID solution, it lacked RAID 5 support at the time (mdadm supports RAID 5; Btrfs support for it is due out in a later kernel release). I found the combination of mdadm and LVM more involved to set up than Btrfs, but still relatively easy. I spent quite some time simulating failed disks in both RAID 1 and RAID 5. In one test, I was playing a video from the local system when I simulated a failure on one of the disks. While I replaced the "failed" disk (still simulating) and the array rebuilt, the video kept on playing with no lag or downtime.
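The reason a RAID 5 array can keep serving data with one disk missing is XOR parity. Here is a minimal Python sketch of that idea, with made-up helper names (xor_blocks, make_stripe, rebuild). It is a simplification and not mdadm's code: it handles a single stripe and keeps the parity on the last "disk", whereas a real array rotates parity across the disks.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks: list) -> list:
    """Return the data blocks plus one parity block (one block per disk)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe: list, failed_index: int) -> bytes:
    """Recompute the block of a failed disk from all surviving disks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Three data disks plus parity (hypothetical contents).
stripe = make_stripe([b"abcd", b"efgh", b"ijkl"])
lost = 1  # pretend the second disk failed
restored = rebuild(stripe, lost)
```

Reads that hit the missing disk are answered by the same calculation on the fly, which would explain why the video kept playing during the rebuild.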

Mdadm Resources: Link1, Link2, Link3
LVM Resources: Link1, Link2, Link3