Software RAID in Linux


Software RAID in Linux - overview

This article focuses on managing software RAID level 1 (RAID1) in Linux, but a similar approach can be used for other RAID levels.

Software RAID in Linux can be managed with the mdadm tool.

RAID devices appear as /dev/mdX, where X is the number of the RAID device, for example /dev/md0 or /dev/md1.
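
For a quick one-line summary of a single md device, you can use mdadm's query mode (a minimal example - the device name will of course differ on your system):

# mdadm --query /dev/md0

The output is a single line with the RAID level and the number of member devices; mdadm --detail, shown later in this article, gives the full picture.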

To list all devices in the system, including RAID devices, use fdisk:

# fdisk -l

Disk /dev/md0: 1044 MB, 1044512768 bytes
2 heads, 4 sectors/track, 255008 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/sda: 80.0 GB, 80032038912 bytes
255 heads, 63 sectors/track, 9730 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         127     1020096   fd  Linux raid autodetect
/dev/sda2             128        9730    77136097+  fd  Linux raid autodetect

Disk /dev/sdb: 80.0 GB, 80032038912 bytes
255 heads, 63 sectors/track, 9730 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1         127     1020096   fd  Linux raid autodetect
/dev/sdb2             128        9730    77136097+  fd  Linux raid autodetect

Disk /dev/md1: 78.9 GB, 78987264000 bytes
2 heads, 4 sectors/track, 19284000 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

    Device Boot      Start         End      Blocks   Id  System


Warnings about the lack of a "valid partition table" are normal here - md devices such as /dev/md0 (which holds swap) are used directly and have no partition table on them.
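
Another quick way to see all md devices and their state is /proc/mdstat (used later in this article to watch a rebuild in progress):

# cat /proc/mdstat

In the output, a healthy RAID1 array shows [UU]; an underscore in place of a U, as in the rebuild example further below, marks a missing or failed mirror.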

Using the mdadm tool

Viewing RAID devices

This shows details for the device /dev/md0 - it has two RAID devices, both of which are active, working and in sync:

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Wed Nov 30 20:42:26 2005
     Raid Level : raid1
     Array Size : 1020032 (996.29 MiB 1044.51 MB)
    Device Size : 1020032 (996.29 MiB 1044.51 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Dec  1 13:04:19 2005
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 131294eb:84dbaed1:e44abf9b:340c65a3
         Events : 0.65

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

Simulating a hardware failure

This shows details for the device /dev/md1 - it has two RAID devices, but only one of them is active and working. One RAID device is marked as removed; this was caused by simulating a hardware failure:

  • booting with only the first disk to see if RAID is configured properly,
  • booting with only the second disk to see if RAID is configured properly,
  • booting the server again with both disks.

After this exercise, /dev/md1 looked like this:
# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.01
  Creation Time : Wed Nov 30 20:42:26 2005
     Raid Level : raid1
     Array Size : 77136000 (73.56 GiB 78.99 GB)
    Device Size : 77136000 (73.56 GiB 78.99 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Thu Dec  1 14:25:12 2005
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : c36b5402:58ba0631:b2266f01:15bb8173
         Events : 0.19308

    Number   Major   Minor   RaidDevice State
       0       0        0        -      removed
       1       8        2        1      active sync   /dev/sda2

One RAID device is marked as "removed" because it is no longer in sync (it is "older") with the other ("newer") device.


Recovering from a simulated hardware failure

This part is easy: just mark the device as faulty, remove it from the array, and then add it again - it will start to reconstruct.

Mark the device as faulty:

# mdadm /dev/md0 -f /dev/sda1
mdadm: set /dev/sda1 faulty in /dev/md0


Remove the device from the array:

# mdadm /dev/md0 -r /dev/sda1
mdadm: hot removed /dev/sda1


Add the device to the array:

# mdadm /dev/md0 -a /dev/sda1
mdadm: hot added /dev/sda1


Check what the device is doing:

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Wed Nov 30 20:42:26 2005
     Raid Level : raid1
     Array Size : 1020032 (996.29 MiB 1044.51 MB)
    Device Size : 1020032 (996.29 MiB 1044.51 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Dec  1 15:10:29 2005
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 38% complete

           UUID : 131294eb:84dbaed1:e44abf9b:340c65a3
         Events : 0.68

    Number   Major   Minor   RaidDevice State
       0       0        0        -      removed
       1       8       17        1      active sync   /dev/sdb1

       2       8        1        0      spare rebuilding   /dev/sda1

As we can see, it's being rebuilt - after that process is finished, both devices (/dev/sda1 and /dev/sdb1) will be marked as "active sync".

You can see in /proc/mdstat how long this process will take and at what speed the reconstruction is progressing:

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
      1020032 blocks [2/1] [_U]
      [==>..................]  recovery = 13.0% (133760/1020032) finish=0.2min speed=66880K/sec
unused devices: <none>


Recovering from a real hardware failure

This process is similar to recovering from a "simulated failure".

To recover from a real hardware failure, do:

  • make sure that partitions on a new device are the same as on the old one:
    • create them with fdisk (fdisk -l will tell you what partitions you have on a good disk; remember to set the same start/end blocks, and to set the partition's system id to "Linux raid autodetect") - or copy the partition table with sfdisk, as in the sketch after this list
    • consult /etc/mdadm.conf file, which describes which partitions are used for md devices
  • add a new device to the array:
# mdadm /dev/md0 -a /dev/sda1
mdadm: hot added /dev/sda1
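
Instead of recreating the partitions by hand with fdisk, you can also copy the partition table from the healthy disk with sfdisk - a sketch assuming /dev/sdb is the healthy disk and /dev/sda is the new, empty replacement (double-check the device names first, as this overwrites the partition table of the target disk):

# sfdisk -d /dev/sdb | sfdisk /dev/sda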

Then, you can consult mdadm --detail /dev/md0 and/or /proc/mdstat to see how long the reconstruction will take.
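
To follow the rebuild continuously instead of re-running the commands by hand, you can use watch, for example (the 5-second interval is arbitrary):

# watch -n 5 cat /proc/mdstat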

Make sure you run lilo when the reconstruction is complete - see below.

RAID boot CD-ROM

It's always a good idea to have a CD-ROM from which you can boot your system (in case lilo was removed, etc.).

It can be created with the mkbootdisk tool:

# mkbootdisk --iso --device /root/raid-boot.iso `uname -r`

Then, just burn the created ISO.
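
How exactly you burn it depends on your burning tool and drive; a sketch using cdrecord, assuming the burner is reachable as /dev/cdrom (on some setups a different dev= specification is needed - see cdrecord -scanbus):

# cdrecord -v dev=/dev/cdrom /root/raid-boot.iso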


If everything fails

If everything fails - the system doesn't boot from any of the disks nor from the CD-ROM - you should know that you can still easily "see" the files on RAID devices (at least on RAID1 devices): just boot the system from any Live Linux distribution. You should see the files on the normal /dev/sdX partitions, and you can copy them to a remote system, for example with scp.

You can manually assemble a RAID device using the commands below:

modprobe raid1
modprobe dm-mod
mdadm --assemble --verbose /dev/md1  /dev/sda2 /dev/sdb2
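
Once the array is assembled, you can mount it read-only and copy the data off; a sketch assuming /dev/md1 holds the root filesystem, and "backuphost" and the destination directory are hypothetical:

# mkdir -p /mnt/raid
# mount -o ro /dev/md1 /mnt/raid
# scp -r /mnt/raid/etc user@backuphost:/backup/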

Installing lilo

You have to install lilo on all disks if you replaced any of them:

# lilo
Added linux *
Added failsafe
The boot record of  /dev/md1  has been updated.
The Master boot record of  /dev/sdb  has been updated.
The Master boot record of  /dev/sda  has been updated.


If lilo gives you the following error:

# lilo
Fatal: Trying to map files from unnamed device 0x0000 (NFS/RAID mirror down ?)

This may mean one of two things:

  • RAID is being rebuilt - check it with cat /proc/mdstat, and try again when it's finished.
  • The first device in the RAID array doesn't exist, for example when a degraded array was built with only one device. If you stop the array and reassemble it so that the active device comes first, lilo should start working again (see the sketch below).
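
A sketch of stopping and reassembling such an array so that the remaining active device comes first, assuming /dev/sda2 is the surviving member of /dev/md1 (adjust to your layout; --run starts the array even though it is degraded):

# mdadm --stop /dev/md1
# mdadm --assemble --run /dev/md1 /dev/sda2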


Example lilo.conf for RAID

default="linux"
boot=/dev/md1
map=/boot/map
keytable=/boot/us.klt
raid-extra-boot=mbr
menu-scheme=wb:bw:wb:bw
prompt
nowarn
timeout=30
message=/boot/message
image=/boot/vmlinuz
        label="linux"
        root=/dev/md1
        initrd=/boot/initrd.img
        append=" resume=/dev/md0"
        vga=791
image=/boot/vmlinuz
        label="failsafe"
        root=/dev/md1
        initrd=/boot/initrd.img
        append=" failsafe resume=/dev/md0"