mdadm does not add spares to arrays on boot

Bug #252365 reported by JanCeuleers
This bug affects 2 people
Affects            Status        Importance  Assigned to  Milestone
initramfs-tools    Fix Released  Undecided   Unassigned
mdadm (Ubuntu)     Invalid       Undecided   Unassigned

Bug Description

Binary package hint: mdadm

After rebooting my server, spares are not automatically re-inserted into my RAID5 and RAID1 arrays.

I was able to get the above to work correctly by explicitly telling mdadm which partitions to scan:

# extract from /etc/mdadm/mdadm.conf
#DEVICE partitions
DEVICE /dev/sd[abcdef][12]

I think this should also work properly with "DEVICE partitions".

Some information about my system follows, along with detailed disk and md configurations.

root@via:~# apt-cache show mdadm
Package: mdadm
Priority: optional
Section: admin
Installed-Size: 612
Maintainer: Ubuntu Core Developers <email address hidden>
Original-Maintainer: Debian mdadm maintainers <email address hidden>
Architecture: i386
Version: 2.6.2-1ubuntu2
Replaces: mdctl
Depends: libc6 (>= 2.6), makedev, debconf (>= 0.5) | debconf-2.0, lsb-base (>= 3.1-6), udev (>= 113-0ubuntu1), initramfs-tools (>> 0.85eubuntu11), debconf (>= 1.4.72)
Recommends: mail-transport-agent, module-init-tools
Conflicts: mdctl (<< 0.7.2), raidtools2 (<< 1.00.3-12.1)
Filename: pool/main/m/mdadm/mdadm_2.6.2-1ubuntu2_i386.deb
Size: 219708
MD5sum: 67dc0977d5818218a6eead01ec52dee4
SHA1: adcf0faafc84a241668d6b6c3a0069f59dcab14c
SHA256: 8a44014791a370fd7ef1e5c7f161d042249388897cfabacb0b141682f203ff13
Description: tool to administer Linux MD arrays (software RAID)
 mdadm is a program that can be used to create, manage, and monitor MD
 arrays (e.g. software RAID, multipath devices).
 .
 This package automatically configures mdadm to assemble arrays during the
 system startup process. If not needed, this functionally can be disabled.
Bugs: mailto:<email address hidden>
Origin: Ubuntu

root@via:~# lsb_release -rd
Description: Ubuntu 7.10
Release: 7.10

root@via:~# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
#DEVICE partitions
DEVICE /dev/sd[abcdef][12]

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid1 num-devices=2 spares=1 UUID=19e69537:f7a6aec8:5a5f7576:3fc29e0d spare-group=swapgroup
ARRAY /dev/md1 level=raid1 num-devices=2 spares=1 UUID=5e728621:c8b356a8:01f8e270:f0e280cb spare-group=swapgroup
ARRAY /dev/md2 level=raid5 num-devices=4 spares=1 UUID=714e46f1:479268a7:895e209c:936fa570

# This file was auto-generated on Fri, 22 Jun 2007 19:12:10 +0000
# by mkconf $Id: mkconf 261 2006-11-09 13:32:35Z madduck $
MAILFROM <email address hidden>

root@via:~# sfdisk -l /dev/sda

Disk /dev/sda: 38913 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start End #cyls #blocks Id System
/dev/sda1 0+ 121 122- 979933+ fd Linux raid autodetect
/dev/sda2 122 38912 38791 311588707+ fd Linux raid autodetect
/dev/sda3 0 - 0 0 0 Empty
/dev/sda4 0 - 0 0 0 Empty

root@via:~# sfdisk -l /dev/sdb

Disk /dev/sdb: 38913 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start End #cyls #blocks Id System
/dev/sdb1 0+ 121 122- 979933+ fd Linux raid autodetect
/dev/sdb2 122 38912 38791 311588707+ fd Linux raid autodetect
/dev/sdb3 0 - 0 0 0 Empty
/dev/sdb4 0 - 0 0 0 Empty

root@via:~# sfdisk -l /dev/sdc

Disk /dev/sdc: 38913 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start End #cyls #blocks Id System
/dev/sdc1 0+ 121 122- 979933+ fd Linux raid autodetect
/dev/sdc2 122 38912 38791 311588707+ fd Linux raid autodetect
/dev/sdc3 0 - 0 0 0 Empty
/dev/sdc4 0 - 0 0 0 Empty

root@via:~# sfdisk -l /dev/sdd

Disk /dev/sdd: 38913 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start End #cyls #blocks Id System
/dev/sdd1 0+ 121 122- 979933+ fd Linux raid autodetect
/dev/sdd2 122 38912 38791 311588707+ fd Linux raid autodetect
/dev/sdd3 0 - 0 0 0 Empty
/dev/sdd4 0 - 0 0 0 Empty

root@via:~# sfdisk -l /dev/sde

Disk /dev/sde: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start End #cyls #blocks Id System
/dev/sde1 0+ 60800 60801- 488384001 83 Linux
/dev/sde2 0 - 0 0 0 Empty
/dev/sde3 0 - 0 0 0 Empty
/dev/sde4 0 - 0 0 0 Empty

root@via:~# sfdisk -l /dev/sdf

Disk /dev/sdf: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start End #cyls #blocks Id System
/dev/sdf1 0+ 121 122- 979933+ fd Linux raid autodetect
/dev/sdf2 122 38912 38791 311588707+ fd Linux raid autodetect
/dev/sdf3 38913 60800 21888 175815360 83 Linux
/dev/sdf4 0 - 0 0 0 Empty

root@via:~# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Fri Jun 22 20:56:08 2007
     Raid Level : raid1
     Array Size : 979840 (957.04 MiB 1003.36 MB)
  Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
   Raid Devices : 2
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Jul 27 18:18:46 2008
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

           UUID : 19e69537:f7a6aec8:5a5f7576:3fc29e0d
         Events : 0.52

    Number Major Minor RaidDevice State
       0 8 1 0 active sync /dev/sda1
       1 8 33 1 active sync /dev/sdc1

       2 8 81 - spare /dev/sdf1

root@via:~# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.03
  Creation Time : Fri Jun 22 20:56:31 2007
     Raid Level : raid1
     Array Size : 979840 (957.04 MiB 1003.36 MB)
  Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sun Jul 27 18:51:38 2008
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 5e728621:c8b356a8:01f8e270:f0e280cb
         Events : 0.54

    Number Major Minor RaidDevice State
       0 8 17 0 active sync /dev/sdb1
       1 8 49 1 active sync /dev/sdd1

root@via:~# mdadm --detail /dev/md2
/dev/md2:
        Version : 00.90.03
  Creation Time : Fri Jun 22 20:56:51 2007
     Raid Level : raid5
     Array Size : 934765824 (891.46 GiB 957.20 GB)
  Used Dev Size : 311588608 (297.15 GiB 319.07 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sun Jul 27 19:07:06 2008
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 714e46f1:479268a7:895e209c:936fa570
         Events : 0.13760

    Number Major Minor RaidDevice State
       0 8 2 0 active sync /dev/sda2
       1 8 18 1 active sync /dev/sdb2
       2 8 34 2 active sync /dev/sdc2
       3 8 50 3 active sync /dev/sdd2

       4 8 82 - spare /dev/sdf2

root@via:~# mount
/dev/hda5 on / type ext3 (rw,noatime,nodiratime,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
/sys on /sys type sysfs (rw,noexec,nosuid,nodev)
varrun on /var/run type tmpfs (rw,noexec,nosuid,nodev,mode=0755)
varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
udev on /dev type tmpfs (rw,mode=0755)
devshm on /dev/shm type tmpfs (rw,size=102400000)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
lrm on /lib/modules/2.6.22-15-generic/volatile type tmpfs (rw)
/dev/hda1 on /boot type ext3 (rw,noatime,nodiratime)
/dev/md2 on /raid5 type ext3 (rw)
none on /sys/kernel/config type configfs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)

Revision history for this message
Bjorn Ruud (bjorn-ruud) wrote :

I can confirm this problem on Ubuntu Server 8.04. I have the same RAID5 configuration as the reporter (4 active disks, 1 spare).

In my case the spare isn't added to the array on boot no matter what changes are made to mdadm.conf. Stating the devices explicitly doesn't help.

$ cat /etc/mdadm/mdadm.conf
[snip]
DEVICE partitions
[snip]
ARRAY /dev/md0 level=raid5 num-devices=4 spares=1 UUID=d7ae6761:1abc675c:6e03419f:9b214e6e

$ sudo mdadm --detail --scan
ARRAY /dev/md0 level=raid5 num-devices=4 spares=1 UUID=d7ae6761:1abc675c:6e03419f:9b214e6e

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdf1[4](S) sda1[0] sdd1[3] sdc1[2] sdb1[1]
      2197715712 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Aug 9 15:29:00 2007
     Raid Level : raid5
     Array Size : 2197715712 (2095.91 GiB 2250.46 GB)
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Nov 28 15:06:38 2008
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : d7ae6761:1abc675c:6e03419f:9b214e6e
         Events : 0.76

    Number Major Minor RaidDevice State
       0 8 1 0 active sync /dev/sda1
       1 8 17 1 active sync /dev/sdb1
       2 8 33 2 active sync /dev/sdc1
       3 8 49 3 active sync /dev/sdd1

       4 8 81 - spare /dev/sdf1

Revision history for this message
JanCeuleers (jan-ceuleers) wrote :

I'm sorry for having let this bug linger for such a long time. I think I have now found the proper solution to this problem.

Restating the problem: I have six disks, four of which are served by a sata-sil24 controller and the other two by a sata-via controller. The active partitions of my RAID5 set live on the first four disks; the spare partitions live on one of the other two disks (the other is used for backup).

With the above setup, the active members of the RAID set would come up fine on boot, but the spares would not be added automatically, so I had to add them from /etc/rc.local.
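For reference, that kind of workaround would have looked roughly like the following (a sketch only, using the array and spare device names from my configuration above):

# excerpt from /etc/rc.local (sketch): re-add the spare partitions that were not picked up at boot
mdadm /dev/md0 --add /dev/sdf1
mdadm /dev/md2 --add /dev/sdf2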

On my system, this problem has been solved by including both SATA modules in the initrd image. That is: I added sata-via and sata-sil24 to /etc/initramfs-tools/modules, rebuilt the initrd images using update-initramfs -u -k all, and everything is now activated automatically on boot.
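In concrete terms, the steps were along these lines (the commands are illustrative; editing the file by hand works just as well, and the canonical module names use underscores, i.e. sata_via and sata_sil24):

# append the SATA controller modules to the initramfs module list
echo sata_via >> /etc/initramfs-tools/modules
echo sata_sil24 >> /etc/initramfs-tools/modules
# rebuild the initrd images for all installed kernels
update-initramfs -u -k all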

Bjorn Ruud: can you let us know whether your spare disk is also served by a controller that requires a different module from the one serving your active disks? If so, would you give the above a go and report back?

Revision history for this message
Bjorn Ruud (bjorn-ruud) wrote :

JanCeuleers: Your solution works!

My setup is similar to yours. I have an nVidia mainboard SATA controller with the system disk and spare connected, and a Promise TX4 SATA PCI card with four disks hosting a RAID5 volume. I added the modules sata_nv and sata_promise to /etc/initramfs-tools/modules, rebuilt the initrd, rebooted, and now it works as intended.

Thank you for finding a solution. I hope the Ubuntu devs can find a way to automate this or inform users about it.

Revision history for this message
JanCeuleers (jan-ceuleers) wrote :

I have been advised on the linux-raid mailing list [1] that this problem is more likely to be caused by mkinitrd than by mdadm, and that I should therefore file a bug against initramfs-tools.

[1]: http://marc.info/?l=linux-raid&m=123421650832520&w=2

Revision history for this message
Christian Reis (kiko) wrote :

I have this same symptom, but all my disks are on the same controller, so the issue is likely to be different. I wonder what is causing the spares not to be picked up -- could it be that mdadm --scan is not doing the right thing?

Revision history for this message
MNLipp (mnl) wrote :

I can confirm the problem (for 9.04) and I have all disks on the same controller (raid5 with 3 + 1 spare). The spare is recognized sometimes, but not always.

Looking at the boot messages, I strongly suspect that it depends on the sequence in which the disks become available to mdadm. The disks are /dev/sd?5 with ?=[abcd]; the spare is /dev/sdd5. If the sequence is "bind<sd?5>" with ?=[abc], the spare is not recognized. If one or more of the [abc] binds show up after "bind<sdd5>", then the spare is recognized.

It seems that spares are not added once the RAID has already been set active.

Revision history for this message
sbrady69 (cesurasean) wrote :

I confirm this same problem on Debian Lenny.

I was not able to fix the problem by adding DEVICE /dev/hd[abcdef][12] or DEVICE /dev/sd[abcdef][12] to mdadm.conf.

This is for IDE hard drives; I assume support for such devices has degraded.

Revision history for this message
sbrady69 (cesurasean) wrote :

If this happens on Debian Lenny after an update, with a RAID1 configuration, change the number of spares in mdadm.conf. For some reason my array with two devices and one spare did not trigger emails before, but now it does; I had to change spares=1 to spares=0.
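For example, with a hypothetical ARRAY line (the UUID is a placeholder), the change amounts to editing mdadm.conf from:

# before
ARRAY /dev/md0 level=raid1 num-devices=2 spares=1 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
# after
ARRAY /dev/md0 level=raid1 num-devices=2 spares=0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx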

Revision history for this message
Christian Reis (kiko) wrote :

MNLipp, sbrady69: I think we are actually seeing a different problem from the one in this bug. Jan's issue is that his spare hangs off a separate controller (whose driver wasn't available in the stock initramfs he was using).

I spent a lot of time researching this yesterday and found that it is actually a race condition: the md devices are assembled too early, before the spare drive has been scanned. I had an additional issue in that I was referring to the drives in mdadm.conf explicitly instead of via UUIDs. I worked around this reliably by using UUIDs in mdadm.conf and by adding a custom sleep script in /etc/initramfs-tools/scripts/local-premount/. I'm posting an entry in my diary in case people are interested: http://async.com.br/~kiko/diary.html?date=09.12.2009
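For anyone who wants to try the same approach, the delay script might look roughly like this; it is only a sketch (the file name and the 10-second delay are illustrative, not the exact contents of the script described in the diary entry):

#!/bin/sh
# /etc/initramfs-tools/scripts/local-premount/wait-for-disks  (hypothetical name)
# Standard initramfs-tools prereq boilerplate so the script is ordered correctly.
PREREQ=""
prereqs()
{
        echo "$PREREQ"
}
case "$1" in
    prereqs)
        prereqs
        exit 0
        ;;
esac
# Pause early boot so slower drives (e.g. the spare) have time to be detected
# before boot continues and the root filesystem is mounted.
sleep 10

The script needs to be made executable (chmod +x) and the initramfs rebuilt afterwards with update-initramfs -u.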

Revision history for this message
ceg (ceg) wrote :

I'd consider the missing-modules issue fix released, right?

Christian Reis: does the race still occur in recent releases?
Why is the initramfs udev killed instead of being transitioned to the rootfs?

https://wiki.ubuntu.com/ReliableRaid

Changed in initramfs-tools:
status: New → Fix Released
Changed in mdadm (Ubuntu):
status: New → Incomplete
Revision history for this message
ceg (ceg) wrote :

Does Bug #550131 "initramfs missing /var/run/mdadm dir (loosing state)" fix this?

Revision history for this message
JanCeuleers (jan-ceuleers) wrote :

Many years later, I can no longer reproduce this.

Changed in mdadm (Ubuntu):
status: Incomplete → Invalid