Sun, Nov. 22nd, 2009, 04:13 am
On Vox: Fedora 12, Dracut, dmraid, mdadm, oh my!

It appears that Fedora 12 moved to a new early-boot (initramfs) system called dracut.  Sadly, due to a number of odd circumstances, this has caused me much pain.  Here's my basic config:

  • /boot and /  on /dev/sda
  • /var and /home on a partitioned software raid on /dev/sd{c,d}

After a yum-based upgrade to Fedora 12 I rebooted.  The boot got to the point where the software raid is initialized and boom: failure.  I'd seen this before; partitioned raid has always had some trouble in Fedora.  Previously I had to modify the rc.sysinit script to reset the raid partitions, so I tried that again, moving that init to later in the boot sequence.  Reboot and yes, it works.

However, I then noticed some odd things.  I was only getting a single drive in my mirrored RAID.  Further investigation revealed that a device dm-1 was listed in /proc/mdstat instead of sdc or sdd.  Uh oh.
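A quick way to spot this symptom is to scan /proc/mdstat for array members that are device-mapper nodes. The function below is only a sketch: the name md_dm_members is made up for illustration, and it assumes the usual mdstat layout where each array line looks like "md_d0 : active raid1 dm-1[0]".

```shell
# Print "array member" for every md member that is a device-mapper node (dm-N).
# Reads /proc/mdstat by default; pass another file to test against saved output.
md_dm_members() {
    awk '$1 ~ /^md/ && $2 == ":" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /^dm-[0-9]+\[/) {
                member = $i
                sub(/\[.*/, "", member)   # strip the "[0]" slot suffix
                print $1, member
            }
    }' "${1:-/proc/mdstat}"
}
```

No output means no md array is sitting on top of a dm device.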

Looking more closely, it appears that my drives were getting set up by dmraid as a fake-raid mirror:

# dmraid -r 
/dev/sdd: sil, "sil_aiabafajfgba", mirror, ok, 488395120 sectors, data@ 0
/dev/sdc: sil, "sil_aiabafajfgba", mirror, ok, 488395120 sectors, data@ 0
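If you want to script this check, the listing can be reduced to one "device format" pair per disk. This is a sketch that assumes the line layout shown in the output above; the function name dmraid_members is hypothetical.

```shell
# Reduce `dmraid -r` style lines on stdin to "device metadata-format" pairs.
dmraid_members() {
    awk -F'[:,] *' '/^\/dev\// { print $1, $2 }'
}

# Usage on a live system (needs root): dmraid -r | dmraid_members
```

Any line that comes out means dmraid thinks that disk carries fake-raid metadata.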

I tried adding the nodmraid option to grub.conf, but then the new dracut system started an infinite spew of messages generated by this mdadm error message string (lifted from Assemble.c):

fprintf(stderr, Name ": WARNING %s and %s appear"
" to have very similar superblocks.\n"
" If they are really different, "
"please --zero the superblock on one\n"
" If they are the same or overlap,"
" please remove one from %s.\n",
devices[best[i]].devname, devname,
inargv ? "the list" :
"the\n DEVICE list in mdadm.conf"

Drat! The fake-raid mirror had already mangled my second drive by duplicating the superblock!  Plus, since all this was going on inside dracut, I couldn't fix it.  So I removed the nodmraid option in grub during boot and dug a little deeper.  I found that I could keep dracut from doing all this nonsense by adding the following kernel options:

rd_NO_MD rd_NO_DM nodmraid
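Since these get typed by hand at the grub prompt, it is worth confirming after boot that all three actually made it onto the kernel command line. A minimal sketch; missing_boot_opts is a made-up helper name, not part of dracut or grub.

```shell
# Print each of the three options that is absent from the given command line.
missing_boot_opts() {
    cmdline="$1"
    for opt in rd_NO_MD rd_NO_DM nodmraid; do
        case " $cmdline " in
            *" $opt "*) ;;                 # present as a whole word
            *) printf '%s\n' "$opt" ;;     # absent: report it
        esac
    done
}

# Usage on the running system: missing_boot_opts "$(cat /proc/cmdline)"
```

Empty output means all three options are in effect for the current boot.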

This allows for a minimal boot without dmraid or mdadm.  After that I was dropped into single-user mode with the duplicate superblock message.  Fixing this required zeroing the md superblock on sdd1:

# mdadm --zero-superblock /dev/sdd1

And then rebooting (again!).

Once past this, things started working somewhat normally.  To get my raid mirrored again I did the normal thing:

# mdadm --manage /dev/md_d0 --add /dev/sdd1
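Re-adding the drive kicks off a rebuild, and the progress shows up in /proc/mdstat. The helper below is a sketch that assumes the progress line contains "recovery = N%" (or "resync = N%"), which is how I remember mdstat printing it; md_recovery_progress is a hypothetical name.

```shell
# Print the rebuild percentage(s) reported in /proc/mdstat, if any.
# Pass a file name to check saved output instead of the live file.
md_recovery_progress() {
    awk '{
        for (i = 1; i <= NF - 2; i++)
            if (($i == "recovery" || $i == "resync") && $(i + 1) == "=")
                print $(i + 2)
    }' "${1:-/proc/mdstat}"
}
```

No output means no rebuild is in progress, or the line format differs from what is assumed here.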

To get rid of the false-positive fake-raid metadata, I found that the dmraid tool itself can erase it:

[root@mirth ~]# dmraid -E -r /dev/sdd

Do you really want to erase "sil" ondisk metadata on /dev/sdd ? [y/n] :y

[root@mirth ~]# dmraid -E -r /dev/sdc

Do you really want to erase "sil" ondisk metadata on /dev/sdc ? [y/n] :y

The really odd thing about this whole incident is that I never had these drives in a fake raid setup before. 

In any case, hope this helps the few other people who might have this same problem.

Originally posted on paul.vox.com

Mon, Nov. 23rd, 2009 03:25 pm (UTC)
deviant_: Problems with dmraid

One of the big problems we've had while implementing dmraid support is that some of the metadata formats are essentially indistinguishable from random data. Usually ddf1 is the false-positive champion (you can find valid looking ddf1 metadata on many brand new, uninitialized disks...), but it has been known to happen with others.
