Solved

how to start/stop /dev/md0 RAID devices

Posted on 2013-06-02
826 Views
Last Modified: 2013-06-17
I have set up a RAID-6 on my Linux host using four 2TB drives. I've also copied about 1.8TB of data to the RAID and it has been a live production NAS device on the office LAN for the past 32 hours.

I followed the well-written and extensive instructions at https://raid.wiki.kernel.org/index.php/RAID_setup. However, while this wiki drills down into mind-numbing benchmark stats and numerous esoteric configuration options, it doesn't actually give the fundamental, straightforward HOWTO on starting and stopping the RAID at boot and shutdown time.

Here's what I've done so far:

mdadm --create /dev/md0 --metadata 1.2 --verbose --level=6 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

mke2fs -v -m .1 -b 4096 -E stride=128,stripe-width=384 /dev/md0

mdadm --assemble --scan --uuid=39edeb69:297e340f:0e3f4469:81f51a6c

In /etc/fstab I have the following entry, and I have mounted the RAID via `mount /mnt/RAID`:
/dev/md0        /mnt/RAID        ext2        defaults,uid=99,gid=99,umask=0660,dmask=0771  1   1
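
I also saved the array definition to /etc/mdadm.conf so it can be assembled by UUID later (roughly what I ran):

mdadm --detail --scan >> /etc/mdadm.conf    # appends the ARRAY line with the array's UUID
cat /proc/mdstat                            # sanity check: md0 active, all four members [UUUU]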


WHAT NEXT?

How do I start this at boot time? As configured, the fstab entry will automatically mount the /dev/md0 device, but I've gleaned from the docs that the RAID has to be "assembled" each time it is started (and perhaps "assemble" is a synonym for "start"?). I've created a boot script as:

/sbin/mdadm --assemble --scan --uuid=39edeb69:297e340f:0e3f4469:81f51a6c

which will run at boot time. Is this correct? If so, is this sufficient? Should I do this step before or after the device is mounted? I don't remember whether I did the mdadm --assemble or the mount first when I did all this by hand (that was 3 days ago). If I should "assemble" first, then I'll change the fstab entry to 'noauto' and have the startup script mount after assembling.

What about shutdown? I can put the command `/sbin/mdadm --stop /dev/md0` into a shutdown script, but the same question applies to the mount: should I unmount /dev/md0 before doing the "stop", or just do the stop and let the system handle the unmount? It seems to me the unmounts happen after all shutdown scripts have been run, so does stopping the RAID device before the unmount mess up the flush?

This is relatively urgent because I am very reluctant to shut down this computer right now. I am worried that 3 days' worth of data copying might be corrupted or lost.

PLEASE ADVISE! THX.
Question by:jmarkfoley
22 Comments
 
LVL 76

Expert Comment

by:arnold
ID: 39215002
You should have the entry in /etc/mdadm.conf.
Use chkconfig to make sure mdadm starts at boot.


The start/stop script for mdmonitor/mdadm should deal with this.
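On a Red Hat-style distro that would be roughly (a sketch, assuming the mdmonitor service is installed):

chkconfig --list mdmonitor    # see whether it is enabled for your runlevels
chkconfig mdmonitor on        # enable it at boot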

The issue you may face is adding the md device to /etc/fstab, which deals with mounting the resource.

Which Linux distro are you using?
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39215037
I'm using Slackware, so no chkconfig.

The only entry in /etc/mdadm.conf is:

ARRAY /dev/md0 metadata=1.2 name=OHPRSstorage:0 UUID=39edeb69:297e340f:0e3f4469:81f51a6c

per the instructions referenced in the link in my original post: https://raid.wiki.kernel.org/index.php/RAID_setup#Saving_your_RAID_configuration

> The start/stop script for mdmonitor/mdadm should deal with this.

Yes, I know how to stop/start, the question is how to start/stop at boot/shutdown and what needs to be done with the mount/umount of the /dev/md0 (if anything) and in what order.

Redhat also has boot/shutdown scripts, albeit in /etc/init.d, I believe. Do you have Redhat examples for the mdadm RAID system you could post?
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39215062
The example shown at http://www.tcpdump.com/kb/os/linux/starting-and-stopping-raid-arrays.html does say to unmount the filesystem first. So perhaps my shutdown script should be:

/sbin/umount -f /dev/md0
/sbin/mdadm --stop /dev/md0

The example doesn't mention mounting at startup, but if the shutdown example is correct (which it might not be!), it seems that the reverse should be true at startup:

/sbin/mdadm --assemble --scan --uuid=39edeb69:297e340f:0e3f4469:81f51a6c
mount /dev/md0

I believe all auto mounts in /etc/fstab are mounted before startup scripts are run. Therefore, if the above startup script is correct, I need to make the /dev/md0 entry in /etc/fstab noauto and let the startup script do the mounting.

I'm soooooo confused!!!!

Somebody out there has done this, right?
 
LVL 76

Expert Comment

by:arnold
ID: 39215097
There are start/stop scripts that deal with mounting/unmounting filesystems using fstab, as well as assembling the RAID devices.

http://www.slackware.com/~mrgoblin/raid1-slackware-12.php

It includes references to startup scripts.

There should not be any manual action needed on your part.
A shutdown should do all the necessary things.
Likewise, the startup should assemble the md0 device and then mount it based on the fstab settings.
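After a reboot you can verify both steps with something like:

cat /proc/mdstat       # md0 should be active with all members present
mount | grep md0       # and mounted where fstab says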
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39218043
arnold: thanks for the post. Your link gives a start/stop script for RAID monitoring, but not for starting/stopping the RAID itself. Odd that no one talks about that. Your link's scripts are a bit dated and reference an older Slackware distro and an older mdadm version, but the monitoring script looks very usable, so I will save that. Thanks!

I came across another link, http://ubuntuforums.org/showthread.php?t=872092, where a person was reporting the same issue as me, but he actually tried rebooting and the RAID drive did in fact fail to auto-mount, supporting my fear. The thread gets unclear toward the end, but I think he concluded that he needed to assemble the RAID *before* mounting the filesystem.

If that's the case, I think the safe thing would be to NOT automount the RAID from fstab at boot time and do it in the init script. So, I would end up with:

fstab entry:
/dev/md0        /mnt/RAID       ext2        noauto,exec,async,uid=99,gid=99,umask=0660,dmask=0771  1   1

/etc/rc.d/rc.RAID:
#!/bin/sh
case $1 in
"start" )
    echo Starting RAID
    # Extract the UUID from the ARRAY line in /etc/mdadm.conf and assemble md0
    /sbin/mdadm --assemble --scan --uuid=`/usr/bin/grep "^ARRAY /dev/md0" /etc/mdadm.conf | \
        /usr/bin/awk 'BEGIN{RS=" "}{print $0}' | /usr/bin/grep UUID= | /usr/bin/cut -d= -f2`
    # Mount via the (noauto) fstab entry
    /sbin/mount /dev/md0
    ;;

"stop" )
    x=`/usr/bin/df | grep /mnt/RAID`

    if [ -n "$x" ]
    then
        echo Stopping RAID
        /sbin/umount -f /dev/md0
        /sbin/mdadm --stop /dev/md0
    else
        echo RAID not started
    fi
    ;;

* )
    echo "Syntax: $0 [ start | stop ]"
    ;;
esac
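
The plan is to hook this into Slackware's local scripts (assuming that's the right place for it), i.e.:

# in /etc/rc.d/rc.local:
/etc/rc.d/rc.RAID start
# in /etc/rc.d/rc.local_shutdown:
/etc/rc.d/rc.RAID stop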



Another bit that might be a factor is setting the individual drive partitions (/dev/sd[a-d]1) to type fd (Linux RAID Autodetect), which I read about after creating and loading the array. I haven't come across anything that explains what this does or what it is needed for. My system seems to work just fine having used the usual partition type 83. Could this be a factor in startup? I fear to change these at this point (would that generate a different UUID?)

I'm going to leave this question open until at least the weekend which is when I will try to shut down and restart the computer (and RAID). Meanwhile, I look forward to more comments. Surely someone (arnold?) in the EE community has actually implemented a mdadm RAID system and has actual init scripts?
 
LVL 76

Expert Comment

by:arnold
ID: 39218071
83 is not a valid Software RAID partition type.
Could you post the output of fdisk -l /dev/sda, and similarly for /dev/sdb?
 
LVL 76

Expert Comment

by:arnold
ID: 39218776
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39229374
Thanks for the link. The author doesn't say you *can't* use 83 as the partition type, just that he prefers using FD and the auto-detect method. My worry now is that if I change the partition type after the fact, and after the RAID has been in extensive use for a week, I might mess up the UUID or otherwise screw something up. I have not stopped/started the RAID yet. I plan on doing so tomorrow. At this point I plan on using the init scripts in my previous post.

I'm really unclear about the start/stop procedure ... whether to mount before or after starting the RAID, what "autodetect" actually does (starting? mounting?). While there is a lot of excellent md RAID documentation out there, this fundamental procedure is completely left out of all instructions I have found. Apparently, it should be dead-obvious and not worth explaining. Do you have a RAID setup? If so, what's in your fstab? init scripts?

Do you think I should change the partition type? Do you think it will cause a problem or not?

Here is my fdisk -l for sda and sdb:

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x3127f1d2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048  3907029167  1953513560   83  Linux

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
81 heads, 63 sectors/track, 765633 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0e360d03

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048  3907029167  1953513560   83  Linux
 
LVL 76

Expert Comment

by:arnold
ID: 39229659
In the absence of the FD type, you have to make sure that you have the device defined in /etc/mdadm.conf


You could do the following.
Break the md device by removing one drive, /dev/sda.
Then repartition it and alter its type to FD, then add it back in.
Once the rebuild is complete, break the device again by pulling /dev/sdb out and making the changes and then adding it back in.
Make sure the rebuild is complete as well as make sure to write the master boot record.
Neither in your example is indicated as a boot device.
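
A rough sketch of that sequence for the first disk (device names are illustrative; wait for each rebuild to finish before touching the next disk):

mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1   # drop the member from the array
fdisk /dev/sda                                       # t, fd, w: set the partition type to fd
mdadm /dev/md0 --add /dev/sda1                       # re-add it; the array rebuilds onto it
cat /proc/mdstat                                     # watch the rebuild progress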

You are using the entire disk, which I tend not to do.
I create multiple RAID devices rather than a single one as you have.
The multiple-RAID-device approach provides more flexibility, i.e. you can add a third, fourth, fifth disk, etc. and then distribute the various RAID volumes across multiple disks. In your case, you are bound by the two disks for this.

/boot primary RAID volume
swap is its own partition, no RAID needed
/ /usr /var /var/log can be on a single RAID volume with an LVM overlay
/home

etc.
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39229892
The RAID is not on the boot drive.

Actually, I *am* using multiple raid devices:

$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md0 : active raid6 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      3907023872 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>

> You are using the entire disk which I tend not to do.

I did create one partition, but I don't think I used the *entire* disk. Using the instructions in https://raid.wiki.kernel.org/index.php/Linux_Raid I believe I reserved 0.1% (-m .1):

mke2fs -v -m .1 -b 4096 -E stride=128,stripe-width=384 /dev/md0

or maybe we're not talking about the same thing.

I don't have an actual DEVICE configured in /etc/mdadm.conf, but I do have the array configured. Is this a problem?

ARRAY /dev/md0 metadata=1.2 name=OHPRSstorage:0 UUID=39edeb69:297e340f:0e3f4469:81f51a6c

Also from mdadm --detail:

$ mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed May 29 20:34:53 2013
     Raid Level : raid6
     Array Size : 3907023872 (3726.03 GiB 4000.79 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Fri Jun  7 13:39:38 2013
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : OHPRSstorage:0  (local to host OHPRSstorage)
           UUID : 39edeb69:297e340f:0e3f4469:81f51a6c
         Events : 19

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1

> You could do the following. Break the md device by removing one /dev/sda

Yeah, well, that sounds like a heart attack inducer! I have a spare lab-rat computer. I'll try this there and see what happens before I potentially lose 3TB of files!

You wrote, "You are using the entire disk which I tend not to do." This statement implies that you are actually using a RAID. What do you do to start/stop it at boot/shutdown time? Could you share your init script[s] and /etc/fstab? I have to take this down for a hardware upgrade tomorrow!
 
LVL 76

Expert Comment

by:arnold
ID: 39229948
The type is fd and it is automatically detected and assembled, I believe by the mdmonitor script.

I also have the entry in /etc/mdadm.conf that deals with assembling.
It should create the md0 device.
http://www.linuxmanpages.com/man5/mdadm.conf.5.php
Your 1 1 at the end of the fstab entry might be what delays the mounting.

Use dmesg.
Your UUID reference might be incorrect, in which case you may need to use devices= in mdadm.conf, listing the devices in order,

as simple as /dev/sd[a-d]1:
ARRAY /dev/md0 level=6 devices=/dev/sd[a-d]1
Or
ARRAY /dev/md0 level=6 raid-devices=4 /dev/sd[a-d]1
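
To see what got logged during assembly, something along the lines of:

dmesg | grep -iE 'md|raid'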

 
LVL 1

Author Comment

by:jmarkfoley
ID: 39230121
I'll check out your link

> Your 1 1 at the end of the fstab entry might be what delays the mounting.

Actually, this is a new setup. I haven't rebooted yet, so the fstab entry hasn't done anything. I'm trying to tweak the fstab. The 1 1 is for fsck'ing. Since I've built an ext2 filesystem on top of the RAID, don't I still need to fsck every so often?

My mdadm.conf ARRAY entry was created by `mdadm --detail --scan >>/etc/mdadm.conf` per the instructions I referenced. According to that website, using the UUID means I don't have to explicitly list the devices. All theory at this point.

> The type is fd and it is automatically detected and assembled, I believe by the mdmonitor script.

So that's it? You have no fstab entry and no init scripts other than one that runs mdmonitor?
 
LVL 76

Assisted Solution

by:arnold
arnold earned 500 total points
ID: 39230470
The fstab entry for where you want the partition mounted has to be there.  My understanding of your question dealt with the assembly of the RAID device (md0) at bootup.

So long as there is an entry in mdadm.conf, the mdmonitor script should start mdadm --monitor --scan.
In the absence of fd partition types, /etc/mdadm.conf is used for assembly.

Run ps -ef | grep mdadm: do you have an mdadm --monitor --scan process running?
 
LVL 1

Accepted Solution

by:
jmarkfoley earned 0 total points
ID: 39242218
After much experimentation, I've finally got a resolution that works. It turns out that what you've stated is correct: I do not need to do anything special at boot except have an entry in fstab. I do have the partition types now set to FD, and I do have an ARRAY entry in /etc/mdadm.conf. So the system is apparently using one or the other to automatically start the RAID at boot time. The fstab entry gets it mounted. My settings are:

/etc/mdadm.conf:
ARRAY /dev/md0 metadata=1.2 name=OHPRSstorage:0 UUID=39edeb69:297e340f:0e3f4469:81f51a6c

/etc/fstab:
/dev/md0        /mnt/RAID       ext2        defaults         1   1

I am keeping the 1 1 fsck parameters at the end because this is an ext2 filesystem after all, and it should probably be routinely checked.

Shutting down, however, is a different story. Since Samba and NFS are using this share, I get a "cannot stop md0" message when rebooting (and maybe a cannot-unmount message as well, I can't recall). I mildly panicked when I saw that, but when it rebooted everything seemed fine, so perhaps it doesn't matter if things aren't stopped and unmounted properly. Nevertheless, to be safe, I decided to shut things down cleanly. I have the following script in /etc/rc.d/rc.RAID, which is invoked by /etc/rc.d/rc.local_shutdown (Slackware):

#!/bin/sh
case $1 in
"start" )
    echo Starting RAID
    /sbin/mdadm --assemble --scan --uuid=`/usr/bin/grep "^ARRAY /dev/md0" /etc/mdadm.conf | \
        /usr/bin/awk 'BEGIN{RS=" "}{print $0}' | /usr/bin/grep UUID= | /usr/bin/cut -d= -f2`
    /sbin/mount /dev/md0
    ;;

"stop" )
    x=`/usr/bin/df | grep /mnt/RAID`

    if [ -n "$x" ]
    then
        echo Stopping RAID

        # Check for RAID in use by samba
        x=`/usr/bin/lsof /mnt/RAID | grep -i smbd`

        if [ -n "$x" ]
        then
            echo $0 RAID in use by samba, stopping
            /etc/rc.d/rc.samba stop
        fi

        echo Stopping nfsd from $0
        /etc/rc.d/rc.nfsd stop
        /sbin/umount -f /dev/md0
        /sbin/mdadm --stop /dev/md0
    else
        echo RAID not started
    fi
    ;;

* )
    echo "Syntax: $0 [ start | stop ]"
    ;;
esac



The 'start' option is not used by rc.local at boot time, as I said. It is there in case I want to start up the md device after manually shutting it down.

That does the trick! Mission accomplished. I have a 3.6T RAID-6 that everyone in the office seems to be using w/o problem. Eventually I will add the following to /etc/rc.d/rc.local to start the monitoring, but I haven't gotten to that yet:

mdadm --monitor --scan --mail=user@somehost
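
When I do, it will probably need to run in the background so it doesn't tie up rc.local; something like this (assuming mdadm's --daemonise option):

mdadm --monitor --scan --daemonise --mail=user@somehost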

Thanks for your help!
 
LVL 76

Expert Comment

by:arnold
ID: 39243133
The notice about the unmount is the same one you overcome with the umount -f in your script, which is run anyway.
You can convert the filesystem from ext2 to ext3 without having to reformat, and then mount /dev/md0 as ext3 on /mnt/RAID.
That adds journaling, i.e. ext3 is ext2 + journaling.
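
A rough sketch of the conversion (assuming the filesystem can be taken offline briefly):

umount /mnt/RAID         # or remount read-only
tune2fs -j /dev/md0      # add a journal; the ext2 filesystem becomes ext3
# change ext2 to ext3 in the /etc/fstab entry, then:
mount /mnt/RAID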
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39243338
I don't remember, but I think I tried umount -f and it didn't work with Samba and NFS clients still active. The man page mentions an unreachable NFS host ... anyway, I'd have to put the umount -f into a shutdown script, so I might as well kill off Samba and NFS for cleanliness.

Not really sure what journaling buys me -- just seems like more overhead. What's the benefit?
 
LVL 76

Expert Comment

by:arnold
ID: 39243352
Double-check that your service start (S) and kill (K) scripts run in reverse sequence, i.e. a service that starts last must be terminated first:
S90 K10
S10 K90

I have not dealt with Slackware for a long time.
/etc/rc2.d/ should have a start and a stop script for each service, as symbolic links to /etc/init.d/<service>.
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39245086
As I mentioned, I don't need a start script for this setup. It appears that NFS starts before Samba. I'll check out the start-order thing you mentioned, but I don't think Samba and NFS inter-depend.

I've used Slackware since the mid-'90s, after using 386BSD and FreeBSD and before all the other distros were invented. I've liked it because its init and config setup were more along the traditional System V and BSD lines (which I used in the '80s!) than the others, but lately it's getting hard to find recent packages already built for Slackware. I guess Patrick Volkerding is getting too busy and the Slackware community is apparently shrinking. Often, I don't even see it listed on web distro lists. I had to use a lot of elbow grease to get SpamAssassin installed recently. I've found that RedHat packages work with little problem except for the location of the init scripts.

Slackware keeps most of its init scripts in /etc/rc.d and, as a matter of fact, /etc/init.d is symbolically linked to /etc/rc.d, not vice versa. There is no 'service' file in that folder, but there is one in /usr/lib/pm-utils/bin/service, which has the comment: "Handle service invocation on distros that do not have a "service" command. It handles LSB by default, and other distros that the maintainer is aware of." This is the first time I've ever noticed the existence of that file (thanks to your prompting). I'll check it out and see what it does, though I've pretty much gotten used to starting/stopping the rc.d scripts directly.

Thanks for the feedback.
 
LVL 76

Expert Comment

by:arnold
ID: 39245678
NFS is a service that usually starts near the end of the process (rc3.d, rc2.d);
Samba is similar.
You have to make sure that, within the same runlevel, Samba/NFS are stopped first;
the same goes for their clients.
rc.d is the top level; do you have rc1.d, rc2.d, rc3.d?
Those are runlevels, and usually each runlevel has its own services that run within it,
i.e. certain services do not start until they get into multi-user mode.
If it works for you, that is great.
 
LVL 1

Author Comment

by:jmarkfoley
ID: 39247842
> rc.d is the top level; do you have rc1.d, rc2.d, rc3.d?

Yes, the rc.d folder has all those. rc.local is only run at multi-user level (rc.M) and  rc.local_shutdown is run by rc.0, rc.6 and rc.K.

I think I've got things in the right order and it does appear to be working OK. I've checked shutdown messages as it's going down and it looks good.

Again, thanks for your help and feedback.
 
LVL 76

Expert Comment

by:arnold
ID: 39248234
Good. rc.local is run last and is often used for specific commands to be added within, i.e. items that are not services.
 
LVL 1

Author Closing Comment

by:jmarkfoley
ID: 39252593
Final solution.
