Fix a Linux RAID Configuration Problem

Posted on 2011-09-12
Last Modified: 2016-12-08
I am having an issue with a Linux RAID array I use as a NAS server.  Everything seems to work: no files appear to be corrupted and everything is accessible, but I have found a problem with the configuration.

Hardware Configuration:
Ubuntu Server 10.10 (No GUI)
(1) 250 GB Western Digital HDD (Ubuntu)
(11) 1TB Western Digital HDDs (RAID5 Array)

I used Webmin for the basic setup of the RAID, and I believe that is where my problem began.  My /etc/mdadm/mdadm.conf:

DEVICE partitions /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1
CREATE owner=root group=disk mode=0660 auto=yes
HOMEHOST <system>
DEVICE /dev/sdb1 /dev/sdc1 /dev/sdd1
ARRAY /dev/md0 level=raid5 devices=/dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1,/dev/sdf1,/dev/sdg1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdk1,/dev/sdl1


sudo mdadm --examine --scan:
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=e6f6ec4e:fefe1ee2:e40a34ba:0ca07f7d
ARRAY /dev/md0 level=raid5 num-devices=11 UUID=e7d757df:a4dd8c94:e40a34ba:0ca07f7d


cat /proc/mdstat:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid5 sdj1[7] sdh1[5] sdg1[6] sdi1[8] sdl1[10] sdk1[9] sdd1[2] sdb1[0] sde1[3] sdc1[1] sdf1[4]
      9767599360 blocks level 5, 64k chunk, algorithm 2 [11/11] [UUUUUUUUUUU]


I'm concerned because the mdadm.conf file has two DEVICE lines (not necessarily a problem), but only one ARRAY line that uses partitions from both DEVICE lines.  The mdadm scan showed two different arrays.  I would like to fix the mdadm.conf file, but am concerned about losing data.

Can I simply combine the two DEVICE lists?  Normally I would just guess and check, but I really don't want to lose this data.
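
For clarity, the combined DEVICE line I have in mind would look something like this (glob form just for brevity; it covers the same partitions /dev/sdb1 through /dev/sdl1 already listed):

```
DEVICE partitions /dev/sd[bcdefghijkl]1
```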

Thanks for the help.
Question by:WickedShamrock

Expert Comment

ID: 36525218
mdadm --examine --brief --scan --config=partitions
mdadm -Ebsc partitions

Post the output of the above commands.
There are other mdadm invocations for getting info as well.

It should tell you which array is degraded and which is the failed disk.

Author Comment

ID: 36525296
Thanks for the response.  The following are the results of those commands.  Hope that helps.  As far as I understand it nothing is degraded, but if I really knew I wouldn't be posting on this forum!  Thanks again.

sudo mdadm --examine --brief --scan --config=partitions
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=e6f6ec4e:fefe1ee2:e40a34ba:0ca07f7d
ARRAY /dev/md0 level=raid5 num-devices=11 UUID=e7d757df:a4dd8c94:e40a34ba:0ca07f7d


sudo mdadm -Ebsc partitions
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=e6f6ec4e:fefe1ee2:e40a34ba:0ca07f7d
ARRAY /dev/md0 level=raid5 num-devices=11 UUID=e7d757df:a4dd8c94:e40a34ba:0ca07f7d



Expert Comment

ID: 36526141
You have two arrays with the same device name, /dev/md0.

You should make sure that each array's device name is unique, e.g. /dev/md0 and /dev/md1.
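
A quick way to see the clash is to count how many ARRAY entries claim each device node in the scan output. A minimal sketch, using the scan output quoted above as sample input (on a live system you would pipe `mdadm --examine --scan` itself instead of the sample variable):

```shell
# Sample output of `mdadm --examine --scan`, copied from this thread and
# embedded here so the check can be shown without the actual hardware.
scan='ARRAY /dev/md0 level=raid5 num-devices=3 UUID=e6f6ec4e:fefe1ee2:e40a34ba:0ca07f7d
ARRAY /dev/md0 level=raid5 num-devices=11 UUID=e7d757df:a4dd8c94:e40a34ba:0ca07f7d'

# Column 2 is the md device name; any count above 1 means more than one
# set of superblocks is claiming the same device.
printf '%s\n' "$scan" | awk '{print $2}' | sort | uniq -c
```

Here it reports a count of 2 for /dev/md0, confirming that two different sets of superblocks are being picked up for the same device.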


Author Comment

ID: 36526419
That's what I was originally thinking, but the output of df makes me think that it's not two separate devices.  The filesystem size is correct for a RAID5 array of eleven 1TB disks:

(11 - 1) * 1024 = 10240 GB (one 11-disk array)

versus what a 3-disk array plus an 8-disk array would give:

(3 - 1) * 1024 = 2048 GB
(8 - 1) * 1024 = 7168 GB
2048 + 7168 = 9216 GB

df /mnt/raid/:
Filesystem     1K-blocks    Used         Available    Use%   Mounted on
/dev/md0       9614330456   8049836092   1076114404   89%   /mnt/raid
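
The size check above can be reproduced with plain shell arithmetic; usable RAID5 capacity is (disks - 1) times the member size, since one disk's worth of space goes to parity:

```shell
# Usable RAID5 capacity in GB, taking a 1 TB member as ~1024 GB.
echo "one 11-disk array: $(( (11 - 1) * 1024 )) GB"
echo "3-disk + 8-disk:   $(( (3 - 1) * 1024 + (8 - 1) * 1024 )) GB"
```

The ~10240 GB figure is the one consistent with the mounted filesystem, which points to a single 11-disk array rather than a 3 + 8 split.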



Expert Comment

ID: 36526482

/var/log/messages should show an error when the system tries to start /dev/md0 as a RAID 5 consisting of 3 devices, since that is not possible: /dev/md0 is a RAID 5 made up of 11 devices.  In other words, the order of your RAID creation was the 3-device RAID 5 first and then the 11-device RAID 5, and the last one is the one that is enforced.


Author Comment

ID: 36526661
That makes a lot of sense.  I looked at the log file (attached) and I don't see anything glaringly wrong (nothing states failed), but I would like to go ahead and try combining the two DEVICE lines and restarting the array.  If we are wrong, will that damage anything automatically?  I.e., will it trigger some kind of repair that would potentially cause me to lose data?

I'm not sure if I am blowing this whole thing out of proportion.  I consider myself computer savvy, but this is my first experience with RAID and I don't fully understand how mdadm functions.  Thanks again.

Expert Comment

ID: 36526788
Comment out the ARRAY entry that assembles /dev/md0 as a RAID 5 of three devices.  If the three-device assembly of /dev/md0 is attempted it will fail; the data will be lost only if you reinitialize/recreate /dev/md0 as a RAID 5 of three devices.

The error will show up during bootup when the three-device assembly of /dev/md0 is attempted.

The dmesg command will also show an error when assembly of /dev/md0 is attempted with the three-device RAID 5.

In any case, good practice is to have a good backup.
Do not erase/delete/initialize the array.

Author Comment

ID: 36545014
Thanks again for the help.  I am in the process of backing up my data before I try anything.  I will keep you posted.

Accepted Solution

WickedShamrock earned 0 total points
ID: 36594146
I finished backing up my data and have fixed the problem.  I somehow had two sets of superblocks.  They were probably remnants of my initial "testing" period when I first started playing around with mdadm.  I had the following superblocks assigned:

1. /dev/sd[bcd]
2. /dev/sd[bcdefghijkl]1

I had a superblock assigned to the raw drives of group 1.  All I had to do was run --zero-superblock on group 1 and I was good.
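
For anyone hitting the same thing, a cautious sketch of that cleanup (device names are the ones from this thread; the loop only prints the commands so nothing destructive runs by accident):

```shell
# The stale superblocks were on the raw disks (/dev/sd[bcd]) while the
# real array lives on the partitions (/dev/sd[b-l]1).  This loop only
# *prints* the cleanup commands; drop the leading 'echo' only after the
# array is stopped and a verified backup exists.
for disk in /dev/sdb /dev/sdc /dev/sdd; do
    echo mdadm --zero-superblock "$disk"
done
```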

I also removed the DEVICE lines from my mdadm.conf file and changed the ARRAY line to use the UUID, but it was working fine prior to changing it.
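
For reference, the cleaned-up mdadm.conf ended up along these lines (the UUID is the 11-device one from the scan output above; the exact layout here is a sketch from memory):

```
# /etc/mdadm/mdadm.conf -- identify the array by UUID instead of
# enumerating member devices.
CREATE owner=root group=disk mode=0660 auto=yes
HOMEHOST <system>
ARRAY /dev/md0 level=raid5 num-devices=11 UUID=e7d757df:a4dd8c94:e40a34ba:0ca07f7d
```

Running `mdadm --detail --scan` prints a matching ARRAY line that can be pasted in directly.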

Author Closing Comment

ID: 36813448
I appreciate the ideas provided, but the actual problem was not fixed by anything proposed.

Expert Comment

ID: 36594187
I pointed out that you had one device, /dev/md0, referenced by two RAID setup groups.
I suggested backups before trying anything, which is a requirement: if I had told you to try something and the RAID became corrupt, that would have meant loss of data.
