Solved

RAID 5 disks dropped from array.

Posted on 2010-08-21
Last Modified: 2012-06-27
Hi All,

I have an OpenFiler box that has (or had) a software RAID 5 array of 4x 1.5TB disks. It got a bit full, so I added another 3 disks (1 as a spare). About 20% through the expansion I had a brain f**t and rebooted the box by mistake (too many PuTTY windows open).

Now /dev/md1 won't come up.
It would appear that 2 of the 6 disks don't have valid partition tables, so mdadm thinks 2 disks have failed.

See the attached mdadm --examine output.

Needless to say, the 2 duff disks are the additions. I'm sure the data is still there, but I need to persuade mdadm that it is. Can I fix or re-add these disks somehow?

/dev/sdi1 and /dev/sdj1 just won't mount.

This is the iSCSI backend for an ESX lab which is not backed up, so I could really do with getting the data back. I know, I know - I always lecture people too, but this just grew "organically" and has literally dozens of template machines etc. on it. It's not mission critical and it's too big to back up properly.

Bit of a n00b with Linux/OF, so hand-holding welcome and any advice appreciated!

Cheers

Mark

mdadm --misc --examine /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 00.91.00
           UUID : cb999fbb:935ca328:4dd0ed51:70bb0666
  Creation Time : Wed Aug 18 10:02:41 2010
     Raid Level : raid5
  Used Dev Size : 1465134912 (1397.26 GiB 1500.30 GB)
     Array Size : 7325674560 (6986.31 GiB 7501.49 GB)
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 1

  Reshape pos'n : 1731242560 (1651.04 GiB 1772.79 GB)
  Delta Devices : 2 (4->6)

    Update Time : Sat Aug 21 12:51:22 2010
          State : clean
 Active Devices : 6
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 1
       Checksum : 8547782d - correct
         Events : 0.230778

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       65        0      active sync   /dev/sde1

   0     0       8       65        0      active sync   /dev/sde1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       8       97        2      active sync   /dev/sdg1
   3     3       8      113        3      active sync   /dev/sdh1
   4     4       8      161        4      active sync
   5     5       8      145        5      active sync
   6     6       8      129        6      spare
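
The figures in this superblock can be cross-checked with a little arithmetic. This is a rough sketch, assuming the sizes are reported in KiB and that the reshape position is measured against the size of the reshaped six-device array; the function names are made up for illustration.

```shell
#!/bin/sh
# Usable RAID 5 capacity: one device's worth of space goes to parity.
# Args: per-device size in KiB, number of raid devices.
raid5_array_kib() {
    echo $(( $1 * ($2 - 1) ))
}

# Rough reshape progress as a whole percentage (integer division).
# Args: reshape position in KiB, array size in KiB.
reshape_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Values copied from the --examine output above.
raid5_array_kib 1465134912 6        # prints 7325674560
reshape_pct 1731242560 7325674560   # prints 23
```

The first result matches the reported Array Size, and the second suggests the reshape had got a little over a fifth of the way through when the box went down.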

Question by:Share-IT

Accepted Solution

by:
arnold earned 500 total points
ID: 33492041
One option is to try to reassemble the original 4-device array:
mdadm -A dev1 dev2 dev3 dev4
Then grow it by adding one drive at a time.

Another option is to use
mdadm -A --force dev1 dev2 dev3 dev4 dev?
where dev? is whichever of the two dropped disks you think failed last.
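
To judge which of the two dropped members failed last, one common approach is to compare the Events counter (and the Update Time) that each member reports in mdadm --examine; the member with the higher count was written to most recently. A small sketch of pulling that counter out of the output follows. Both events_of and show_events are hypothetical helpers, not part of mdadm.

```shell
#!/bin/sh
# Print the Events counter from `mdadm --examine` text read on stdin.
# The line of interest looks like "         Events : 0.230778".
events_of() {
    awk '$1 == "Events" && $2 == ":" { print $3; exit }'
}

# Print "device events" for each member given as an argument.
# Needs root and real devices, so it is only defined here, not run.
show_events() {
    for d in "$@"; do
        printf '%s %s\n' "$d" "$(mdadm --examine "$d" | events_of)"
    done
}
```

Running show_events against the two suspect partitions would give one line per device to compare by eye.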

Author Comment

by:Share-IT
ID: 33492113
Thanks for the reply.

When I run "mdadm -A /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1" I get the following error:

Thanks again

mdadm -A /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
mdadm: cannot open device /dev/sdf1: Device or resource busy
mdadm: /dev/sdf1 has no superblock - assembly aborted
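
Two things may be going on in this error. With no array name given, mdadm -A takes its first argument (/dev/sde1 here) as the md device to assemble, so the remaining partitions are read as its members. Also, "Device or resource busy" often means the partition is already claimed by an array that was partly assembled at boot and left inactive; /proc/mdstat shows which one. A sketch of looking that up, where md_holding is a hypothetical helper and the inactive-array cause is an assumption rather than something the thread confirms:

```shell
#!/bin/sh
# Read /proc/mdstat-style text on stdin and print the md array, if any,
# that lists the given partition (e.g. sdf1) as a member.
md_holding() {
    awk -v part="$1" '
        $1 ~ /^md/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                name = $i
                sub(/\[.*/, "", name)   # strip the "[1](S)" suffix
                if (name == part) { print $1; exit }
            }
        }'
}

# Intended use on the live system:
#   md_holding sdf1 < /proc/mdstat
# If that names an inactive array, stopping it first with
#   mdadm --stop /dev/mdX
# frees the members, after which the array name goes first:
#   mdadm -A /dev/md1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
```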


Author Comment

by:Share-IT
ID: 33492560
OK. After some jiggery-pokery, I've managed to get the command to run by adding /dev/md1 to it, but now I get:

mdadm: /dev/md1 assembled from 4 drives and 1 spare - not enough to start the array.

Basically I'm missing sdj1 and sdk1, even though the disks are there and working.

Here's the "parted" info:
====
(parted) print
Disk geometry for /dev/sdj: 0.000-1430799.398 megabytes
Disk label type: gpt
Minor    Start       End     Filesystem  Name                  Flags
1          0.017 1430795.907                                    raid
(parted) check 1
Error: Could not detect file system.
====

If I could somehow get the partitions to show up, I'm sure the array could continue to rebuild, as I haven't actually had a physically failed disk.

Anyone?

Cheers

Mark


Assisted Solution

by:arnold
arnold earned 500 total points
ID: 33492759
Boot into single-user mode and let it finish the rebuild:
mdadm -A --scan --run

You are looking for it to resume the array growth.
What is the output of mdadm --examine --scan --detail?
http://linux.die.net/man/8/mdadm
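
Whether the growth has actually resumed shows up in /proc/mdstat as a "reshape = N%" progress line. A minimal sketch of extracting that figure; reshape_progress is a hypothetical helper name:

```shell
#!/bin/sh
# Print the reshape percentage from /proc/mdstat-style text on stdin,
# e.g. "23.6%" from a line containing "reshape = 23.6% (...)".
# Prints nothing if no reshape is in progress.
reshape_progress() {
    sed -n 's/.*reshape *= *\([0-9.]*%\).*/\1/p'
}

# Intended use on the live system:
#   reshape_progress < /proc/mdstat
```

An empty result would mean the array is either not assembled or not reshaping, which is the cue to look at mdadm --detail for the array instead.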

Author Comment

by:Share-IT
ID: 33492848
Hi Arnold, thanks for your help on this.

In the end I figured I'd just delete the partitions in the OpenFiler GUI and recreate them. As soon as I did that, a simple "mdadm -A /dev/md1" did the trick and the array has continued where it left off at 23%. All VGs are back etc.

The only issue I have now is that neither the iSCSI LUNs nor the XFS shares on that volume appear to have come online properly. I can see the volume in the OF GUI but can't actually access it. I'm thinking a bounce of the OF box might be needed, but I'll wait until it's properly rebuilt before going too mad. At least there's progress. I just hope all of the data is intact and I don't have a blank 9TB array.

Any thoughts on bringing them online without a reboot?

For the record, mdadm --examine --scan (-D apparently can't be used with -E) gives:

ARRAY /dev/md0 level=raid5 num-devices=3 UUID=d37e7e3b:1243fcb4:a5edb78b:895e7aca
ARRAY /dev/md1 level=raid5 num-devices=6 UUID=cb999fbb:935ca328:4dd0ed51:70bb0666
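
On the question of bringing the volumes online without a reboot: if the logical volumes on the array were simply never activated, rescanning and activating the volume group by hand may be enough. This is only a sketch under that assumption; the volume group name is a placeholder, and run and activate_vg are hypothetical helpers wrapping the standard LVM commands vgscan, vgchange and lvscan.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set; otherwise run it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# Rescan for volume groups and activate every logical volume in one.
# Arg: volume group name (placeholder; use the name shown by vgs).
activate_vg() {
    run vgscan
    run vgchange -ay "$1"
    run lvscan
}

# Preview without touching anything:
#   DRY_RUN=1 activate_vg myvg
```

The XFS mounts and the iSCSI target service would still need to be brought back by whatever means the appliance uses, which this thread does not cover.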



Author Comment

by:Share-IT
ID: 33492929
OK, got impatient and bounced the box. I wasn't going to wait 2 days only to find there's nothing there. It's all good: all XFS and iSCSI volumes are back.

Thanks again, and whilst ultimately it was self-resolved, I appreciated the time you spent trying to help, so have some points!



