Solved

RAID 5 disks dropped from array.

Posted on 2010-08-21
Last Modified: 2012-06-27
Hi All,

I have an OpenFiler box that has (or had) a software RAID 5 array with 4x 1.5TB disks, which got a bit full, so I added another 3 disks (1 as a spare). About 20% through the expansion I had a brain f**t and bounced the box (too many PuTTY windows).

Now /dev/md1 won't come up.
It would appear that 2 of the 6 disks don't have valid partition tables, so mdadm thinks 2 disks have failed.

See attached for the mdadm --examine output.

Needless to say, the 2 duff disks are the additions. The data is still there, I'm sure, but I need to persuade mdadm that it is. Can I fix/re-add these disks somehow?

/dev/sdi1 and /dev/sdj1 just won't mount.

This is the iSCSI backend for an ESX lab which is not backed up, so I could really do with getting the data back. I know, I know - I always lecture people too, but this just grew "organically" and has literally dozens of template machines etc. on it. Not mission critical and too big to back up properly.

Bit of a n00b with Linux/OF, so hand-holding is welcome and any advice appreciated!

Cheers

Mark

mdadm --misc --examine /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 00.91.00
           UUID : cb999fbb:935ca328:4dd0ed51:70bb0666
  Creation Time : Wed Aug 18 10:02:41 2010
     Raid Level : raid5
  Used Dev Size : 1465134912 (1397.26 GiB 1500.30 GB)
     Array Size : 7325674560 (6986.31 GiB 7501.49 GB)
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 1

  Reshape pos'n : 1731242560 (1651.04 GiB 1772.79 GB)
  Delta Devices : 2 (4->6)

    Update Time : Sat Aug 21 12:51:22 2010
          State : clean
 Active Devices : 6
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 1
       Checksum : 8547782d - correct
         Events : 0.230778

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       65        0      active sync   /dev/sde1

   0     0       8       65        0      active sync   /dev/sde1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       8       97        2      active sync   /dev/sdg1
   3     3       8      113        3      active sync   /dev/sdh1
   4     4       8      161        4      active sync
   5     5       8      145        5      active sync
   6     6       8      129        6      spare
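The figures in the --examine output above can be cross-checked with a little arithmetic. This is only a sketch: it assumes that "Reshape pos'n" is measured against the new "Array Size" (both shown in KiB), and the variable names are made up for illustration.

```shell
#!/bin/sh
# Values copied from the --examine output above (units: KiB).
used_dev_size=1465134912
array_size=7325674560
reshape_pos=1731242560
raid_devices=6

# RAID 5 spends one device's worth of space on parity,
# so the usable size should be (n - 1) * per-device size.
expected_size=$(( (raid_devices - 1) * used_dev_size ))
echo "expected array size: $expected_size KiB"

# Progress in tenths of a percent, using integer arithmetic only.
tenths=$(( reshape_pos * 1000 / array_size ))
echo "reshape progress: $(( tenths / 10 )).$(( tenths % 10 ))%"
```

The expected size matches the reported Array Size for 6 devices, and the reshape position works out to roughly 23.6% of the way through, so the superblock on /dev/sde1 looks internally consistent.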

Question by:Share-IT

Accepted Solution

by:
arnold earned 500 total points
ID: 33492041
One option is to try to reconstitute the original 4-device array:
mdadm -A dev1 dev2 dev3 dev4
Then grow it by adding one drive at a time.

Another option is to use
mdadm -A --force dev1 dev2 dev3 dev4 dev?
where dev? is whichever of the two you think failed last.

Author Comment

by:Share-IT
ID: 33492113
Thanks for the reply.

When I run "mdadm -A /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1" I get the following error...

Thanks again




mdadm -A /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
mdadm: cannot open device /dev/sdf1: Device or resource busy
mdadm: /dev/sdf1 has no superblock - assembly aborted
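A likely reading of this error: in assemble mode mdadm treats the first device argument as the md device to create, so /dev/sde1 was taken as the array name, and /dev/sdf1 may have been busy because a half-assembled, inactive array was still holding it. The sketch below shows the argument order only. It prints the commands rather than running them, and the `run` helper and `DRY_RUN` variable are hypothetical names, not part of mdadm.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Set DRY_RUN=0 (as root) only after checking the device list.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

MD=/dev/md1
MEMBERS="/dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1"

# Release any members held by a half-assembled, inactive array.
run mdadm --stop "$MD"
# The md device comes first, then the component devices.
# $MEMBERS is left unquoted on purpose so it splits into separate arguments.
run mdadm --assemble "$MD" $MEMBERS
```

This does not by itself make an interrupted reshape startable; it only avoids the misparsed device list.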


Author Comment

by:Share-IT
ID: 33492560
OK. After some jiggery-pokery, I've managed to get the command to run by adding /dev/md1 to the command, but now I get...

mdadm: /dev/md1 assembled from 4 drives and 1 spare - not enough to start the array.

Basically I'm missing sdj1 and sdk1 even though the disks are there and working.
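The --examine table in the original post lists slots 4, 5 and 6 only by major/minor number, with no device name. Those numbers can be decoded by hand. The sketch below assumes the classic Linux SCSI-disk numbering (major 8, 16 minors per disk, sda through sdp); `sd_name` is a hypothetical helper written for this illustration.

```shell
#!/bin/sh
# Decode a major/minor pair into a /dev/sdXN name.
# Assumes the classic SCSI-disk numbering: major 8, 16 minors per disk (sda..sdp).
sd_name() {
    if [ "$1" -ne 8 ]; then
        echo "unsupported major: $1" >&2
        return 1
    fi
    disk=$(( $2 / 16 ))      # 0 = sda, 1 = sdb, ...
    part=$(( $2 % 16 ))      # 0 = whole disk, 1..15 = partitions
    letter=$(printf '%s' abcdefghijklmnop | cut -c "$(( disk + 1 ))")
    if [ "$part" -eq 0 ]; then
        echo "/dev/sd$letter"
    else
        echo "/dev/sd$letter$part"
    fi
}

# Slots 4, 5 and 6 from the --examine table.
for minor in 161 145 129; do
    echo "8,$minor -> $(sd_name 8 "$minor")"
done
```

Under that assumption, slots 4 and 5 decode to /dev/sdk1 and /dev/sdj1, which matches the two partitions named here, and the spare in slot 6 is /dev/sdi1.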

Here's the "parted" info.
====
(parted) print
Disk geometry for /dev/sdj: 0.000-1430799.398 megabytes
Disk label type: gpt
Minor    Start       End     Filesystem  Name                  Flags
1          0.017 1430795.907                                    raid
(parted) check 1
Error: Could not detect file system.
====

If I could somehow get the partitions to pop up, I'm sure the array could continue to rebuild, as I have not had a physically failed disk as such.



Anyone?

Cheers

Mark


Assisted Solution

by:arnold
arnold earned 500 total points
ID: 33492759
Boot into single-user mode and let it finish the build:
mdadm -A --scan --run

You are looking for it to resume the array growth.
What is the output of mdadm --examine --scan --detail?
http://linux.die.net/man/8/mdadm

Author Comment

by:Share-IT
ID: 33492848
Hi Arnold, thanks for your help on this.

In the end I just figured I'd delete the partitions in the OpenFiler GUI and recreate them. As soon as I did that, a simple "mdadm -A /dev/md1" did the trick and the array has continued where it left off at 23%. All VGs are back etc.

The only issue I have now is that neither the iSCSI LUNs nor the XFS shares on that volume appear to have come online properly. I can see them in the OF GUI but can't actually access them. I'm thinking a bounce of the OF box might be needed, but I'll wait until it's properly rebuilt before going too mad. At least there's progress. Just hope all of the data is intact and I don't have a blank 9TB array.

Any thoughts on bringing them online without a reboot?
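One thing that might be worth trying before a reboot, sketched here as a dry run. It assumes the shares are LVM logical volumes with entries in /etc/fstab, which this thread does not confirm, and it leaves out the iSCSI target service because its name on this box is not known. The `step` helper and `DRY_RUN` variable are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints each step unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

step vgscan            # rescan for volume groups on the reassembled array
step vgchange -ay      # activate all logical volumes
step lvscan            # list logical volumes and their state
step mount -a          # mount anything in /etc/fstab that is not yet mounted
```

If lvscan already reports the volumes as ACTIVE, the problem is more likely in the services exporting them than in LVM itself.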

For the record, mdadm --examine --scan (-D apparently can't be used with -E) gives...

ARRAY /dev/md0 level=raid5 num-devices=3 UUID=d37e7e3b:1243fcb4:a5edb78b:895e7aca
ARRAY /dev/md1 level=raid5 num-devices=6 UUID=cb999fbb:935ca328:4dd0ed51:70bb0666



Author Comment

by:Share-IT
ID: 33492929
OK, got impatient and bounced the box. Wasn't gonna wait 2 days to find there's nothing there. It's all good: all XFS and iSCSI volumes are back.

Thanks again, and whilst ultimately it was self-resolved, I appreciated the time you spent trying to help, so have some points!



