Ubuntu RAID issue - software RAID - different data on drives

bmsjeff asked:
I have no Ubuntu experience, so any help would be appreciated.
I was getting this error:
/dev/sda1 clean
fsck 1.40.8
/dev/md0 contains a file system with errors, check forced
Duplicate or bad block in use!
/dev/md0: Multiply-claimed block(s) in inode 8: 24576
/dev/md0: Multiply-claimed block(s) in inode 4981200: 24576

I ran:
df -h
/dev/sda1     19G    749M used    17G available
/dev/md0     128G     27G used    95G available

cat /proc/mdstat
md0 : active raid1 sdb3[1]
      134801344 blocks [2/1] [_U]

I ran DLGDIAG (Western Digital's drive diagnostic) on both drives and both passed.
There was a bad stick of memory, which has been removed.

I ran a Live CD against Drive0:
fsck
Errors were corrected on Drive0.
The drive now boots properly.

Live CD on Drive1:
fsck
I receive the error "No usable shell was found on your root file system" on Drive1.

When I look at Drive0, there is no data on it newer than 5/2/11.
The data was there before I pulled the drives.
I assume this is a problem with the RAID.

I ran DiskInternals and saved the Drive1 data to my Windows PC.

When I try to boot to Drive1, it just sits there; nothing happens.

What steps should I take?

If I run this on Drive0 alone, I get:
cat /proc/mdstat
md0 : active raid1 sdb3[0]
      134801344 blocks [2/1] [U_]

Could Drive1 have a GRUB problem? Not sure what steps to take next.

Author

Commented:
I have placed the drives back into the box and all my folders are there.
Drive0 seems to be the problem.
Drive1 appears to have the data, although it will not boot.

sudo mdadm -D /dev/md0
        Version : 00.90.03
     Raid Level : raid1
     Array Size : 134801344 (128.56 GiB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0
         Events : 0.1614

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed


sudo mdadm -E /dev/sda1
       Checksum : f7866587 - correct

      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3
         0       8        3        0      active sync   /dev/sda3
         1       0        0        1      faulty removed

Where do I go from here?
Kerem ERSOY, President

Commented:
Hi,

First of all, when you have two drives, only the first drive has the boot record. This is why Drive0 boots while Drive1 can't. But you can transfer the MBR to the second drive with the command below:

dd if=/dev/sda of=/dev/sdb bs=512 count=1


That said, let's come to your problem. It seems that, although you have a RAID 1 array, one of your drives was dropped from the RAID and has been outdated ever since.

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed


This shows that your volume is running with only one drive.

So, since the drive is actually intact but outdated, we need to re-add it to the RAID with a command such as this:

mdadm --manage --add /dev/md0 /dev/sdb3  

The missing member is /dev/sdb, judging from the info you've provided:

cat /proc/mdstat
md0 : active raid1 sdb3[0]
      134801344 blocks [2/1] [U_]


After you've re-added the drive, check the output of /proc/mdstat and make sure the array is rebuilding.
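
A convenient way to keep an eye on it (watch re-runs the command every two seconds by default):

watch cat /proc/mdstat

mdadm --detail /dev/md0 will also report a "Rebuild Status" line with a completion percentage while the resync runs.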

Cheers,
K.




Kerem ERSOY, President
Commented:
Oops, sorry, the dd should be just the other way round.

But to be on the safe side, let's save both boot records first:

dd if=/dev/sda of=mbrsda bs=512 count=1
dd if=/dev/sdb of=mbrsdb bs=512 count=1

dd if=/dev/sdb of=/dev/sda bs=512 count=1
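
One caveat worth spelling out: only the first 446 bytes of the MBR are boot code; the 64 bytes after that hold the partition table. If the two disks were ever partitioned differently, copying just the boot-code portion avoids clobbering sda's partition table:

dd if=/dev/sdb of=/dev/sda bs=446 count=1

Since both members of a RAID 1 pair are normally partitioned identically, the 512-byte copy above is usually fine.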

Author

Commented:
So I understand:
dd if=/dev/sda of=mbrsda bs=512 count=1
*if - reads from the first mass storage device (/dev/sda)
*of - writes that content to a file called mbrsda
*bs - sets the block size to 512, since the MBR is 512 bytes in size
*count - ???not sure???

Questions:
Where is mbrsda written?  How would I see it?

dd if=/dev/sdb of=/dev/sda bs=512 count=1
*if - reads from the second mass storage device (/dev/sdb)
*of - writes that content onto the first device (/dev/sda)
*bs - sets the block size to 512, since the MBR is 512 bytes in size
*count - ???not sure???

I would assume that if /dev/sdb were blank, it would overwrite the existing good MBR with nothing.

Assuming something goes wrong down the road, how would I restore the mbrsda that was saved?

Author

Commented:
OK, trying to wrap my head around this:

$ grep /dev/md /etc/fstab
# /dev/md0

$ df -h / /home
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              19G  2.7G   15G  16% /
/dev/md0              128G   43G   79G  36% /home

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda3[0]
      134801344 blocks [2/1] [U_]
     
unused devices: <none>

$ sudo mdadm --query --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Jul  8 19:18:47 2008
     Raid Level : raid1
     Array Size : 134801344 (128.56 GiB 138.04 GB)
  Used Dev Size : 134801344 (128.56 GiB 138.04 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed May 18 21:34:32 2011
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 0553725a:cf84abcf:68ceeaf4:7264baec
         Events : 0.48058

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed


So, I think, I have a single drive with one RAID1 member (md0).
The second drive is obviously missing.
I am going to install a new, unformatted drive into the box.

This should still show the array as degraded, something like:
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda3[0]
      134801344 blocks [2/1] [U_]

I would then run
sudo mdadm --add /dev/md0 /dev/sdb3

or would I use this command?
sudo mdadm --manage --add /dev/md0 /dev/sdb3

What is the difference?
Let me know if I am missing anything.
Kerem ERSOY, President

Commented:
> *count - ???not sure???

That's how many 512-byte blocks will be read. Without it you'd copy the whole disk!! Since the MBR is the first sector (512 bytes), this transfers only the very first 512 bytes.
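
If you want to see how bs and count interact without touching your disks, try a throwaway file:

dd if=/dev/zero of=test.img bs=512 count=4

That writes four 512-byte blocks, i.e. a 2048-byte test.img.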

> Where is mbrsda written?  How would I see it?

It is a file created in whatever directory you were in when you ran the command (the current working directory).
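
You can confirm it with ls -l mbrsda in that directory. As for restoring it later: just reverse the direction of the copy, e.g.:

dd if=mbrsda of=/dev/sda bs=512 count=1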

> I would assume that if /dev/sdb were blank, it would overwrite the existing good MBR with nothing.

I don't get what you mean by "blank". Since the system could boot from it, it is obviously not "blank".

Your RAID consists of /dev/sda3 (HD 1) and /dev/sdb3 (HD 0).

From what you've told me so far:
- Your system could boot from HD 0 only (/dev/sdb)
- Your system cannot boot from HD 1 (/dev/sda)

My dd commands were just meant to fix this.

Your other problem is that your md volume is missing a member. You can add it with a command like:

mdadm --manage --add /dev/md0 /dev/sdb3  
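
(The --manage flag is optional here, by the way: mdadm assumes manage mode whenever the first option is --add, --fail or --remove, so mdadm --add /dev/md0 /dev/sdb3 does exactly the same thing.)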


Or let the system fix it:

mdadm --assemble --scan

After either command, your md volume will be reconstructed onto the newly added drive.
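
One step to be aware of with a brand-new disk: /dev/sdb3 will not exist until the new disk has a partition table. Assuming MBR partitioning (as was typical on these systems), a common way to replicate the layout from the surviving disk is:

sfdisk -d /dev/sda | sfdisk /dev/sdb

After that, the --add command above has a partition to work with.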

You can watch the completion percentage by catting /proc/mdstat

Cheers,
K.
Author

Commented:
I am putting a brand new drive into the box.

"I don't get what you mean was "blank" ? Since it could boot from it it is not obviously "blank""
I mean if it is done backwards, the drive with no MBR writing to the good one.

Kerem ERSOY, President

Commented:
So you've actually replaced the drive? And you still have HD 0 with the original MBR?

In fact, it seems that you did not have a problem with the physical drives at all.

Author

Commented:
Yes, new drive, and HD0 boots fine.
I understand that the drive is OK, but I needed a drive to clone to, to make sure I have a good backup.
