ubuntu RAID issue - Software - different data on drives
asked by bmsjeff

I have no Ubuntu experience, so I would be grateful for any help.
I was getting this error:
/dev/sda1 clean
fsck 1.40.8
/dev/md0 contains a file system with errors, check forced
Duplicate or bad block in use!
/dev/md0: Multiply-claimed block(s) in inode 8: 24576
/dev/md0: Multiply-claimed block(s) in inode 4981200: 24576

I ran:
df -h
/dev/sda1      19G      749M used      17G available
/dev/md0      128G       27G used      95G available

cat /proc/mdstat
md0 : active raid1 sdb3[1]
      134801344 blocks [2/1] [_U]

I ran DLGDIAG on both drives and they are ok.
There was a bad stick of memory that has been removed.

I ran a Live CD on Drive0:
fsck
Errors corrected on Drive0
Drive boots properly

Live CD on Drive1:
fsck
Received error "No usable shell was found on your root file system" on Drive1

When I look at Drive0, there is no data on it newer than 5/2/11.
The data was there before I pulled the drives.
I assume this is a problem with the RAID.

I ran DiskInternals and saved the Drive1 data to my Windows PC

When I try to boot from Drive1, it just sits there; nothing happens.

What steps should I take?

If I run this on Drive0 alone I get:
cat /proc/mdstat
md0 : active raid1 sdb3[0]
      134801344 blocks [2/1] [U_]

Could Drive1 have a GRUB problem? I'm not sure what steps to take next.
bmsjeff (ASKER):

I have placed the drives back into the box and all my folders are there.
Drive0 seems to be the problem.
Drive1 appears to have the data, although it will not boot.

sudo mdadm -D /dev/md0
        Version : 00.90.03
     Raid Level : raid1
     Array Size : 134801344 (128.56 GiB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0
         Events : 0.1614

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed


sudo mdadm -E /dev/sda3
       Checksum : f7866587 - correct

         Number   Major   Minor   RaidDevice State
this       0       8        3        0      active sync   /dev/sda3

           0       8        3        0      active sync   /dev/sda3
           1       0        0        1      faulty removed

Where do I go from here?
Kerem ERSOY:

Hi,

First of all, when you have 2 drives, only the first drive will have the boot record. This is why your Drive0 boots while Drive1 can't. But you can transfer the MBR to the second drive with the command below:

dd if=/dev/sda of=/dev/sdb bs=512 count=1
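Before running that, it would be wise to keep a copy of the current MBR in a file so you can put it back if anything goes wrong. A minimal sketch (the file lands in whatever directory you run it from):

dd if=/dev/sda of=mbrsda bs=512 count=1   # save sda's first 512-byte sector to a file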


That said, let's come to your problem. Though you have a RAID 1 array, it seems one of your drives was dropped from the RAID and has been outdated since:

Number Major Minor RaidDevice State
0             8         3          0           Active sync /dev/sda3
1             0         0          1           removed


This shows that your volume is running with only one drive.

So, since the drive is actually intact but outdated, we need to re-add it to the RAID with a command such as this:

mdadm --manage --add /dev/md0 /dev/sdb3  

The missing device is /dev/sdb, judging from the info you've provided:

cat /proc/mdstat
md0 : active raid1 sdb3[0]
      134801344 blocks [2/1] [U_]


After you've re-added the drive, check the output of /proc/mdstat and make sure that the array is rebuilding.
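If you want to follow the progress without re-typing, something like this should work (watch simply re-runs the command every 2 seconds):

watch -n 2 cat /proc/mdstat   # shows the resync percentage climbing to 100%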

Cheers,
K.




ASKER CERTIFIED SOLUTION
Kerem ERSOY
bmsjeff (ASKER):

So I understand:
dd if=/dev/sda of=mbrsda bs=512 count=1
*if - reads from the first mass storage device (/dev/sda)
*of - writes what was read to a file called mbrsda
*bs - sets the block size to 512, since the MBR is 512 bytes in size
*count - ???not sure???

Questions:
Where is mbrsda written?  How would I see it?

dd if=/dev/sdb of=/dev/sda bs=512 count=1
*if - reads from the second mass storage device (/dev/sdb)
*of - writes what was read from the second device onto the first device
*bs - sets the block size to 512, since the MBR is 512 bytes in size
*count - ???not sure???

I would assume that if /dev/sdb were blank, this would overwrite the existing good MBR with nothing.

Assuming something goes wrong down the road, how would you restore the mbrsda file that was saved?

bmsjeff (ASKER):

ok, trying to wrap my head around this:

$ grep /dev/md /etc/fstab
# /dev/md0

$ df -h / /home
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              19G  2.7G   15G  16% /
/dev/md0              128G   43G   79G  36% /home

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda3[0]
      134801344 blocks [2/1] [U_]
     
unused devices: <none>

$ sudo mdadm --query --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Jul  8 19:18:47 2008
     Raid Level : raid1
     Array Size : 134801344 (128.56 GiB 138.04 GB)
  Used Dev Size : 134801344 (128.56 GiB 138.04 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed May 18 21:34:32 2011
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 0553725a:cf84abcf:68ceeaf4:7264baec
         Events : 0.48058

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed


So, I think, I have a single drive with one RAID1 partition (md0).
The second drive is obviously missing.
I am going to install a new, unformatted drive into the box.

Until I add it to the array, this should still give me something like:
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda3[0]
      134801344 blocks [2/1] [U_]

I would then run
sudo mdadm --add /dev/md0 /dev/sdb3

or would I use this command?
sudo mdadm --manage --add /dev/md0 /dev/sdb3

What is the difference?
Let me know if I am missing anything.
Kerem ERSOY:

> *count - ???not sure???

count is how many blocks of bs bytes will be read; without it you'd copy the whole disk! Since the MBR sits in the first sector (512 bytes), bs=512 count=1 transfers only the very first 512 bytes. (For example, bs=512 count=4 would copy the first four sectors, i.e. 2048 bytes.)

> Where is mbrsda written?  How would I see it?

It's a file that will be created in the current working directory, i.e. wherever you ran the command from.
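And to answer your restore question: presumably you would just reverse if and of. A sketch:

dd if=mbrsda of=/dev/sda bs=512 count=1   # put the saved sector back over sda's MBR
dd if=mbrsda of=/dev/sda bs=446 count=1   # or restore only the boot code, leaving the partition table alone

(The first 446 bytes of the MBR are boot code; the remaining 66 bytes hold the partition table and signature.)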

> I would assume that if dev/sdb was blank, then it would overwrite the existing good mbr with nothing.

I don't get what you mean by "blank". Since you could boot from it, it is obviously not "blank".

Your RAID consists of /dev/sda3 (HD 1) and /dev/sdb3 (HD 0).

From what you've told me so far:
- Your system can boot from HD 0 only (/dev/sdb)
- Your system cannot boot from HD 1 (/dev/sda)

My dd command was just to fix this.

Your other problem is that your md volume is missing a partition. You can add it with a command:

mdadm --manage --add /dev/md0 /dev/sdb3  


Or let the system fix it:

mdadm --assemble --scan

After either command, your md volume will be reconstructed on the newly added drive.
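(On your earlier question: as far as I know, --manage is the default mode when the first argument is an md device, so both forms of the --add command do the same thing.)

One caveat: a brand-new disk has no partitions, so there is no /dev/sdb3 to add until you create it. A minimal sketch, assuming the new disk really shows up as /dev/sdb (double-check with fdisk -l first):

sfdisk -d /dev/sda | sfdisk /dev/sdb      # copy the partition layout from the good disk
mdadm --manage --add /dev/md0 /dev/sdb3   # then add the mirror partition back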

You can watch the completion percentage by catting /proc/mdstat

Cheers,
k.

bmsjeff (ASKER):

I am putting a brand new drive into the box.

"I don't get what you mean was "blank" ? Since it could boot from it it is not obviously "blank""
I mean if it is done backwards, the drive with no MBR writing to the good one.

Kerem ERSOY:

So you've actually replaced the drive, and you still have HD 0 with the actual MBR?

In fact, it seems you did not have a problem with the physical drives at all.

bmsjeff (ASKER):

Yes, new drive, and HD0 boots fine.
I understand that the drive is OK, but I needed a drive to clone to make sure that I have a good backup.
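For the clone itself, one option is a raw dd copy from a Live CD (so nothing is mounted). A minimal sketch, assuming sda is the good disk and sdb the new one; triple-check the device names first, since swapping if and of would wipe the good disk:

dd if=/dev/sda of=/dev/sdb bs=64K conv=noerror,sync   # raw-copy the whole good disk onto the new one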