  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 336

HDD becomes read only

Dear Experts,

We have some servers hosted by Peer1, and those servers hold our web-based database program.
Yesterday, one of the servers went down. They were able to bring it back up in about 2 hours, but this disrupted our operation severely.
When asked for the cause, this is the explanation I got.

"The system went into read-only mode. This doesn't appear to be from the RAID array but most likely a system error that protected your file system. It's best that you have RAID 1, as this can prevent a drive failing and losing the data on the server.

The following is the output from the system. The drives are not brand new, but their hours are acceptable, and I am confident in the hardware on your solution.

Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8124

Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13388

u0    RAID-1    OK             -       -       -       232.82    Ri     ON

p0    OK             u0   233.81 GB SATA  0   -            WDC WD2503ABYX-01WE
p1    OK             u0   233.81 GB SATA  1   -            WDC WD2503ABYX-01WE     "

I don't understand why we even bother to have RAIDs if this could happen.
These servers are supposed to have brand new HDDs.
Is this something that could happen regularly, and how can we prepare for it?

Please advise.
Asked by: yballan
 
pgm554 Commented:
Well, you need to ask yourself what high availability is worth in terms of $$.

There are a lot of ways to achieve high availability, i.e. 99.999% uptime (clusters and VM failovers).

But the solutions are not cheap.

RAID has very little to do with high availability.

High availability comes from redundant hardware/software and how it's configured.
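
To put the 99.999% figure in perspective, here is a minimal sketch of the arithmetic. It assumes a 365-day year and that availability is measured over a full year; the function names are made up for illustration. Five nines works out to roughly 5.26 minutes of downtime per year, while a single 2-hour outage like the one described in the question, if it were the only one that year, corresponds to about 99.977%.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def availability_percent(downtime_minutes: float) -> float:
    """Availability over one year, given total downtime in minutes."""
    return (1 - downtime_minutes / MINUTES_PER_YEAR) * 100


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        print(f"{target}% uptime allows {allowed_downtime_minutes(target):.1f} min/year")

    # The 2-hour outage from the question, treated as the year's only downtime.
    print(f"One 2-hour outage: {availability_percent(120):.3f}% availability")
```

Each extra "nine" cuts the downtime budget by a factor of ten, which is a large part of why the price climbs so quickly.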
 
Philip Elder (Technical Architect - HA/Compute/Storage) Commented:
Is the drive setup software (host-based driver configuration) or an actual RAID on Chip (hardware accelerated)?

IMNSHO, quite frankly, a host-based RAID setup plus SATA drives is a recipe for death.

That being said, there should be some events in the server's logs that could indicate what was happening to bring about the full-stop. If the RAID is indeed hardware-based, then there very well could be some logs in the controller's onboard log setup.
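
As a rough aid for catching this earlier next time, the sketch below lists filesystems that are currently mounted read-only. It assumes the server runs Linux (the thread does not say), where filesystems such as ext3/ext4 mounted with `errors=remount-ro` flip to read-only after an error, and it reads the standard `/proc/mounts` format. The function name `read_only_mounts` is made up for this example.

```python
def read_only_mounts(mounts_text: str) -> list[str]:
    """Return mount points whose option list contains 'ro'.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Match the exact 'ro' flag, not options such as 'errors=remount-ro'.
        if "ro" in options:
            result.append(mount_point)
    return result


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            print(read_only_mounts(f.read()))
    except FileNotFoundError:
        print("/proc/mounts not found; this does not look like a Linux host")
```

Some mounts are read-only by design, so a monitoring check would normally compare the result against the filesystems that are expected to be writable (for example `/` or the database volume) and alert only on those.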

Make sure your backups (are there any?) are good!

To answer your question more to the point: Our last SATA-based server went out the door about 4 years ago. Our last host-based RAID setup went out the door around 5 or 6 years ago.

There were some very good reasons we stopped using SATA and host-based RAID:
 + Server would go full-stop if a member of the array died
 + Data would be corrupted beyond recoverability (prior to restore from backup)
 + SATA did not, and does not, have the ability to communicate problems
 + Firmware compensation (WD RE, Seagate ES) for RAID was/is flaky

Is the excuse given for the full-stop reasonable? Given my own experience, how else is one going to explain a full-stop that is virtually impossible to explain?

If this setup is so critical then it may be time to look at migrating to something more robust as suggested above.

Philip
 
yballan (Author) Commented:
Dear pgm554,
Thank you for your response. As I am still living in the RAID SATA world, I am not familiar with what you are referring to as a "high availability" solution. If I still wanted a hosting company to host our servers, does this mean that I need to look for a company that offers redundant HW/SW?

Dear MPECSInc,
Thank you for your response. Now I realize that our hosting company is using an older technology. I am not quite sure what you mean by "something more robust as suggested above". Are you referring to redundant HW/SW? Are there any services you recommend?

To Both Experts,
I am clearly not up to speed in this subject, so I would appreciate any recommendation/guidance.
Thank you!!
 
Philip Elder (Technical Architect - HA/Compute/Storage) Commented:
Yes, I am referring to a cluster setup or at least a more robust server platform.

One needs to keep in mind that once equipment gets into the four-to-five-year life range, failures are exponentially more likely to happen. Tier 1 charges substantially for warranties into this timeframe for a reason.
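
For reference, the Power_On_Hours values the hosting company quoted in the question can be converted to years with a few lines. This is only a sketch: it assumes the raw value in the last column really is in hours (most drives report it that way, but not all) and that the drives have been powered continuously. The function name is made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def power_on_years(smart_line: str) -> float:
    """Convert the raw value (last column) of a Power_On_Hours line to years."""
    raw_hours = int(smart_line.split()[-1])
    return raw_hours / HOURS_PER_YEAR


# The two attribute lines quoted in the original question.
lines = [
    "Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8124",
    "Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13388",
]

if __name__ == "__main__":
    for line in lines:
        print(f"{power_on_years(line):.2f} years powered on")
```

That comes to roughly 0.93 and 1.53 years of power-on time, so the drives are not brand new but are still well short of the four-to-five-year range.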

Philip
 
pgm554 Commented:
"High availability"

I will try to be brief.

If you can cluster the app, you have two servers that can service the app in case one goes down.
Usually in the M$ world, that means a primary and a backup with a shared piece of storage where the DB would reside.
If the primary goes down, the cluster software brings the secondary server online to service the DB.
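
The primary/secondary idea can be shown with a toy model. This is not how Microsoft failover clustering or any real cluster product is implemented; the `Node` class, the `MAX_MISSED` threshold, and the node names are all hypothetical, and a real cluster also has to deal with shared storage, quorum, and fencing.

```python
from dataclasses import dataclass
from typing import Optional

MAX_MISSED = 3  # hypothetical: missed heartbeats before a node counts as down


@dataclass
class Node:
    name: str
    missed_heartbeats: int = 0

    def is_up(self) -> bool:
        return self.missed_heartbeats < MAX_MISSED


def active_node(primary: Node, secondary: Node) -> Optional[Node]:
    """Pick the node that should serve the DB; prefer the primary."""
    if primary.is_up():
        return primary
    if secondary.is_up():
        return secondary
    return None  # both nodes down: the service is offline


if __name__ == "__main__":
    primary, secondary = Node("db1"), Node("db2")
    print(active_node(primary, secondary).name)  # primary serves while healthy

    primary.missed_heartbeats = MAX_MISSED       # primary stops responding
    print(active_node(primary, secondary).name)  # secondary takes over
```

The point of the sketch is only that the service stays up as long as one node is healthy, which a single server with RAID cannot offer.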

I have done quite a few in my pro career and it does do the job.

You just need to factor in the costs of being offline vs the higher maintenance for the added services.

Lots of ways to do this, but the trend is toward redundant virtual machines from folks like VMware and M$ Hyper-V.
 
yballan (Author) Commented:
Thank you both for educating me on this matter, I really appreciate it!
Question has a verified solution.
