Question

Yellow lights on, new disks failing... Is that usual on two-year-old, almost brand new equipment?

Asked by: JoseParrot

Hello experts,

My company installed a CX500 + 2xPE6850 + 2xBrocade + 4CPU PowerPath in 2006.
At that time it was only for a consolidation of 5 MS SQL Servers: 3 trays in total with a couple of LUNs, all of them in RAID-5.
Last year and this year, a number of other things have been stored there.
A sample list follows:
- 380 GB SCSI R-5 with Veritas File System under Solaris 8
- 400 GB SCSI R-5 for Oracle RAC under Solaris 11
- 400 GB SCSI R-5 with NTFS for Notes under Windows 2003
- 400 GB SCSI R-5 with NTFS for SQL, used by Veritas Replicator, under Windows 2000
- 200 GB ATA no RAID for Red Hat Linux
- 500 GB ATA R-5 for VMWare
- etc.; a very heterogeneous environment

The servers accessing this SAN range from a single Xeon to 4x dual-core Xeon, 4x RISC Sun, and 2x PA-RISC.

People around me criticize this "macarronade" as the worst possible combination, bound to cause bad performance and erroneous behavior. Of course I'd like to have all my systems running the same OS and the same DBMS, one system per tray... but the world isn't as we want...

The problem, and I hope the experts have a hint, a guess, or even first-hand experience, is that the erroneous behavior seems to support the critics.
In the last 30 or 40 days, every week the yellow (orange?) lights turn on, one disk dies, the spare takes over, the Dell people replace the failed disk, and the cycle repeats again and again.

What are the probable causes, and how can the system be repaired?

Jose

Asked on: 2008-05-19 at 13:46:13 (ID 23415130)
Tags: EMC, Clariion CX500, CX500, SAN with 2 x Dell PowerEdge 6850 + 2 x Brocade SW4100
Topics: Storage Technology, Hard Drives & Storage
Participating Experts: 2
Points: 500
Comments: 8

Answers

 

by: connollyg, posted on 2008-05-20 at 02:32:22, ID: 21604541

Jose,

What do the disks fail with?
Are the disks that fail the SATA ones?
What do the Dell engineers say about the failures?

 

by: rindi, posted on 2008-05-20 at 08:31:59, ID: 21607147

Two years old isn't brand new; in IT that is almost ancient. Disks can always go bad: they are among the most sensitive pieces of hardware in a computer, as they have moving parts. I'm not saying that it is normal, but it happens. Very often, though, failures are caused by bad cabling, bad termination, or bad power supplies.

 

by: JoseParrot, posted on 2008-05-20 at 08:45:03, ID: 21607290

The failing disks are SCSI 146 GB FC, 15,000 rpm.
In the first incident, Dell support simply hot-swapped the failing disk at Bus0 Enclosure 3.
The second time, they did the same at Bus0 Enclosure 2 and opened a ticket with their EMC support, because they didn't know the root cause.
On the third occurrence, they escalated to EMC, which will do the same (tomorrow), this time asking for a total shutdown.

I don't know about any software fixes, but they spent some time checking configurations. To my surprise, we discovered that the "firmware" runs on a modified Microsoft Windows...

I have noticed that the two enclosures that have had problems share a RAID group with disks in both enclosures, as in the figure below: RG18, from Enclosure 2, has a disk in Enclosure 3. Is it a mere coincidence that the problem occurs in exactly these enclosures?

Jose

 

by: JoseParrot, posted on 2008-05-20 at 09:08:20, ID: 21607578

rindi,

The Dell people didn't report anything wrong with the components, except for the failing disks.
In the past we had 4 Sun StorEdge T3 arrays. All of them had failures and malfunctioning disks. I think the other Sun customers must have been very happy, because I was responsible for the bad part of the average MTBF.

Despite the moving parts, Sun claims a better MTBF for disks than for HBAs, as per the table below:
Component        Sample MTBF (hours)
HBA 1            800,000
Link A           400,000
Concentrator 1   580,000
Disk A           1,000,000
as in the paper at http://www.sun.com/blueprints/0602/816-5132-10.pdf.
The lower figures are probably due to the connectors and cabling...
I don't have the equivalent data from Dell.

Jose
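
For anyone who wants to combine the figures in that table, here is a minimal Python sketch. It assumes the four components sit in series on one I/O path (the path fails as soon as any one of them fails) and that failures are independent with constant rates; neither assumption comes from the Sun paper, and the function name is only illustrative.

```python
# Component MTBF figures (hours) as quoted in the table above.
COMPONENT_MTBF_HOURS = {
    "HBA 1": 800_000,
    "Link A": 400_000,
    "Concentrator 1": 580_000,
    "Disk A": 1_000_000,
}


def series_mtbf(mtbf_hours):
    """MTBF of a chain that fails as soon as any one component fails.

    Assumption: independent components with constant failure rates,
    so the individual failure rates (1 / MTBF) simply add up.
    """
    total_failure_rate = sum(1.0 / hours for hours in mtbf_hours)
    return 1.0 / total_failure_rate


path_mtbf = series_mtbf(COMPONENT_MTBF_HOURS.values())
print(f"Single I/O path MTBF: {path_mtbf:,.0f} hours")
```

Under these assumptions the whole path comes out at roughly 154,000 hours, well below the weakest single component (the 400,000-hour link).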

 

by: connollyg, posted on 2008-05-20 at 23:32:59, ID: 21612530

Jose,
But the MTBF of the whole system is much lower than the MTBF of the individual parts. For disks, IIRC, you divide the MTBF of one disk by the number of disks to get the actual figure for the whole set, which means people with a large number of disks (500 or so) can expect at least one disk to fail every week, and even more if they are SATA disks that are run above their rated duty cycle (normally about 40%).
G
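
To put numbers on that rule of thumb, here is a minimal Python sketch. It assumes identical, independent disks with a constant failure rate and reuses the 1,000,000-hour datasheet figure quoted earlier in the thread; the function names are illustrative only.

```python
HOURS_PER_WEEK = 7 * 24
HOURS_PER_YEAR = 365 * 24


def fleet_mtbf(disk_mtbf_hours, disk_count):
    """Mean time between failures across the whole set of disks."""
    return disk_mtbf_hours / disk_count


def expected_failures_per_year(disk_mtbf_hours, disk_count):
    """Expected number of disk failures per year for the whole set."""
    return disk_count * HOURS_PER_YEAR / disk_mtbf_hours


def implied_disk_mtbf(disk_count, hours_between_failures):
    """Per-disk MTBF that an observed failure interval would imply."""
    return disk_count * hours_between_failures


hours = fleet_mtbf(1_000_000, 500)
print(f"500 disks: one failure every {hours:,.0f} hours "
      f"(about {hours / HOURS_PER_WEEK:.0f} weeks)")
print(f"Expected failures per year: "
      f"{expected_failures_per_year(1_000_000, 500):.2f}")
print(f"Per-disk MTBF implied by weekly failures: "
      f"{implied_disk_mtbf(500, HOURS_PER_WEEK):,} hours")
```

With the datasheet figure, 500 disks work out to one failure every 2,000 hours, or about 12 weeks. A failure every week across 500 disks would correspond to an effective per-disk MTBF of about 84,000 hours, so the weekly estimate only holds if field reliability is far below the datasheet number.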

 

by: JoseParrot, posted on 2008-05-23 at 07:20:07, ID: 21632347

connollyg,
Thanks for your comment.
You're right. My understanding from the Sun paper was that those MTBFs were for individual components. Another article from HP shows the same 1,000,000-hour MTBF for SCSI disks.

What I suspected is that the Dell CX500 disks weren't the cause of the failures. My guess on Thursday was that the Dell (and then EMC) tech people had made a wrong diagnosis. Yesterday they confirmed my guess: after they replaced the "defective" disks, the problem appeared again, this time in other disks, and they decided to replace one of the trays. They are working on it right now.

I would appreciate your help in clarifying the central question: is such a "salade" (mixing RAID groups in the same enclosure, RAID groups with disks in two different enclosures, an extremely heterogeneous environment: Linux, Windows Server 2000 and 2003, Sun Solaris 10, VMware, all accessing the storage) reason enough to cause instability or malfunctioning of our CX500?

Once again, thank you all for the attention.

Jose

 

by: connollyg, posted on 2008-05-24 at 00:16:04, ID: 21638146

Jose,
There is nothing really wrong with the "salade" that you have; it's just that it may not be the optimum layout for performance and/or availability!
In general it's normal practice to spread a RAID set across the enclosures, for performance (depending on the number of back-end channels on the RAID controller) and also for availability (if you have three enclosures and a three-disk RAID-5 set with one disk in each enclosure, you could survive an enclosure failure [not a very common occurrence, though]).
But if it is working now and you are happy with the performance, I would advise caution in making any wholesale changes.
G
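
The availability argument can be checked with a small Python sketch. The layouts below are hypothetical (the thread does not give the exact disk placement of any RAID group), and the function name is illustrative; the only rule encoded is that RAID-5 tolerates the loss of one disk.

```python
from collections import Counter


def survives_enclosure_failure(disk_enclosures, tolerated_disk_losses=1):
    """Return True if the RAID group stays online after losing any one enclosure.

    disk_enclosures: the enclosure label of each disk in the group.
    tolerated_disk_losses: 1 for RAID-5, which survives one missing disk.
    """
    if not disk_enclosures:
        raise ValueError("a RAID group needs at least one disk")
    disks_per_enclosure = Counter(disk_enclosures)
    # The worst case is losing the enclosure that holds the most disks.
    return max(disks_per_enclosure.values()) <= tolerated_disk_losses


# Hypothetical three-disk RAID-5 set with one disk in each enclosure.
spread_out = ["Enclosure 1", "Enclosure 2", "Enclosure 3"]

# Hypothetical five-disk RAID-5 set with four disks in one enclosure
# and a single disk in the neighbouring one.
mostly_one = ["Enclosure 2"] * 4 + ["Enclosure 3"]

print(survives_enclosure_failure(spread_out))
print(survives_enclosure_failure(mostly_one))
```

The first layout survives the loss of any single enclosure; the second does not, because losing the enclosure that holds four disks takes out more disks than RAID-5 can tolerate.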

 

by: JoseParrot, posted on 2008-05-27 at 22:24:16, ID: 21657858

Thanks for the advice.
Last weekend the EMC tech people replaced more disks, updated the firmware, and made other configuration adjustments. It seems to be OK.
In fact, even after the sequence of failures, no system crashed and not a single byte was lost.

Thank you,
Jose
