Solved

A whole iSCSI SAN failure?

Posted on 2013-02-07
Last Modified: 2013-02-09
Hi,

Has anyone experienced a whole SAN failure for any reason?
The general thinking that "anything can fail" makes a person uncomfortable with
having everything in a single SAN.
I was told to double up the SAN, which means having two identical SANs with some sort of replication to protect data in case of a whole SAN failure.
Each SAN would also have multipath I/O, dual controllers, dual switches, etc.
Is that overkill?

Thanks for any ideas out there.
Question by:nothienthu
2 Comments
 
LVL 4

Expert Comment

by:tpitch-ssemc
ID: 38864008
Everything you listed I would call normal. I'd make sure you also have multiple arrays with hot spares at a minimum. Depending on budget, I'd also get another SAN and replicate everything over for your DR plans. I will not run a SAN unless I have 2 controllers connected to different switches.

For example, I had an EMC VNX that had 2 controllers with 4 iSCSI NICs each. Of the 4 iSCSI NICs on one controller, I would have 2 going to switch A and 2 going to switch B, then I would do the same thing for the other controller. That way, if I lost a switch, I would only lose 50% of our paths.
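
To make the 50% figure concrete, here is a rough Python sketch of that cabling. The controller and switch labels are made up for illustration, and it only counts array-side ports (the number of paths a host actually sees also depends on its own NICs and MPIO setup, which the post doesn't describe).

```python
from itertools import product

# Hypothetical labels: two controllers with four iSCSI ports each, two switches.
CONTROLLERS = ["SPA", "SPB"]
PORTS_PER_CONTROLLER = 4
SWITCHES = ["switch-A", "switch-B"]

def build_paths():
    """Return (controller, port, switch) tuples; ports alternate between switches."""
    paths = []
    for ctrl, port in product(CONTROLLERS, range(PORTS_PER_CONTROLLER)):
        # Even ports go to switch A, odd ports to switch B: 2 + 2 per controller.
        paths.append((ctrl, port, SWITCHES[port % 2]))
    return paths

def surviving(paths, failed_switch=None, failed_controller=None):
    """Filter out every path that crosses the failed component."""
    return [p for p in paths
            if p[2] != failed_switch and p[0] != failed_controller]

paths = build_paths()
after_switch_loss = surviving(paths, failed_switch="switch-A")
print(len(paths), len(after_switch_loss))  # 8 4
```

Losing one switch halves the path count, but both controllers stay reachable, which is the point of splitting each controller's ports across the two switches.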

Of course 2 SANs are better than 1.
 
LVL 10

Accepted Solution

by:
millardjk earned 500 total points
ID: 38867714
Yes, whole array failures occur. They are typically unrelated to the hardware and are instead due to some unknown bug in the firmware that causes catastrophic data loss.
Such failures often trigger cascading data loss in replicated arrays (garbage in, garbage out), and the only thing saving the businesses that use them is a good set of backups.
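
As a toy illustration of "garbage in, garbage out" (not a model of any real array or replication product; the block names and the `replicate` helper are invented), replication mirrors whatever the source currently holds, while a backup taken before the fault stays clean:

```python
import copy

# Toy "array": block name -> contents.
primary = {"block-1": "good", "block-2": "good"}

# Point-in-time backup taken before anything goes wrong.
backup = copy.deepcopy(primary)

def replicate(source):
    """Hypothetical replication step: mirror the source as-is, valid or not."""
    return copy.deepcopy(source)

# Assume a firmware bug silently corrupts a block on the primary.
primary["block-2"] = "corrupt"

# The replica faithfully copies the corruption; the backup does not.
replica = replicate(primary)
print(replica["block-2"], backup["block-2"])  # corrupt good
```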

Those tend to be doomsday scenarios. You can read about them occurring, however, so it is something to assign a level of risk to when designing for failure scenarios.

So yes, consider duplicate SANs as one mode of risk avoidance; having duplicate connectivity gear and multiple paths is another; multiple hosts running hypervisors and a UPS backed by a generator are yet others. It all comes down to how much cash you can afford to spend on them, and whether there is additional value (like providing more capacity or performance) beyond that of eliminating a single point of failure.
