Need Design Document for AlwaysOn

I've been tasked with preparing an architectural design for AlwaysOn high availability, covering both Windows Server failover clustering and SQL Server 2012, for a 24/7 operation.

We are in the process of designing our enterprise for 24/7 operation with no or limited downtime.

Can somebody provide me with a sample design document or point me in the right direction?
FavorableAuthor Commented:
Our environment is VMware ESX on NetApp storage.
OS: Windows Server 2008 R2 Enterprise SP1
SQL Server 2012 SP1
Mark WillsTopic AdvisorCommented:
There are whitepapers and design considerations on the Microsoft website. Have you checked them out, or are you after something different?

Some of it will depend on the edition you have. While Standard edition supports basic levels of "AlwaysOn", it is Enterprise edition that truly extends the philosophy into a more robust environment. To understand the differences, have a quick read of :

There are also a few blogs via TechNet and videos via youtube...

For Blogs have a read of :  and the team blog (lots there)
For whitepapers see : (you need to download the actual paper) but also note on the left hand side list of topics a couple of other entries...

For MS sites, a lot of the HA discussion starts at : and has links - for example the latest "buzz" in HA in SQL2012 is AlwaysOn + Availability Groups

And of course, there is the "home page" for high availability :

So, lots of reading...
FavorableAuthor Commented:
I can tell it's going to require lots of planning.
Mark WillsTopic AdvisorCommented:
*laughing* yep. MS always seem to make it sound easier...

Make sure you have your own Disaster Recovery (DR) plans clearly defined in plain English and get management sign off first. That should cover how long a delay you can sustain in a DR situation.

Also, have clearly defined Service Level Agreements (SLA) with the user population for uninterrupted service supply. Make very sure you include some time for scheduled outages for housekeeping purposes. Would also suggest the inclusion of "test drills" where you actually try some of the failover techniques.
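To make the SLA discussion concrete, it can help to turn an availability target into an allowed-downtime budget. Below is a minimal Python sketch of that arithmetic. The function name and the sample targets are illustrative assumptions, not figures from this thread, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 8760 hours


def downtime_budget_hours(availability_pct: float) -> float:
    """Return the allowed downtime in hours per year for an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Sample targets are hypothetical; substitute the figures from your own SLA.
    for target in (99.0, 99.9, 99.99):
        hours = downtime_budget_hours(target)
        print(f"{target}% -> {hours:.2f} h/year ({hours * 60:.0f} min/year)")
```

Scheduled housekeeping outages and failover drills come out of the same budget unless the SLA explicitly excludes them, so it is worth agreeing on that point when the target is set.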

You have a good start with the Enterprise edition of Windows. Next is to double check what level of service you need to support for SQL Server. You *might* need to go Enterprise for that as well.

That's why you need to first have those plain English DR and SLA documents, and then decide the best way to fulfil those requirements and then decide the correct licenses.

Or, just get Enterprise (both Windows and SQL) and have the more advanced options always available.

We see a lot of people buy licenses first and decide afterwards, only to be limited (and frustrated) by the license. Always best to plot, plan, get agreements, then start the nitty gritty.
Gerald ConnollyCommented:
I hope you have deep pockets!!!!!

As Mark said, make sure you have the SLAs really tied down before you start.

If it's really got to be 24/7 with little or no downtime, then it's going to have to be a split-site cluster with everything redundant and everything dual-pathed. Think of all those disaster scenarios: power failure, external flooding, internal flooding, air-con failure, etc.

Think right down to the smallest detail, e.g. is it a single dual-port HBA or two single-port ones?

The more paranoid would use not just one dual-fabric SAN but two dual-fabric SANs from different vendors.

and the list just goes on and on
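As a rough illustration of why redundancy and dual-pathing matter, here is a small Python sketch of the usual series/parallel availability arithmetic. It assumes component failures are independent, which is a strong assumption: a shared cause such as flooding or a site power failure takes out both copies at once, and that is the argument for a split site. The function names and the figures are hypothetical.

```python
from typing import Iterable


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures."""
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The group is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


def series_availability(availabilities: Iterable[float]) -> float:
    """Availability of a chain in which every component must be up."""
    product = 1.0
    for availability in availabilities:
        product *= availability
    return product


if __name__ == "__main__":
    single_path = 0.99  # hypothetical figure for one HBA/fabric path
    dual_path = parallel_availability(single_path, 2)
    print(f"one path: {single_path:.4f}, two independent paths: {dual_path:.4f}")
    # A chain is only as good as the product of its links.
    print(f"two single-path links in series: {series_availability([single_path] * 2):.4f}")
```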
FavorableAuthor Commented:
I can't tell you all how much I really appreciate all the suggestions.

Whatever you think I need to know to design this solution, please share it.
Mark WillsTopic AdvisorCommented:
Well that is a BIG task, and not sure we can do that within the scope of a question. There are sooo many things to consider.

For example...

If you want next to zero data loss and automatic failover, then you have two choices:
1) AlwaysOn with Availability Groups + synchronous commit
2) Database Mirroring + synchronous mode + witness

But they have a big impact on bandwidth and on physical network (storage) design.

So asynchronous might be a viable option, except that it has no automated failover.

If you want automated failover, then there is one more choice in addition to the two above:
3) AlwaysOn failover cluster instance
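The options above can be summarised as a small lookup. The Python sketch below encodes only what this thread states about each option; the class and function names are hypothetical, and a property the thread does not characterise is recorded as None rather than guessed. It is a reading aid, not a design tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HaOption:
    name: str
    automatic_failover: bool
    # None means the thread does not characterise this property.
    near_zero_data_loss: Optional[bool]


# Summary of the options as described in this thread (not exhaustive).
OPTIONS = [
    HaOption("AlwaysOn Availability Groups + synchronous commit", True, True),
    HaOption("Database Mirroring + synchronous mode + witness", True, True),
    HaOption("AlwaysOn failover cluster instance", True, None),
    HaOption("Asynchronous commit", False, None),
]


def shortlist(need_automatic_failover: bool, need_near_zero_data_loss: bool) -> List[str]:
    """Return the option names the thread explicitly lists for the given needs."""
    result = []
    for option in OPTIONS:
        if need_automatic_failover and not option.automatic_failover:
            continue
        # Only keep options the thread explicitly describes as near zero data loss.
        if need_near_zero_data_loss and option.near_zero_data_loss is not True:
            continue
        result.append(option.name)
    return result


if __name__ == "__main__":
    print(shortlist(need_automatic_failover=True, need_near_zero_data_loss=True))
    print(shortlist(need_automatic_failover=True, need_near_zero_data_loss=False))
```

Which requirements to pass in is exactly what the DR plan and SLA documents should settle first.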

Now, SQL Server failover clustering depends on a correctly configured Windows Server Failover Cluster (so check that first). Also, SQL Server 2012 is the first version to accommodate TempDB on local disk in failover cluster configurations (that's a huge improvement for I/O).

Then on the hardware side there are choices for shared drives, and NAS has been shown to be slow compared to direct-attached SSD drives. That is a very important consideration for latency, especially if going asynch.

You really need to understand some of the choices - or (more accurately) limit the choices according to your SLA's and DR plans. And it is not an easy exercise for the uninitiated.

Remember that whitepaper from the MSDN site (prior post), where I said to look at the left hand side? Well, if you did, you will have found the "AlwaysOn Architecture Guide: Building a High Availability and Disaster Recovery Solution by Using Failover Cluster Instances and Availability Groups".

There is a ton more reading, but it is not much help unless we know the "scope". For example, you might find : of benefit. It has a lot of links, and one or two are broken (well, not so much broken as superseded), but it is a great resource for getting your head around HA and DR.

Then there are SQLPASS powerpoints discussing the problem and some case studies. e.g.

So, unless there are specific questions or you can provide more background on your research, there isn't a huge amount we can do in terms of designing a solution for you. Suggesting something that then fails to meet your requirements is not a good position to be in. Alternatively, suggesting the absolute best practice would be cost prohibitive.

FavorableAuthor Commented:
Mark, I cannot thank you enough.
You have taken the time to give ample information as a guide, not to mention the links, and I appreciate it.

When my company tasked me with designing the document, I had no clue where to start. This forum immediately came to mind. I'm glad to belong here and to see how promptly questions are given attention.
FavorableAuthor Commented:
Mark did a fantastic job for me and I really appreciate him.
FavorableAuthor Commented:
Thank you so much!!!