Exchange 2010 DAG Witness Server - HELP

Vjz1
Hi,

I have a working Exchange 2010 setup here.  I have 2 sites, connected via a dedicated 30Mbps WAN connection.

I have 1 mailbox server in each site, both mailbox servers participate in 1 DAG.

Site 1 holds my Witness server.

The DAG works as expected when everything is online.  The problems start when my WAN connection breaks.

When the WAN goes down....

Site 1, with the witness server, acts as expected.  The local DB stays online, and the replicated DB mounts....as it should.

Site 2, without the witness server, is where I have the issue.  The replicated DB from site 1 goes into a 'services down' state, as expected. HOWEVER, the primary DB on the server goes into a "FAILED" state and stops service for that DB.

I assume this is because both the other mailbox server in the DAG and the witness server are unreachable from site 2?
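
The vote arithmetic behind that assumption can be sketched roughly as follows. This is a simplified majority model, not the cluster service's actual logic: it assumes each DAG member and the witness carry one vote, and that a partition stays up only with a strict majority of all votes. The function name `has_quorum` is made up for illustration.

```python
def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return votes_reachable > total_votes // 2


# Two DAG members plus one file share witness: 3 votes in total (assumption).
TOTAL_VOTES = 3

site1_votes = 2  # local mailbox server + witness
site2_votes = 1  # local mailbox server only

site1_ok = has_quorum(site1_votes, TOTAL_VOTES)
site2_ok = has_quorum(site2_votes, TOTAL_VOTES)

print(f"Site 1 keeps quorum: {site1_ok}")
print(f"Site 2 keeps quorum: {site2_ok}")
```

Under this model, the site holding the witness sees 2 of 3 votes during a WAN break, while the other site sees only 1 of 3, which matches the behavior described above.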

Well, I can only have 1 witness server, so it has to be somewhere.  Wherever it is, the other site loses its DBs during an outage.  This really stinks.

What can I do about it?

I have thought of creating 2 DAGs, with each site having a local witness server.  The only issue is that each Mailbox server can only be in 1 DAG, so I'd have to have another server at each site, just for this purpose.  Licensing alone, plus resource utilization, makes this a clunky solution.

I also found this online; it talks about an "alternate" witness server:

Set-DatabaseAvailabilityGroup -Identity DAG1 -AlternateWitnessDirectory C:\DAGFileShareWitnesses\DAG1.contoso.com -AlternateWitnessServer EXHUB3

However, I put this into my config, and I get the same results.  By the terminology it would seem this command is supposed to do what I want, but maybe it isn't.

What can I do about this?  Having my local DB go into a FAILED state just because the WAN is out is a terrible setup.

Any ideas????

Rajith Enchiparambil, Office 365 & Exchange Architect

Commented:
A two-member DAG does not enable the system to distinguish between a single server failure, a multiple-server failure and a site failure. In addition, you must use the Windows failover cluster management tools to manage a datacenter switchover for a two-member DAG that is extended across multiple datacenters.

Author

Commented:
ok well let's say I have 3 sites then.  this is my actual situation; the question i posed is from my lab testing.  in the real world, i'll be deploying to 3 sites.  so let's consider this:

Site1 - MBX1 server and Witness server on HUB1
Site2 - MBX1 server
Site3 - MBX1 server

All sites are connected via a MPLS 30Mbps WAN.

If the WAN goes out and all 3 sites are isolated, how do I prevent the DBs in sites 2 and 3 from going into a "failed" state?
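
For the three-site layout, a rough vote count looks like the sketch below. It assumes one vote per DAG member and that the witness is not counted when the member count is odd (a DAG with an odd number of members uses plain node majority). The scenario names and the `has_quorum` helper are illustrative only.

```python
def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """Strict majority of all votes is needed to stay up."""
    return votes_reachable > total_votes // 2


# Three DAG members, one per site; the witness is not counted here (assumption).
TOTAL_VOTES = 3

# Votes each site can still see (its own plus any reachable peers).
scenarios = {
    "all sites isolated": {"Site1": 1, "Site2": 1, "Site3": 1},
    "only Site3 cut off": {"Site1": 2, "Site2": 2, "Site3": 1},
}

results = {
    name: {site: has_quorum(votes, TOTAL_VOTES) for site, votes in sites.items()}
    for name, sites in scenarios.items()
}

for name, outcome in results.items():
    print(name, outcome)
```

In this model no site holds a majority when all three are isolated from each other, so the site hosting the witness would be affected as well. When only one site drops off, the two sites that still see each other keep quorum.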

Commented:
Hi,
Make sure you go through this as well; it should help you understand what Rajith was trying to tell you.
Server 2010 Planning for High Availability and Site Resilience
http://technet.microsoft.com/en-us/library/dd638104.aspx
Thanks,
Milikad

Author

Commented:
ok i have a crazy idea.

all my server roles are each on their own server; i.e. Hub, CAS, and MBX are each on their own windows server.

what if I installed the mailbox role on each site's Hub Transport server, and included that in the DAG?  Then each site would have a majority rule in the event of a WAN outage.

My primary mailbox server would detect the outage, the other mailbox server in the DAG (locally) would confirm that the other side is down, and we're up, and the DB should stay mounted.

Anyone see any problem with this strategy?

I understand split brain syndrome, but I am not utilizing automatic failover.  i have put in the switch so that the copies will never mount automatically; i will do it manually, thus I will control split brain.

I think this may work.  Anyone have any thoughts on this strategy?
Commented:
ok no, this idea didn't work.  Having 2 mbx servers per site does not solve the 1 witness server issue.
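
A rough way to see why the extra member per site does not help, assuming the two-site lab with two DAG members per site and the witness in Site 1. The `Site` class and `quorum_by_site` function are hypothetical names for this sketch, and the model is a plain majority count, not Exchange's implementation.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    dag_members: int
    has_witness: bool = False

    @property
    def votes(self) -> int:
        # One vote per DAG member, plus one if the witness lives here.
        return self.dag_members + (1 if self.has_witness else 0)


def quorum_by_site(sites):
    """Return {site name: keeps quorum?} when every site is cut off from the rest."""
    total = sum(s.votes for s in sites)
    return {s.name: s.votes > total // 2 for s in sites}


# Assumed lab layout: 2 members per site, witness in Site1 (5 votes in total).
lab = [
    Site("Site1", dag_members=2, has_witness=True),
    Site("Site2", dag_members=2),
]

result = quorum_by_site(lab)
print(result)
```

Adding members on both sides raises the total along with each site's share, so the site without the witness still ends up with 2 of 5 votes, short of a majority.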

this is a huge huge issue, and i hope MS addresses this in a future service pack.

this is such a drawback, that i think it's a showstopper for me.  there is no way i can justify using DAG, if all my databases that are in different sites than the witness server go offline during a WAN blip.  that is just terrible.

if anyone has any other ideas, please let me know.

the only way i can think of doing this now, is having 3 separate DAGs, 1 for each of my 3 sites, and having a local FSW for each location.  this means i have to license more enterprise windows licenses, more exchange licenses, and for every new office i open, i have to add another server in my main location.  that is not scalable.

i'm just out of ideas.  how are people using this facility and getting around this severe limitation??
Just a note - the witness server is only required in even number node scenarios. Above you proposed having a 3rd datacenter site. So, Site A would not need a witness server. Odd numbered scenarios are able to properly negotiate ownership.

I'm working on the same scenario myself and planning on having a 3rd Datacenter that does not contain an Exchange 2010 node, but simply serves as the site/location of my Witness server. Site A --> Site C; and Site B --> Site C. These will be connected via dedicated VPN. This --should-- be an acceptable configuration.

-Brad
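
The third-site witness layout can be sketched with the same simplified majority model. Assumptions: one DAG member in Site A, one in Site B, the witness in Site C, one vote each, and only one side can hold the witness vote at a time. The function name `site_has_quorum` is made up for this example.

```python
# Member in Site A + member in Site B + witness in Site C (assumption).
TOTAL_VOTES = 3


def site_has_quorum(sees_other_member: bool, holds_witness: bool) -> bool:
    """Count this site's own vote plus whatever else it can reach."""
    votes = 1 + int(sees_other_member) + int(holds_witness)
    return votes > TOTAL_VOTES // 2


# A-B link down, both still reach Site C: assume Site A gets the witness vote.
a_with_witness = site_has_quorum(sees_other_member=False, holds_witness=True)
b_without_witness = site_has_quorum(sees_other_member=False, holds_witness=False)

# A site that loses every link sees only its own vote.
fully_isolated = site_has_quorum(sees_other_member=False, holds_witness=False)

print(a_with_witness, b_without_witness, fully_isolated)
```

In this model the layout survives the loss of any single site or link, but a site that can reach neither its peer nor the witness still drops below a majority.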

Author

Commented:
yes, until such time comes when your offices get "isolated" for whatever reason, then all local DB's will go offline.

if they would just put in a setting that would let me, the administrator, decide when to "protect" the db's by offlining them, my issue would be solved and my DR would become much simpler.

oh well.
It's certainly a pickle.... I'm hoping to have my system online with a test mailbox shortly... I'll keep this thread updated and let you know how it works out. Hopefully I can test my Witness disk "workaround" today.

-Brad

Author

Commented:
very cool, would love to hear more experiences from people in the SMB market, who need to dual-purpose their mailbox servers to hold both active and passive DB's, for DR purposes.

i have looked at this issue 6 ways to sunday, and there is no good answer.  either don't dual-purpose your machines, or add the mailbox role to other machines, and create separate DAGs.  it's ugly.

in the end, i have decided to just use 3rd party replication technology, and in the event of a DR, i can mount that volume on my remote site mailbox server, and run the script that "moves" the mailbox location in AD, to the new server.  it's 2 steps, and not 1 like with dags, but until DAG is adjusted to be more SMB friendly, it's the best solution i have at the moment.
I've run into the exact same situation. I'm thinking of using DFSR for the file share witness. Maybe it will work.
