B18c5teg

Site resiliency, Exchange DAGs w/ 2 members

Hi,
I've got a client that has two offices. Site A is the primary location, where approx. 140 employees work. Site B has approx. 25 employees. There is currently a single Exchange 2010 server at Site A. The client wants a site-resilient setup wherein, if Site A goes offline (e.g., a power outage), Exchange is still operational: users with Outlook clients at Site B and Internet users via OWA can still access email, and users from Site A can travel to Site B and access email from their laptops. There are 4 total mailbox databases on the current single Exchange server, which holds all roles.

They have a couple of extra servers that aren't being used. We had preliminarily thought of creating a 2-member DAG in which both Exchange servers would hold the Mailbox, HT, and CAS roles. A witness server would be one of the servers at Site A (I think). The servers have Windows Server 2008 R2. I'm unsure, but I think Exchange is of the Enterprise flavor.

We are going to discuss using VMware for high availability, but using the built-in features of DAGs should help in the interim. We can also choose to retain the DAGs when/if VMware is employed.

A few questions:
1) Can a site resilient setup be accomplished by using only 2 Exchange servers and a single witness server?
2) If so, during failover, how can we set up Internet DNS records to allow users to still access the same OWA page they're used to, for example, webmail.company.com?
3) Same assumption, but could users from Site A drive to Site B (the "failover" site) and connect to Exchange via Outlook? All users, BTW, use Outlook 2010.
4) If a 2-member DAG won't work for site resiliency, how many servers are needed?
5) Is failover automatic with DAGs during an outage at Site A, the primary site? How does Exchange know which is the primary site? Is failback automatic? If none of it is automatic, can you point me in the direction of online info for educational purposes?

I know these are a lot of questions to answer and I very much appreciate the help I can get!!
ASKER CERTIFIED SOLUTION
Jamie McKillop
To run a DAG, you need Windows Server Enterprise; you do not need Exchange Enterprise.

Yes, all you need is two servers for Exchange and a domain-joined server that is not a domain controller to hold the witness share. The witness does not participate in the cluster; it just holds the share that the two servers use to establish quorum.

You need sufficient bandwidth between the two sites for the replication traffic. Latency is also a key factor here.

To have the capability of transparent failover, you will need to create a CAS array between the two CAS roles. A CAS array needs to have both servers in the same AD site.

When using a CAS array, you also need to have a load balancer. I frequently use an open source load balancer called HAProxy, but there are also a multitude of commercial offerings available.

To have site resiliency, you need a load balancer at both sites and "active" DNS from a DNS hosting provider. Active DNS is where the DNS provider polls the hostname (hopefully using an HTTPS connection rather than just ICMP) and, if there is no response, changes the A record for the hostname to a pre-arranged alternate IP address.
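The "active DNS" decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular DNS provider implements it: the function names, the three-failure threshold, and the two IP addresses (taken from the documentation ranges) are all assumptions made for the example.

```python
from typing import Sequence


def consecutive_failures(results: Sequence[bool]) -> int:
    """Count how many of the most recent health probes failed in a row."""
    count = 0
    for ok in reversed(results):
        if ok:
            break
        count += 1
    return count


def choose_a_record(results: Sequence[bool],
                    primary_ip: str,
                    alternate_ip: str,
                    threshold: int = 3) -> str:
    """Return the IP the A record should point at.

    `results` holds probe outcomes, oldest first (True = the HTTPS probe
    got a response). The threshold of 3 is an assumed value that avoids
    flipping DNS on a single dropped probe.
    """
    if consecutive_failures(results) >= threshold:
        return alternate_ip
    return primary_ip


# Hypothetical addresses for Site A (primary) and Site B (alternate).
SITE_A = "203.0.113.10"
SITE_B = "198.51.100.10"

print(choose_a_record([True, True, False], SITE_A, SITE_B))
print(choose_a_record([True, False, False, False], SITE_A, SITE_B))
```

Note that this sketch points the record back at the primary as soon as one probe succeeds. Whether a real provider fails back automatically, and what TTL the record carries, depends on the service, so treat both as things to confirm with the provider.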

If you lost the entire site, you would need to bring up a new witness server at the second site before Exchange at the second site would become live. This is because it needs a quorum to ensure that it is not operating with a "split brain".

Failover can be automatic; switching back is manual.
B18c5teg

ASKER

Great comments here.

Is it possible to have a witness server at both locations, or just at the primary?
You can have a secondary witness server, but it has to be manually activated.
SOLUTION
If both servers are in the same AD site, failover is automatic. However, if that AD site spans physical sites and you don't have "private" inter-site connectivity, then you need external active DNS.
SOLUTION
Just to clarify:

The members of a CAS array have to be in the same AD site.

An AD site can span multiple physical sites.
Absolutely, just to clarify the CAS array setup I outlined above. If you are only going to have a single CAS at each site for starters, you should still create a CAS array to future-proof yourself for adding additional CAS servers, or to repoint the CAS array to the other site should the CAS at Site A/B fall over.

Providing that you have Exch 2010 SP2 RU3 plus on your Exchange boxes then Outlook should adjust itself automatically to the CAS array when you move the mailboxes.

Site A

CAS array - sitea.abc.local
CAS array IP - 10.10.10.1
CAS server - cas-sitea.abc.local
CAS server IP - 10.10.10.1

Site B

CAS array - siteb.abc.local
CAS array IP - 10.10.20.1
CAS server - cas-siteb.abc.local
CAS server IP - 10.10.20.1

Also worth noting that the CAS array name does not need to be on your SSL cert.
You need to be aware that active/active DAG setups are NOT link-failure proof.

What I mean is that if the link between the two sites goes down, Site B will go down no matter what.
That might not be a problem. Users at Site B will access their mailboxes via the Exchange box at Site A. Maybe I should do an active/passive setup instead of active/active?
You can have them active/active if you want. I just wanted to clarify that, in case of a link failure, the Site B Exchange server will go down, so users won't have access to their email anymore, no matter whether their mailboxes were active in Site A or Site B.

other than that I guess all the questions were answered
My understanding is that if you have an active/active DAG with FSW and Alternate FSW with the preferred first server set at each site then if the link between sites goes down each site has majority vote in its own site for its quorum.

Therefore Site A and site A databases stay up with the passive copies offline for site B, and site B and its databases will be online and the passive copies of site A will remain passive.  
No split brain

As both sites are internet facing they will continue to send mail independently and will resync once the connection comes back.
No split brain

If you lose the connection to just the mailbox server in site A then the site B database will come up as the FSW will initiate the switchover.  You can then manually switchback when the mailbox server comes back....
No, DLeaver, this is not the way it works. The alternate witness share is only used in a datacenter switchover and has to be manually activated.

If the link goes down, the site without the witness share (the primary witness share, that is) will lose quorum and its databases will be dismounted no matter what. A DAG does not take link failure into consideration.
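The quorum behaviour described here can be illustrated with a small vote count. This is a simplified sketch of majority voting, assuming a two-member DAG plus one file share witness located at Site A (three votes in total); the function names are made up for the example and are not Exchange or cluster APIs.

```python
def partition_votes(members_in_partition: int, can_reach_witness: bool) -> int:
    """Votes visible to one side of a network partition."""
    return members_in_partition + (1 if can_reach_witness else 0)


def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return reachable_votes > total_votes // 2


TOTAL_VOTES = 3  # two DAG members plus the file share witness

# The WAN link fails; the witness is at Site A in this assumed layout.
site_a = partition_votes(1, can_reach_witness=True)
site_b = partition_votes(1, can_reach_witness=False)

print(has_quorum(site_a, TOTAL_VOTES))  # True: 2 of 3 votes
print(has_quorum(site_b, TOTAL_VOTES))  # False: 1 of 3 votes
```

The same arithmetic covers the loss of all of Site A: Site B is left with one vote out of three, which is why a manual switchover to an alternate witness is needed before its databases can mount.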
Fair enough, I stand corrected on site link outage

I think the decision on whether you choose an active/active is based on whether both sites are internet facing and the distribution of the mailboxes.  In an active/passive setup then all of the users in site B will be accessing their MB's and sending/receiving from Site A

Consider starting with active/passive, as you can move to active/active at a later date if user numbers in that site increase.
Active/passive is a better setup and here is why:

If you are using active/active and your WAN link goes down, the databases at Site B will dismount and you will need to initiate a manual site failover. This will result in some downtime for users at Site B. If you are using active/passive and your WAN link goes down, there is no downtime for users at Site B, assuming your Internet line is still up.

Exchange 2007/2010 was designed with the idea of consolidating Exchange servers in a central data center. There is no longer a need to have Exchange servers at each site, especially when using cached mode. You don't really gain anything with an active/active setup but you do lose some fault tolerance.


JJ
Active/active or active/passive is exactly the same in terms of user experience in this case. In both cases, users in Site B will go down.

1. Chances are that the Site A to Site B connection is a VPN over the Internet anyway, so the "link" is effectively the same as the VPN.
2. Assuming it is not, and the link goes down, users will be disconnected since they were connected over the VPN, and they will need to switch to Outlook Anywhere.
Just to go back to the original question: as you have four existing databases, on Exchange Standard you will be limited to 5 databases, which includes any passive copies. Something to consider if you were to stay with Standard.

Active/passive - you will be fine as the database numbers will be equal
Active/active - you will have to move two of the databases to be active at Site B in order to ensure you don't go over the database limit...
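To check a planned layout against the 5-database limit mentioned above, one option is to count the copies each server would hold. The sketch below assumes the limit applies per server and counts active and passive copies alike, as the post states; the server names and database names are hypothetical placeholders.

```python
from collections import Counter

STANDARD_LIMIT = 5  # assumed per-server limit, as quoted in this thread


def copies_per_server(layout: dict) -> dict:
    """Count database copies (active or passive) held by each server.

    `layout` maps a database name to the servers that hold a copy of it.
    """
    counts = Counter()
    for servers in layout.values():
        for server in set(servers):
            counts[server] += 1
    return dict(counts)


# Four existing databases, each with one copy at Site A and one at Site B.
databases = ["Staff1", "Staff2", "Management", "Owners"]
layout = {db: ["EXCH-A", "EXCH-B"] for db in databases}

counts = copies_per_server(layout)
over_limit = {s: n for s, n in counts.items() if n > STANDARD_LIMIT}

print(counts)
print(over_limit)
```

Under these assumptions each server holds four copies. The count does not record which copy is active, so the active/passive versus active/active choice would have to be checked separately against whatever the licensing documentation actually says.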
The link between the sites is a 30Mbps MPLS connection.

Right now, since there's only a single Exchange box, users in Site B (the smaller site, and what is going to be designated the "failover" site) access their mailboxes from the Exchange box at Site A.

If we're limited to 5 database copies by using Exchange 2010 Standard Edition, and we've got 4 now (2 for staff, 1 for management, 1 for owners) I suppose we'll either have to consolidate the 4 DBs into fewer - maybe 2 OR utilize Exchange Enterprise Edition. If in the case of the latter, I'm thinking we'll need to install the new secondary Exchange 2010 box using Enterprise Edition, move all databases to this server, then reinstall Exchange Enterprise on the first server, moving all databases back afterward. Maybe I'm off base on that.

If I'm reading the above posts correctly and have understood Technet articles correctly, can I keep the 4 databases on the Exchange server at Site A, keep all users accessing their mailboxes on this server, then create the DAGs so that Site B's Exchange server is a passive copy, only activated if Site A goes offline for an extended period? If this is a manual process of switching over, activating a new Witness Server, pointing external DNS records to failover site, then I'm OK with that - it'll become part of our D.R. process.

If what I've outlined is not reasonable or possible, then maybe we should consider changing the way users access their mailboxes and create site-specific (office location-specific, that is) mailbox databases instead of the way it's set up now, by functional level of employee. Then have each server at Site A and Site B hold a passive copy of the other's mailboxes.
1) You can upgrade from Exchange 2010 Standard to Enterprise by simply changing the key if you want.

2) Yes, you've got it pretty much right. You will need to do a datacenter switchover if you want to bring Exchange up in Site B.
You need the enterprise version of Windows, so you will have to rebuild.

JJ
JJ, we've got Windows Enterprise OS, but I think Exchange Standard.
Doesn't matter, you can upgrade to Enterprise if you have the key.
Upgrading the OS after Exchange has been installed is not supported.

JJ
I am talking about upgrading Exchange Standard to Enterprise.