Robert Hall asked:

Split Brain/Horizon DNS Headache

We have what I have seen referred to as a "Split Brain" DNS configuration.  Brief details as follows:

SERVER52.D1.COM      W2K8 R2 PDC (DNS server to 192.168.0.0/16). IP: 192.168.5.2/16
SERVER53.D1.COM      W2K8 R2 DC (DNS server to 192.168.0.0/16). IP: 192.168.5.3/16

SERVER55.D1.COM      W2K8 R2 Public-facing DNS Server. IP: 192.168.5.5/16 (MIP: 202.xxx.yyy.125/26)

OSNS1            Off-site secondary DNS Server
OSNS2            Off-site secondary DNS Server
OSNS3            Off-site secondary DNS Server

WWW.D2.COM            Web Server.  IP: 192.168.1.197/16 (MIP: 202.xxx.yyy.70/26)
MAIL01.D2.COM      Mail Server.  IP: 192.168.0.189/16 (MIP: 202.xxx.yyy.76/26)
WEBMAIL.D2.COM      Web mail portal.  IP: 192.168.0.191/16 (MIP: 202.xxx.yyy.100/26)

All on-site DNS servers are running MS DNS included with W2K8 R2.

SERVER55 is a domain member.  Its TCP/IP DNS server settings point to SERVER52 and SERVER53.

SERVER55's DNS Server is configured with four Primary (non-AD integrated) forward lookup zones (D1, D2, D3, D4.com).  Each zone contains NS records for SERVER55 and the three off-site secondary servers.  Zone transfers are allowed 'to all name servers listed on the name server tab'.  Notifications are sent to the same list.

If I increment the serial number for a zone on SERVER55, the change is replicated successfully to the three off-site servers.  This tells me notifications and zone transfers are working properly.

All of the A records defined in the zones on SERVER55 use 202.xxx.yyy.zzz/26 IPs.  All of the A records defined in the machine domains on the two DCs use 192.168.0.0/16 IPs.

SERVER55's public IP is MIPed by the firewall to 192.168.5.5/16. A policy permits DNS traffic from Untrust to Trust. Queries are received, processed and returned.

I hope I have provided the correct level of detail above.  Drum roll please....

The Problem

I use DNS Stuff's professional toolkit (which does not use cached data) for troubleshooting. If I perform an A lookup on www.D2.COM (or mail01 or webmail) against any of our off-site DNS servers, I receive 100% correct results (202.xxx.yyy.zzz IPs).  If I use the MMC DNS snap-in and the 'Run NSLOOKUP' option against SERVER55 for www.D2.COM (or mail01 or webmail), the IPs returned are also the correct 202.xxx.yyy.zzz IPs.

If, however, I run the DNS Stuff queries against SERVER55, 100% of the time I receive the 192.168.0.0/16 address that is defined in the zone of the corresponding domain name on SERVER52 and SERVER53.

Examples:

DNS Stuff Query to OSNS2            www.D2.COM            Results:            202.xxx.yyy.70
DNS Stuff Query to OSNS3            mail01.D2.COM            Results:            202.xxx.yyy.76
DNS Stuff Query to OSNS1            webmail.D2.COM      Results:            202.xxx.yyy.100

NSLOOKUP Query On SERVER55            www.D2.COM            Results:            202.xxx.yyy.70
NSLOOKUP Query On SERVER55            mail01.D2.COM            Results:            202.xxx.yyy.76
NSLOOKUP Query On SERVER55            webmail.D2.COM      Results:            202.xxx.yyy.100
(Above results are as expected)

DNS Stuff Query to SERVER55      www.D2.COM            Results:            192.168.1.197
DNS Stuff Query to SERVER55      mail01.D2.COM            Results:            192.168.0.189
DNS Stuff Query to SERVER55      webmail.D2.COM      Results:            192.168.0.191
(Not what we're looking for!)
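The check being done by eye in the examples above can be sketched in a few lines of Python. This is only an illustration: the function name `leaks_internal` is made up for this sketch, and the answers are copied from the DNS Stuff results listed above.

```python
import ipaddress

# Internal range taken from the post; anything outside it is treated as public here.
INTERNAL_NET = ipaddress.ip_network("192.168.0.0/16")


def leaks_internal(address: str) -> bool:
    """Return True if the address lies inside the internal 192.168.0.0/16 range."""
    return ipaddress.ip_address(address) in INTERNAL_NET


# Answers reported for the DNS Stuff queries sent to SERVER55.
server55_answers = {
    "www.D2.COM": "192.168.1.197",
    "mail01.D2.COM": "192.168.0.189",
    "webmail.D2.COM": "192.168.0.191",
}

# Names whose public answer exposes an internal address.
leaked = sorted(name for name, ip in server55_answers.items() if leaks_internal(ip))
print(leaked)
```

The range is tested explicitly rather than with `ip_address.is_private`, because `is_private` also flags some non-RFC 1918 ranges.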

The only thing that I can think of is that the results obtained by SERVER55's DNS Client (when querying SERVER52 and SERVER53) are being used in the responses sent by SERVER55 rather than using the data contained in SERVER55's DNS Zones.  It has always been my (perhaps wrong!) understanding that the local primary zone files are all-powerful and take precedence over the results returned by the client service on the machine.
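One way to test this hypothesis is to send a query straight to SERVER55 from a script, so that no local resolver or client cache is involved, and to run it from both sides of the firewall. The following is a minimal sketch using only the Python standard library and the RFC 1035 wire format; the function names are made up for this example, and it handles only simple A-record answers over UDP.

```python
import socket
import struct


def build_a_query(name, query_id=0x1234, recursion_desired=False):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    flags = 0x0100 if recursion_desired else 0x0000  # RD bit only
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def skip_name(packet, offset):
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = packet[offset]
        if length & 0xC0 == 0xC0:  # compression pointer: two bytes, ends the name
            return offset + 2
        if length == 0:  # root label: end of name
            return offset + 1
        offset += 1 + length


def parse_a_answers(packet):
    """Extract the IPv4 addresses from the answer section of a response."""
    qdcount, ancount = struct.unpack("!HH", packet[4:8])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, rclass, _ttl, rdlength = struct.unpack("!HHIH", packet[offset:offset + 10])
        offset += 10
        if rtype == 1 and rclass == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return addresses


def query_server(server_ip, name, timeout=3.0):
    """Send one A query directly to server_ip and return the addresses it answers with."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_a_query(name), (server_ip, 53))
        response, _ = sock.recvfrom(512)
    return parse_a_answers(response)


# Example (needs network access to the server):
#   query_server("192.168.5.5", "www.D2.COM")   # from inside, SERVER55's internal IP
```

If the answer differs depending on whether the script runs inside or outside the firewall, the zone data on SERVER55 is probably not what is changing the result.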

Fortunately the public-facing DNS server zone data is quite static. To overcome the issue of private IP addresses being returned when the query is routed to SERVER55, I have stopped the DNS Server service on that machine.  Timeouts are being experienced, but at least the public is being given IPs they can access.

Is it true a W2K8 R2 machine will use DNS Client results in a DNS Server response, even if the name appears in a local DNS primary zone?  Can this priority be changed?  If so, references to knowledgebase articles or similar would be much appreciated.

Thank you for reading!  I'm happy to provide as much further detail as I can.

Regards,
Rob
Ugo Mena

Your public-facing DNS server (SERVER55) should not have your internal DNS servers (SERVER52 and SERVER53) listed as its DNS servers.
Your public-facing DNS server should have your ISP's DNS servers listed as its primary and secondary DNS (essentially its forwarders).
Your internal DNS servers should list themselves first, then each other, and be set to forward requests recursively on to your external-facing DNS server.
Robert Hall (ASKER)

If I rephrase your reply, you are saying that SERVER55 cannot be a domain member.  Is that correct?

If I change the DNS server entries in SERVER55's TCP/IP settings to point to our ISP's DNS servers, it will not be able to resolve the IPs of our DCs, as external DNS servers have no knowledge of our DCs' internal IPs.

Recursion is disabled on SERVER55.  We do not wish to provide an open DNS server to the public.  SERVER55 should only respond to queries for the four domain names it is hosting.

Is there another solution?
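The intended behaviour described above (answer only for the four hosted zones, no recursion for anything else) can be sketched as a simple name check. This is illustrative only: `is_hosted` is a hypothetical helper, not something the Windows DNS Server exposes, and the zone list is taken from the original post.

```python
# The four zones hosted on SERVER55, per the original post.
HOSTED_ZONES = ("d1.com", "d2.com", "d3.com", "d4.com")


def is_hosted(query_name: str) -> bool:
    """Return True if query_name is one of the hosted zones or a name inside one."""
    name = query_name.rstrip(".").lower()  # DNS names are case-insensitive
    return any(name == zone or name.endswith("." + zone) for zone in HOSTED_ZONES)


print(is_hosted("www.D2.COM"))   # inside a hosted zone
print(is_hosted("example.org"))  # would need recursion, which is disabled
```

Matching on `"." + zone` rather than a bare suffix keeps a name such as `notd1.com` from being mistaken for part of `d1.com`.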
ASKER CERTIFIED SOLUTION
Bruno PACI

(This solution is only available to members of Experts Exchange.)
I would recommend separating your internal and external DNS servers.  In my environment, the external and internal domain names are separate (e.g. external: domain.com, internal: company.local).  Internally, our external domain.com zone is also hosted in our internal AD DNS, with internal (DMZ) DNS names and internal IP addresses.

All internal systems point to internal DNS servers, which forward to external ISP DNS servers for any external DNS resolutions.

The entries on our external (SOA) DNS server also have a 30-second TTL (vs. the typical 7200 seconds), so cached copies expire quickly and changes propagate across the Internet faster.
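As a rough illustration of that trade-off, the sketch below compares the two TTL values mentioned above. The function names are made up for this example, and it assumes resolvers honour the TTL exactly, which real caches do not always do.

```python
SECONDS_PER_DAY = 86_400


def max_stale_seconds(ttl: int) -> int:
    """Longest time a resolver can keep serving the old record after a change.

    A resolver that cached the record just before the change keeps it for the full TTL.
    """
    return ttl


def max_refreshes_per_day(ttl: int) -> int:
    """Upper bound on how often one resolver re-queries a single record per day."""
    return SECONDS_PER_DAY // ttl


for ttl in (30, 7200):
    print(f"TTL {ttl:>4} s: stale for up to {max_stale_seconds(ttl)} s, "
          f"at most {max_refreshes_per_day(ttl)} refreshes per resolver per day")
```

The shorter TTL shrinks the stale window from two hours to half a minute, at the cost of more queries reaching the authoritative servers.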
Sorry, I misread the question.

But I would second PaciB's question: where are you running the DNS Stuff tool from, inside or outside your firewall?

And if you are not separating your internal and external DNS structure, you may want to consider placing SERVER55 in a DMZ to make sure requests stay where they should.