Solved

Split Brain/Horizon DNS Headache

Posted on 2013-01-27
Last Modified: 2013-02-20
We have what I have seen referred to as a "Split Brain" DNS configuration.  Brief details as follows:

SERVER52.D1.COM      W2K8 R2 PDC (DNS server to 192.168.0.0/16). IP: 192.168.5.2/16
SERVER53.D1.COM      W2K8 R2 DC (DNS server to 192.168.0.0/16). IP: 192.168.5.3/16

SERVER55.D1.COM      W2K8 R2 Public-facing DNS Server. IP: 192.168.5.5/16 (MIP: 202.xxx.yyy.125/26)

OSNS1            Off-site secondary DNS Server
OSNS2            Off-site secondary DNS Server
OSNS3            Off-site secondary DNS Server

WWW.D2.COM            Web Server.  IP: 192.168.1.197/16 (MIP: 202.xxx.yyy.70/26)
MAIL01.D2.COM      Mail Server.  IP: 192.168.0.189/16 (MIP: 202.xxx.yyy.76/26)
WEBMAIL.D2.COM      Web mail portal.  IP: 192.168.0.191/16 (MIP: 202.xxx.yyy.100/26)

All on-site DNS servers are running MS DNS included with W2K8 R2.

SERVER55 is a domain member.  Its TCP/IP DNS server settings point to SERVER52 and SERVER53.

SERVER55's DNS Server is configured with four Primary (non-AD integrated) forward lookup zones (D1, D2, D3, D4.com).  Each zone contains NS records for SERVER55 and the three off-site secondary servers.  Zone transfers are allowed 'to all name servers listed on the name server tab'.  Notifications are sent to the same list.

If I increment the serial number for a zone on SERVER55, the change is replicated successfully to the three off-site servers.  This tells me notifications and zone transfers are working properly.

All of the A records defined in the zones on SERVER55 use 202.xxx.yyy.zzz/26 IPs.  All of the A records defined in the machine domains on the two DCs use 192.168.0.0/16 IPs.

SERVER55's public IP is MIPed by the firewall to 192.168.5.5/16. A policy permits DNS traffic from Untrust to Trust. Queries are received, processed and returned.

I hope I have provided the correct level of detail above.  Drum roll please....

The Problem

I use DNS Stuff's professional toolkit (which does not use cached data) for troubleshooting. If I perform an A lookup on www.D2.COM (or mail01 or webmail...) against any of our off-site DNS servers, I receive 100% correct results (202.xxx.yyy.zzz IPs).  If I use the MMC DNS snap-in and the 'Run NSLOOKUP' option against SERVER55 for www.D2.COM (or mail01 or webmail...), the IPs returned are also 202.xxx.yyy.zzz IPs.

If, however, I run the DNS Stuff queries against SERVER55, 100% of the time I receive the 192.168.0.0/16 address that is defined in the zone for the corresponding domain name on SERVER52 and SERVER53.

Examples:

DNS Stuff Query to OSNS2            www.D2.COM            Results:            202.xxx.yyy.70
DNS Stuff Query to OSNS3            mail01.D2.COM            Results:            202.xxx.yyy.76
DNS Stuff Query to OSNS1            webmail.D2.COM      Results:            202.xxx.yyy.100

NSLOOKUP Query On SERVER55            www.D2.COM            Results:            202.xxx.yyy.70
NSLOOKUP Query On SERVER55            mail01.D2.COM            Results:            202.xxx.yyy.76
NSLOOKUP Query On SERVER55            webmail.D2.COM      Results:            202.xxx.yyy.100
(Above results are as expected)

DNS Stuff Query to SERVER55      www.D2.COM            Results:            192.168.1.197
DNS Stuff Query to SERVER55      mail01.D2.COM            Results:            192.168.0.189
DNS Stuff Query to SERVER55      webmail.D2.COM      Results:            192.168.0.191
(Not what we're looking for!)
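
For anyone repeating this comparison, the check can be scripted. Below is a minimal Python sketch, not part of the original setup, that flags any answer falling inside the internal 192.168.0.0/16 range. The function names are made up for illustration, and 203.0.113.70 is a documentation-range placeholder standing in for the masked public address.

```python
import ipaddress

# Internal address space as described in the question.
INTERNAL_NET = ipaddress.ip_network("192.168.0.0/16")


def leaks_internal_address(answer: str) -> bool:
    """Return True if an A-record answer lies inside the internal network."""
    return ipaddress.ip_address(answer) in INTERNAL_NET


def audit(results: dict[str, str]) -> list[str]:
    """Return the names whose answers expose an internal address."""
    return [name for name, addr in results.items() if leaks_internal_address(addr)]


# Answers reported above for DNS Stuff queries sent to SERVER55.
from_server55 = {
    "www.D2.COM": "192.168.1.197",
    "mail01.D2.COM": "192.168.0.189",
    "webmail.D2.COM": "192.168.0.191",
}

# Placeholder only: 203.0.113.0/24 is reserved for documentation and stands in
# for the masked 202.xxx.yyy.70 address returned by the off-site servers.
from_offsite = {"www.D2.COM": "203.0.113.70"}

print(audit(from_server55))  # names answered with internal addresses
print(audit(from_offsite))   # expected to be empty for a healthy public view
```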

The only thing that I can think of is that the results obtained by SERVER55's DNS Client (when querying SERVER52 and SERVER53) are being used in the responses sent by SERVER55 rather than using the data contained in SERVER55's DNS Zones.  It has always been my (perhaps wrong!) understanding that the local primary zone files are all-powerful and take precedence over the results returned by the client service on the machine.

Fortunately the public-facing DNS server zone data is quite static. To overcome the issue of private IP addresses being returned when the query is routed to SERVER55, I have stopped the DNS Server service on that machine.  Timeouts are being experienced, but at least the public is being given IPs they can access.

Is it true a W2K8 R2 machine will use DNS Client results in a DNS Server response, even if the name appears in a local DNS primary zone?  Can this priority be changed?  If so, references to knowledgebase articles or similar would be much appreciated.

Thank you for reading!  I'm happy to provide as much further detail as I can.

Regards,
Rob
Question by:Robert Hall
5 Comments
 
LVL 13

Expert Comment

by:Ugo Mena
Your public-facing DNS server (SERVER55) should not have your internal DNS servers (SERVER52 and SERVER53) listed as its DNS servers.
Your public-facing DNS server should have your ISP's DNS servers listed as its primary and secondary DNS (essentially its forwarders).
Your internal DNS servers should list themselves first, then each other, and be set to forward requests recursively on to your externally facing DNS.
 
LVL 1

Author Comment

by:Robert Hall
If I rephrase your reply, you are saying that SERVER55 cannot be a domain member.  Is that correct?

If I change the DNS servers in SERVER55's TCP/IP settings to point to our ISP's DNS servers, it will not be able to resolve the IPs of our DCs, as external DNS servers have no knowledge of our DCs' internal IPs.

Recursion is disabled on SERVER55.  We do not wish to provide an open DNS server to the public.  SERVER55 should only respond to queries for the four domain names it is hosting.

Is there another solution?
 
LVL 16

Accepted Solution

by:PaciB (earned 500 total points)
Hi,

As far as I know, the DNS Server service DOES NOT take the DNS client cache into account.
So I strongly doubt that your issue is related to the DNS client on SERVER55.
If such behavior existed, I think I would already have noticed it in some customer's Active Directory architecture.


By the way, when you use NSLOOKUP to query SERVER55, it is exactly the same as any other DNS client querying SERVER55, so whichever DNS client you use, you should get the same answer.

Can you tell us exactly where you are making your DNS queries from?

When you say "NSLOOKUP on SERVER55", can you specify where (on which machine) you launch NSLOOKUP and exactly what you typed in the NSLOOKUP console?
When you say "DNS Stuff Query to SERVER55", can you specify where you launch the tool from?


Did you look at the DNS server cache on SERVER55 to make sure it does not contain any bad information?
Can you verify that the DNS server on SERVER55 has no alternative resolution mode enabled (WINS lookup should be disabled in the DNS server properties)?
Can you confirm that the root hints on SERVER55 do not contain any bad information?

Can you try using NSLOOKUP to query SERVER55 from an external computer and look at the answer? (Please run SET DEBUG before the query to get verbose output.)

How can you be sure that the bad answer comes from SERVER55? Is there any chance that a firewall misconfiguration redirects DNS queries to an internal DNS server?
To check this easily: create a brand new A record named "test" in your DNS zone on SERVER55, pointing to any IP address (whatever you want, it does not need to exist), wait a few minutes, then query for this record, test.d2.com. Does it resolve?
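
If you prefer a script over NSLOOKUP for this test, the following is a minimal Python sketch using only the standard library. It sends a non-recursive A query straight to one server and returns the authoritative-answer flag plus the addresses. The function names are mine, and the parser only handles the simple case (A records in the answer section, UDP, no EDNS), so treat it as an illustration rather than a full resolver.

```python
import socket
import struct


def build_query(qname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS A query with the recursion-desired (RD) bit cleared."""
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in qname.rstrip(".").split(".")
    )
    # Terminate the name, then QTYPE=1 (A) and QCLASS=1 (IN).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def skip_name(packet: bytes, offset: int) -> int:
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = packet[offset]
        if length & 0xC0 == 0xC0:  # compression pointer: two bytes, ends the name
            return offset + 2
        if length == 0:            # root label ends the name
            return offset + 1
        offset += 1 + length


def parse_a_records(packet: bytes) -> tuple[bool, list[str]]:
    """Return (authoritative-answer flag, IPv4 addresses in the answer section)."""
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record carrying an IPv4 address
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return bool(flags & 0x0400), addresses  # 0x0400 is the AA bit


def query_server(server_ip: str, qname: str, timeout: float = 3.0):
    """Send the query over UDP to server_ip and parse the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(qname), (server_ip, 53))
        packet, _ = sock.recvfrom(512)
    return parse_a_records(packet)
```

Call query_server() from a machine outside the firewall with SERVER55's public address and "test.d2.com". If the test record's address comes back, the query really reached SERVER55. If nothing comes back, something else is answering. Note that the AA flag alone does not tell you which server answered, since the internal DCs also hold a zone for the same name.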

Have a good day.
 
LVL 8

Expert Comment

by:gsmartin
I would recommend separating your internal and external DNS servers.  In my environment my external and internal domain  names are separate (e.g.  Ext: domain.com vs Int: company.local).   Then internally I have our external domain.com zone listed in our internal AD DNS with internal (DMZ) DNS names and internal IP addresses.  

All internal systems point to internal DNS servers, which forward to external ISP DNS servers for any external DNS resolutions.

The records on our external authoritative (SOA) DNS server also have a 30-second TTL (vs. the typical 7200 seconds), so cached copies expire quickly and changes propagate across the Internet faster.
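
As a rough illustration of that TTL difference, here is a small Python sketch of the worst case: a resolver that cached the old record just before the change keeps serving it until the TTL runs out. The change time is made up, and this assumes resolvers honor the TTL.

```python
from datetime import datetime, timedelta


def latest_stale_answer(change_time: datetime, ttl_seconds: int) -> datetime:
    """Latest moment a TTL-honoring cache could still serve the old record."""
    return change_time + timedelta(seconds=ttl_seconds)


change = datetime(2013, 2, 1, 12, 0, 0)  # hypothetical time of the DNS change

print(latest_stale_answer(change, 30))    # 30-second TTL
print(latest_stale_answer(change, 7200))  # 7200-second (2-hour) TTL
```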
 
LVL 13

Expert Comment

by:Ugo Mena
Sorry, I misread the question.

But I would second PaciB's question about where you are running the DNS Stuff tool from: internal or external to your firewall?

And if you are not separating your internal and external DNS structure, you may want to consider placing SERVER55 in a DMZ to make sure requests stay where they should.
