Compuit1

RPC server is not available, Naming context specified for this replication operation is invalid

The server 2013 network is made up of three servers: SQL01, EXCH01 and TS01.

SQL01 is a DC
EXCH01 is also a DC and runs Exchange 2013 (not recommended)
TS01 is a Terminal Server

After a period the users lose connection with EXCH01. We know this because Outlook 2013 complains that Exchange has disappeared.
When this occurs, users who attempt to log on receive an error saying that the specified domain does not exist or could not be contacted. See the attached notes: Naming-context-specified-for-thi.pdf

How can this be fixed? When EXCH01 has this issue or is offline, users receive a message that there is no domain controller available. What is SQL01 doing? Why will it not authenticate users when EXCH01 is down?

The AD replication is broken, but is it broken on SQL01 or EXCH01?
Mahesh

Are SQL01 and Exch01 both in the same AD site?

OR

Have you specified only Exch01 as the configuration DC and GC for Exchange?

Have you tried manually triggering replication between both domain controllers in AD Sites and Services?
You can delete the AD connection objects in AD Sites and Services and recreate them as necessary,
then check if it works.

Mahesh
Compuit1

ASKER

Are SQL01 and Exch01 both in the same AD site? - Yes, same subnet. They can ping each other and can get to the NETLOGON and SYSVOL shares OK.

OR

Have you specified only Exch01 as the configuration DC and GC for Exchange? - No. SQL01 appears to hold all FSMO roles, but the strange thing is that if I shut down EXCH01 no one can log on.

Have you tried manually triggering replication between both domain controllers in AD Sites and Services? - I will need guidance with this.

You can delete the AD connection objects in AD Sites and Services and recreate them as necessary - I will need guidance with this.
Hi Mahesh,

I ran dcdiag on SQL01 and the only failure was:

Starting test: DFSREvent
     There are warning or error events within the last 24 hours after the SYSVOL has been shared.  Failing SYSVOL replication problems may cause Group Policy problems.
     ......................... SQL01 failed test DFSREvent

EXCH01 gave a similar result:
  Starting test: DFSREvent
     There are warning or error events within the last 24 hours after the SYSVOL has been shared.  Failing SYSVOL replication problems may cause Group Policy problems.
     ......................... EXCH01 failed test DFSREvent
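
When dcdiag output gets long, the failed tests can be pulled out mechanically. The following is a minimal Python sketch, assuming the summary lines always have the shape shown in the output above (a run of dots, the server name, "passed" or "failed", then the test name). The function name `failed_tests` is made up for illustration and is not part of dcdiag.

```python
import re

# Matches dcdiag summary lines such as
# "......................... SQL01 failed test DFSREvent"
RESULT_LINE = re.compile(r"^\s*\.{5,}\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)\s*$")


def failed_tests(dcdiag_output):
    """Return (server, test) pairs for every test reported as failed."""
    failures = []
    for line in dcdiag_output.splitlines():
        match = RESULT_LINE.match(line)
        if match and match.group(2) == "failed":
            failures.append((match.group(1), match.group(3)))
    return failures


# Sample lines copied from the output posted in this thread.
sample = """\
Starting test: DFSREvent
     ......................... SQL01 failed test DFSREvent
         ......................... mdmdmd passed test Intersite
  Starting test: DFSREvent
     ......................... EXCH01 failed test DFSREvent
"""

for server, test in failed_tests(sample):
    print(server, "failed", test)
```

Saving the output of `dcdiag` to a text file on each DC and feeding both files through this gives a quick side-by-side of which server fails which test.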
See the PDF: Sites and Services on EXCH01.

Why does Sites and Services on EXCH01 show the GC as unavailable when I am on the local host EXCH01? This is bizarre.
Sites-and-Services.pdf
Go to AD Sites and Services
Navigate to the site containing both domain controllers
(I guess both DCs are in the same site)
Under each DC you will find NTDS Settings
Under NTDS Settings, in the right-hand pane, you will find the connection objects
Right-click a connection object and click Replicate Now

Please check the article below for more information on checking replication health
http://social.technet.microsoft.com/Forums/en-US/dbfba096-5544-4f3d-8859-ae405c18d489/ad-check-health

Also, can you please post the output of the commands below, run on SQL01
dcdiag /q
repadmin /showrepl

Mahesh
OK here are the results from SQL01: Checking document now.....

         ......................... mdmdmd passed test Intersite
PS C:\Users\Administrator> dcdiag /q
         There are warning or error events within the last 24 hours after the SYSVOL has been shared.  Failing SYSVOL
         replication problems may cause Group Policy problems.
         ......................... SQL01 failed test DFSREvent
PS C:\Users\Administrator> repadmin /showrepl

Repadmin: running command /showrepl against full DC localhost
Default-First-Site-Name\SQL01
DSA Options: IS_GC
Site Options: (none)
DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
DSA invocationID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3

==== INBOUND NEIGHBORS ======================================

DC= mdmdmd,DC=co,DC=nz
    Default-First-Site-Name\EXCH01 via RPC
        DSA object GUID: 705218f0-e43d-4ac9-8a13-58da0f04b00b
        Last attempt @ 2014-03-19 01:34:33 was successful.

CN=Configuration,DC= mdmdmd,DC=co,DC=nz
    Default-First-Site-Name\EXCH01 via RPC
        DSA object GUID: 705218f0-e43d-4ac9-8a13-58da0f04b00b
        Last attempt @ 2014-03-19 01:34:02 was successful.

CN=Schema,CN=Configuration,DC= mdmdmd,DC=co,DC=nz
    Default-First-Site-Name\EXCH01 via RPC
        DSA object GUID: 705218f0-e43d-4ac9-8a13-58da0f04b00b
        Last attempt @ 2014-03-19 00:55:32 was successful.

DC=DomainDnsZones,DC= mdmdmd,DC=co,DC=nz
    Default-First-Site-Name\EXCH01 via RPC
        DSA object GUID: 705218f0-e43d-4ac9-8a13-58da0f04b00b
        Last attempt @ 2014-03-19 00:55:32 was successful.

DC=ForestDnsZones,DC= mdmdmd,DC=co,DC=nz
    Default-First-Site-Name\EXCH01 via RPC
        DSA object GUID: 705218f0-e43d-4ac9-8a13-58da0f04b00b
        Last attempt @ 2014-03-19 00:55:32 was successful.

PS C:\Users\Administrator>
It seems that AD replication from Exch01 to SQL01 is successful.

Could you run the same two commands on Exch01 and post the output here, please?

Also use the Exchange Management Shell, as described in the article below, to identify which domain controller is set for lookups
http://exchangeserverpro.com/how-to-use-a-specific-domain-controller-in-exchange-2010-management-shell/

Mahesh
OK here are the results from EXCH01: I am not certain if you saw the Sites and Services PDF attached earlier - It shows EXCH01 Unavailable.

PS C:\Users\administrator.mdmdmdmd> dcdiag /q
         An error event occurred.  EventID: 0xC0001B77
            Time Generated: 03/19/2014   01:44:32
            Event String:
            The Microsoft Exchange Replication service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 5000 milliseconds: Restart the service.
         An error event occurred.  EventID: 0xC0001B78
            Time Generated: 03/19/2014   01:44:37
            Event String:
            The Service Control Manager tried to take a corrective action (Restart the service) after the unexpected termination of the Microsoft Exchange Replication service, but this action failed with the following error:
         An error event occurred.  EventID: 0xC0001B77
            Time Generated: 03/19/2014   01:56:05
            Event String:
            The Microsoft Exchange RPC Client Access service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 5000 milliseconds: Restart the service.
         An error event occurred.  EventID: 0xC0001B78
            Time Generated: 03/19/2014   01:56:10
            Event String:
            The Service Control Manager tried to take a corrective action (Restart the service) after the unexpected termination of the Microsoft Exchange RPC Client Access service, but this action failed with the following error:
         ......................... EXCH01 failed test SystemLog
PS C:\Users\administrator.mdmdmdmd> repadmin /showrepl

Repadmin: running command /showrepl against full DC localhost
Default-First-Site-Name\EXCH01
DSA Options: IS_GC
Site Options: (none)
DSA object GUID: 705218f0-e43d-4ac9-8a13-58da0f04b00b
DSA invocationID: f1605da3-bfd3-4899-a760-4f9c9d96e49a

==== INBOUND NEIGHBORS ======================================

DC=mdmdmdmd,DC=co,DC=nz
    Default-First-Site-Name\SQL01 via RPC
        DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
        Last attempt @ 2014-03-19 02:02:49 was successful.

CN=Configuration,DC=mdmdmdmd,DC=co,DC=nz
    Default-First-Site-Name\SQL01 via RPC
        DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
        Last attempt @ 2014-03-19 01:57:16 was successful.

CN=Schema,CN=Configuration,DC=mdmdmdmd,DC=co,DC=nz
    Default-First-Site-Name\SQL01 via RPC
        DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
        Last attempt @ 2014-03-19 01:54:45 was successful.

DC=DomainDnsZones,DC=mdmdmdmd,DC=co,DC=nz
    Default-First-Site-Name\SQL01 via RPC
        DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
        Last attempt @ 2014-03-19 01:54:45 was successful.

DC=ForestDnsZones,DC=mdmdmdmd,DC=co,DC=nz
    Default-First-Site-Name\SQL01 via RPC
        DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
        Last attempt @ 2014-03-19 01:54:45 was successful.

PS C:\Users\administrator.mdmdmdmd>
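
For anyone comparing `repadmin /showrepl` output from several DCs, here is a rough Python sketch that reduces the INBOUND NEIGHBORS section to one row per naming context. It assumes the layout shown in the output above. The thread only contains successful attempts, so any line that does not end in "was successful." is simply treated as a failure, which is an assumption. `inbound_status` is a made-up helper name.

```python
import re

TIMESTAMP = re.compile(r"Last attempt @ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def inbound_status(showrepl_output):
    """Return (naming_context, source_dc, timestamp, ok) per inbound neighbor."""
    results = []
    context = source = None
    in_inbound = False
    for raw in showrepl_output.splitlines():
        line = raw.strip()
        if line.startswith("===="):
            # Section banner, e.g. "==== INBOUND NEIGHBORS ====..."
            in_inbound = "INBOUND NEIGHBORS" in line
            continue
        if not in_inbound or not line:
            continue
        if line.startswith(("DC=", "CN=")):
            context = line                    # naming context (partition)
        elif " via " in line:
            source = line.split(" via ")[0]   # site\server the data comes from
        elif line.startswith("Last attempt @"):
            match = TIMESTAMP.search(line)
            when = match.group(1) if match else None
            results.append((context, source, when, line.endswith("was successful.")))
    return results


# Two entries copied from the EXCH01 output in this thread.
sample = r"""
==== INBOUND NEIGHBORS ======================================

DC=mdmdmdmd,DC=co,DC=nz
    Default-First-Site-Name\SQL01 via RPC
        DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
        Last attempt @ 2014-03-19 02:02:49 was successful.

CN=Configuration,DC=mdmdmdmd,DC=co,DC=nz
    Default-First-Site-Name\SQL01 via RPC
        DSA object GUID: e1a3c1b1-3134-44ef-a333-c3a96b9deae3
        Last attempt @ 2014-03-19 01:57:16 was successful.
"""

for row in inbound_status(sample):
    print(row)
```

Every row coming back with `ok` set to True on both DCs only shows that replication itself is running. It says nothing about whether DNS is healthy.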
I saw the PDF.
Replication is OK between both DCs.
I think the issue comes from AD and Exchange being on the same server, and that is why users are getting disconnected.

The possible fix I can see is to separate Exchange from the domain controller
OR
you could log a call with MS Support to isolate and resolve the issue.

Some expert who has already faced this issue may be able to help you.

Mahesh
1. I would like to suggest demoting EXCH01 and removing the DC role from it, since MS does not recommend installing Exchange 2013 on a DC. The problem is that if I take EXCH01 offline the users cannot log on. This is really weird: EXCH01 shows the errors above, yet it seems to be essential, and if it is not online the users are stranded. What is the SQL01 DC doing when it reports that all is good? Should it not serve the logon requests? Given that it holds all the FSMO roles, I would expect users to be able to log on.

2. Another question: given the state of AD, will it be safe to demote EXCH01?

3. Which DC is at fault?
Yes, I have logged a call with MS, but they will take 12.5 hrs to get back to me.

For point 3: which DC is actually at fault? If EXCH01 is shut down no one can log on, and yet SQL01 appears to be in charge and faultless but is not effective.
Mahesh - Thank you for your input. I will look at posting a result here. However, I am very keen to hear from anyone else.
You cannot demote the DC before removing Exchange from it, otherwise it will break Exchange.

You need to uninstall Exchange first and install it on another server or VM.
But if this is your only Exchange server you can't do that directly;
in that case you need to deploy one more Exchange server on a VM or another server first.

I am just wondering what DNS settings you have on the client computers?

Please try the following:

Keep SQL01 as the primary DNS server, remove the secondary address from the client computers' DNS settings, and check whether clients are able to log on.
If yes, then you don't have any issue with SQL01, and AD plus Exchange on the same box is probably the culprit.

Mahesh
OK, we only use a terminal server. On the terminal server (TS01) I removed the DNS entry 10.0.0.9 that pointed to EXCH01. I shut down EXCH01 and could still log on to the terminal server. I tested a few users and it worked. This is a big improvement. Of course the logon takes longer because a few network drives are no longer available, but the users can log on. I will now restart TS01 and see what happens with only SQL01 online.
OK, fine.
With Exch01 shut down or disconnected from the network, on the client workstations go to Run, enter %logonserver%, and check that it resolves to the NetBIOS name of SQL01. Also reset a user's password and check that it takes effect. If both work, you don't have issues with SQL01.

Mahesh
OK I did all with EXCH01 shutdown.

1. The run command %logonserver% gives the explorer window showing the Netlogon and sysvol shares on SQL01 (Good)

2. Went to the SQL01 DC and reset a user's password; when the user attempted to log on, the new password had to be entered (Good)

What is the next step shall I bring the EXCH01 online or are there a few more things to cover off?

Thank you, it appears that we are making headway.
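
The %logonserver% check can also be scripted if it has to be repeated for several users on the terminal server. This is only a sketch: it reads the `LOGONSERVER` environment variable, which Windows sets for the session, and compares it with the DC you expect. The helper names are made up, and a session that logged on with cached credentials may show a stale value, so treat the result as a hint rather than proof.

```python
import os


def logon_server(env=None):
    """Return the server named in LOGONSERVER without the leading backslashes."""
    env = os.environ if env is None else env
    value = env.get("LOGONSERVER", "")
    return value.lstrip("\\") or None


def authenticated_by(expected_dc, env=None):
    """True if the session's logon server matches expected_dc (case-insensitive)."""
    server = logon_server(env)
    return server is not None and server.upper() == expected_dc.upper()


# On a machine without the variable (for example, not Windows) this prints None.
print("Logon server:", logon_server())
print("Authenticated by SQL01:", authenticated_by("SQL01"))
```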
Yes, you are right.
Now you could bring Exch01 online and check if the issue still exists.
Please reboot SQL01 once Exch01 is back online.
After SQL01 has rebooted successfully, ensure that the client DNS points to SQL01 as the preferred entry and that Exch01 is not in the list.
Now try to log on from a client machine and check if it works. If yes,
shut down Exch01 again, reboot the client computer as well, and check whether it can log on successfully.
If this works, then I think your issue is resolved, although you may still have to find the root cause.

Mahesh
I see SQL01 has a DNS entry pointing to EXCH01 as well. Should that be removed, as was done on TS01? Bringing EXCH01 online now.
OK, I checked that the client DNS is not pointing to EXCH01. However, after a period with EXCH01 back online the issue returned, with Exchange not being available to users logged on to TS01. Network drives and Internet are OK.
I lean towards thinking that the cause is Exchange 2013 being installed on a DC, which is not recommended. I will need to investigate how to remove AD from the Exchange box EXCH01.
I did test resetting a user's password from EXCH01 and it seemed to take a long time. Users and Computers stopped responding but eventually completed the reset. I could later log on to TS01 with the new password.

So I suspect EXCH01 will need to be demoted and AD uninstalled.
The only valid option is to deploy one more Exchange server (I assume you have only one), then uninstall Exchange from Exch01, and then demote Exch01 from the DC role.

You may try a B grade call with MS support (they charge on a per-call basis) and they will work with you until it gets resolved.

Mahesh
ASKER CERTIFIED SOLUTION
Compuit1

Good to hear that you have resolved the problem.

Just wanted to understand where the old decommissioned DNS entry had remained.
Was it in the preferred DNS settings on Exch01 and SQL01, in the forwarders, or in the name servers list of the DNS zone?

Mahesh
Need to Allocate some points to Mahesh and add the following:

This appears to be a simple solution yet if DNS is not 100% things can go horribly wrong. The command line testing using "netdom", "repadmin" and "dcdiag" may show things are in order - it will be worth the while to check every component of DNS.

In this case the faulty entry 10.0.0.4 was only in the DNS forwarders for both EXCH01 and SQL01.
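
A stale forwarder like this is easy to miss by eye, so a quick reachability probe may help. The sketch below is Python and makes several assumptions: the forwarder addresses are copied by hand from the DNS console, a TCP connection to port 53 is used as a rough "is anything listening" signal (a server could still fail to resolve names even if the port is open), and the helper names are invented for illustration.

```python
import socket


def tcp_53_open(address, timeout=2.0):
    """Return True if a TCP connection to port 53 on `address` succeeds."""
    try:
        with socket.create_connection((address, 53), timeout=timeout):
            return True
    except OSError:
        return False


def dead_forwarders(forwarders, probe=tcp_53_open):
    """Return the forwarder addresses for which the probe reports no answer."""
    return [ip for ip in forwarders if not probe(ip)]


# Example call (addresses typed in by hand from the Forwarders tab):
#   dead_forwarders(["10.0.0.4"])
# An address that comes back in the list deserves a closer look.
```

The `probe` parameter exists so the check can be swapped for something stricter, such as a real DNS query, without changing the rest of the code.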
Sadly the DNS issue is back and users are not able to log on. The system worked for a few days. On EXCH01 there is a definite error, as follows.
Error ID: 4015
Microsoft-Windows-DNS-Server-Service
The DNS server has encountered a critical error from the Active Directory. Check that the Active Directory is functioning properly. The extended error debug information (which may be empty) is "". The event data contains the error.

I found that from host EXCH01 I could ping the other host names such as sql01 and ts01 but could not ping their fully qualified host name such as sql01.domainname.xx.xx
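
The short-name versus fully qualified name difference can be checked for several hosts at once with a small script. This Python sketch resolves each host both ways through the operating system's resolver. When the short name resolves but the FQDN does not, the short name may be getting answered by something other than DNS (for example NetBIOS broadcast), which would fit a broken DNS zone. The domain `example.test` and the helper names are placeholders, not values from this network.

```python
import socket


def resolve(name, lookup=socket.gethostbyname):
    """Return the IPv4 address for `name`, or None if resolution fails."""
    try:
        return lookup(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def compare_names(host, domain, lookup=socket.gethostbyname):
    """Resolve `host` both as a short name and as host.domain."""
    fqdn = f"{host}.{domain}"
    return {host: resolve(host, lookup), fqdn: resolve(fqdn, lookup)}


# Example call (replace example.test with the real AD domain name):
#   for host in ("sql01", "ts01"):
#       print(compare_names(host, "example.test"))
```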

DNS / AD on EXCH01 seems broken. I would like to work from the ground up in troubleshooting this DNS / AD issue, please.
The issue is back, so the solution I provided is not adequate.
It would be better to log a call with Microsoft, as I think there is a problem with core functionality that is causing your communication to break.
OK, the issue has been resolved and the site is now running well. The root problem is DNS, and here is what I did in desperation.
On EXCH01 (the Exchange server), back up email.
Uninstall Exchange 2013 from EXCH01 using adsiedit.msc.
Demote EXCH01 from the DC role (no more GC).
Check that EXCH01 is OK, restart it, etc.
No more user logon issues to the domain.

Build a new Win2012 server, EXCH02.
Install and configure Exchange 2013 on EXCH02.
Restore email to EXCH02.
From TS01, attempt to connect Outlook 2013 to EXCH02. Failed: Outlook 2013 was found to be broken. Repaired it and it connected OK.
Outlook connection was inconsistent: a user would log on and connect OK, then log off, try to open Outlook again and fail (unable to open folders).
Determined that DNS resolution of the new EXCH02 was not consistent.
Edited the hosts file on the TS01 terminal server to add an entry for EXCH02, and the problem went away.
No more Outlook issues.
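
For reference, the hosts file workaround in the steps above amounts to one extra line in `C:\Windows\System32\drivers\etc\hosts` on TS01. The Python sketch below shows one way to add such a line without duplicating it if it is already there. The address `192.0.2.10` is a documentation placeholder because the real IP of EXCH02 is not given in this thread, and `add_hosts_entry` is a made-up helper that works on the file's text rather than editing the file itself.

```python
def add_hosts_entry(hosts_text, ip, *names):
    """Return hosts-file text with "ip names..." appended.

    If any of the names is already mapped on a non-comment line,
    the text is returned unchanged.
    """
    wanted = {name.lower() for name in names}
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and wanted & {f.lower() for f in fields[1:]}:
            return hosts_text  # already present, leave the file alone
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip}\t{' '.join(names)}\n"


current = "127.0.0.1 localhost\n"
updated = add_hosts_entry(current, "192.0.2.10", "exch02")
print(updated, end="")
```

Keep in mind that a hosts entry silently overrides DNS, so it needs to be removed or updated if EXCH02 ever changes address.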

The underlying DNS issue will still need to be resolved, but the hosts file entry, while not ideal, offers some grace.
This appears to be a simple solution, yet if DNS is not 100% things can go horribly wrong. The command line testing using "netdom", "repadmin" and "dcdiag" may show things are in order, but it is worth the while to check every component of DNS.