brianounsted

asked on

RPC service unavailable Server 2003/2008

I have two domain controllers, DC1 and DC2.  DC2 is a new DC and is having problems becoming a full DC because it cannot replicate through the File Replication Service.  DC1 is a MS server 2008 + DNS and DC2 is a virtual MS server 2003 + DNS.  Using net share on DC2 shows that SYSVOL and NETLOGON shares are missing.
The Event Viewer, File Replication Service, on DC2, indicates a 13508 error every hour.
I have tried the following on both DCs:
netdiag /fix    -no errors except domain controller failure on DC2
dcdiag /test:frsevent    -error is, DC2 failed test frsevent
ntfrsutl  version DC1 <FQDN>  and then DC2 <FQDN>    -both seem to work OK
ntfrsutl sets  -this test indicates LastSndStatus: RPC_S_SERVER_UNAVAILABLE
If I try to force replication from AD Sites and Services on DC2, I get the error: "The naming context is in the process of being removed or is not replicated from the specified server."
This server, DC2, has been operating for several weeks, and nobody noticed it had not finished the DCPROMO cycle.  I was going to demote it and try again, but it won't demote gracefully because it is not yet a full DC.  So rather than go through the hassle of a forced demotion, I thought I would try to fix it.
We have checked all the obvious things, like firewalls, routers, anything that might block RPC.  Using Event Viewer to connect to another computer works from DC2 to other machines, but not from any other computer to DC2.  The error reported is: "The RPC server is unavailable."  It has been my experience that almost all of these kinds of errors are traceable to a faulty DNS installation, but I can't find anything wrong with the DNS.
I was hoping that maybe someone at Experts Exchange might have an answer we have overlooked.  We are willing to try anything to resolve this problem.  
Thank you
Brian
ChiefIT
brianounsted

ASKER

Don't see the relevance.  This DC is not a multihomed server and workstations have no problem logging in.  DC2 is simply a backup (or secondary) DC that has not quite made it.

I think if I were you, I would recommend we squash the machine and start over.  I just wanted to know the fix in case I run into it again.

Thanks
Brian
When you bring a new server online, its SRV records and Host A records need to be registered on that server, then replicated to the other DCs, especially the replication partner. The SRV records are used for things like LDAP, replication, and netlogon. Without those records straight, communication with the new server is difficult. You will probably see events 4004 and 4015 saying the DNS server doesn't exist. On top of all that, many problems with the RPC server occur when it relies upon these SRV records. The SRV records in DNS are important to the operation of a DC that has just come online.
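To make the SRV check concrete, here is a minimal sketch that builds the nslookup commands for a few of the SRV names a domain controller normally registers. The helper names and the example.local domain are placeholders, not anything from this thread, and the list of names is not exhaustive.

```python
# Hypothetical helpers: generate nslookup commands for a few SRV records
# that a domain controller normally registers. The list is not exhaustive.

def dc_srv_names(domain):
    """Return common DC locator SRV names for an AD DNS domain."""
    domain = domain.strip(".").lower()
    return [
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
    ]


def nslookup_commands(domain, dns_servers):
    """Build one nslookup command per SRV name per DNS server."""
    return [
        f"nslookup -type=SRV {name} {server}"
        for server in dns_servers
        for name in dc_srv_names(domain)
    ]


if __name__ == "__main__":
    # Placeholder domain and server names; substitute your own.
    servers = ["dc1.example.local", "dc2.example.local"]
    for command in nslookup_commands("example.local", servers):
        print(command)
```

Running each command against both DNS servers and comparing the answers should show whether the new DC's records exist on one server but have not replicated to the other.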

As for a multihomed server, the problem there is that both NICs or IPs register the SRV records. Often on a multihomed server, you will see errors like "this DC does not exist or cannot be contacted". This happens when the client contacts the server via DNS, the server sees one NIC as busy, and sends out the reply for services on the wrong NIC.

In either case, the lack of SRV records or too many NICs can prevent a direct path back to the client. However, you will still be able to join the domain and log on. When requesting services from the domain controller, you may receive "Domain can't be contacted", "there are currently no logon servers to process your request", or "RPC server not available".

It's not well documented that finishing bringing a server online requires you to straighten out these SRV and Host A records.
With that said:

That is no guarantee that this is your problem. Many things can knock down the RPC server. Endpoint mapper protection and CA certs can also make the RPC server unavailable. Even antivirus and firewalls that block things above port 1024 could knock down the RPC server.
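As a rough illustration of the "above port 1024" point: RPC hands out dynamic ports after the initial contact on port 135, and the default dynamic range differs between Server 2003 (1025-5000) and Server 2008 (49152-65535). The sketch below, with a hypothetical function name, checks whether a blocked port range overlaps either default. It assumes the defaults have not been changed on your servers.

```python
# Default dynamic port ranges used by RPC after the endpoint mapper (135)
# answers. Assumption: neither server has had its range customized.
DEFAULT_DYNAMIC_RPC_RANGES = {
    "Windows Server 2003": (1025, 5000),
    "Windows Server 2008": (49152, 65535),
}


def overlapping_ranges(blocked_low, blocked_high):
    """Return the OS names whose default RPC range overlaps the blocked range."""
    hits = []
    for os_name, (low, high) in DEFAULT_DYNAMIC_RPC_RANGES.items():
        # Two inclusive ranges overlap when each starts before the other ends.
        if blocked_low <= high and low <= blocked_high:
            hits.append(os_name)
    return hits


if __name__ == "__main__":
    # Example: a rule that blocks everything above 1024.
    print(overlapping_ranges(1025, 65535))
```

With a mixed 2003/2008 pair like the one in this thread, a rule that only allows the old 1025-5000 range would still block the 2008 side.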
See http://support.microsoft.com/kb/257338 about troubleshooting missing netlogon and sysvol shares.

One reason the DC may fail to promote correctly is that replication isn't working because the firewall is enabled and blocking the necessary ports. See http://support.microsoft.com/kb/555381 for information about how to configure DC communication to work through a firewall.
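A quick way to test the firewall theory is to probe the TCP ports a DC normally listens on. This is a minimal sketch; the port list is my assumption of the commonly used ones, not taken from the KB article, and the function names are hypothetical. It checks TCP only, so UDP traffic (DNS, Kerberos) is not covered.

```python
import socket

# Assumed list of commonly used DC TCP ports; adjust for your environment.
DC_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB",
    3268: "Global catalog",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def probe(host, ports=DC_TCP_PORTS, timeout=2.0):
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

For example, calling probe("dc2.example.local") from DC1 and probe("dc1.example.local") from DC2 (placeholder host names) would show whether the block is one-directional, which matches the symptom described in the question.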
Gentlemen, thank you for your input.
I have been working on this problem on and off for about a week, so I have checked the more obvious solutions.  Some of the tests I have run are listed in my original help request.
I have thoroughly checked the SRV records and they seem correct and intact.  I also compared the records to another operating system and they seem to match.
The only other error message I get, other than the 13508, is attached, along with the associated text.  But as with many error messages, there is little helpful information.
I think my next step is a forceful demotion and redo.
Thanks,
Brian

error2.jpg
error1.txt
ASKER CERTIFIED SOLUTION
ChiefIT