Avatar of pramod1
pramod1
Flag for United States of America

ACTIVE DIRECTORY

What results should one see in dcdiag output?

One of our Exchange servers is having an RPC failure, and it talks to one of the AD servers.
Avatar of Maclean
Maclean
Flag of New Zealand image

Should show Success on all tests... if there's a failed test you would need to review it.
Avatar of pramod1

ASKER

Which tests should I focus on from an Exchange point of view, like KCC or anything else?
Avatar of pramod1

ASKER

The guy just ran this: C:\Windows\system32>dcdiag /test:replications. It didn't give any result?
Exchange is not monitored using DCDIAG. DCDIAG & REPLMON are for domain controllers only.
Running dcdiag /test:replications would display results for connectivity & partition tests.
Other options I use are:

repadmin /showrepl to show replication.
dcdiag /e /c /v

You could copy the below code, save it to a batch file (filename.bat), and run it as admin. It will drop the results as a text file on your desktop.

@echo off
REM Collects AD diagnostics into one log file on the desktop. Run as administrator.
set "LOG=%userprofile%\desktop\ad_diag.log"

echo Running dcdiag /e /c /v...
REM The first command uses ">" so a log from an earlier run does not inflate the counts below.
dcdiag /e /c /v > "%LOG%"
echo Running dcdiag /test:DNS /DNSALL (may take a few moments, be patient)...
dcdiag /test:DNS /DNSALL /e /v >> "%LOG%"
REM DcPromo and RegisterInDNS normally also need /DnsDomain:yourdomain; add it for your environment.
echo Running dcdiag /test:DcPromo /e /v...
dcdiag /test:DcPromo /e /v >> "%LOG%"
echo Running dcdiag /test:RegisterInDNS...
dcdiag /test:RegisterInDNS >> "%LOG%"
REM netdiag.exe comes from the Windows Server 2003 Support Tools and may be missing on newer servers.
echo Running netdiag.exe /v...
netdiag.exe /v >> "%LOG%"
echo Running netsh dhcp show server...
netsh dhcp show server >> "%LOG%"
echo Running repadmin /showreps...
repadmin /showreps >> "%LOG%"
echo Running repadmin /replsum /errorsonly...
repadmin /replsum /errorsonly >> "%LOG%"
echo ...
echo Diagnostic Completed Successfully...
echo view results in %LOG%
pause
echo ...
echo ...
echo ...
echo General Health Ratio (FAILS/PASSES)
echo This is very general, check %LOG% for info
echo ...
echo NUMBER OF FAILS
find /c /i "fail" "%LOG%"
echo ...
echo NUMBER OF PASSES
find /c /i "pass" "%LOG%"
pause
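
The find /c counts at the end of the script match every line that contains "fail" or "pass", so they are only a rough indicator. As a hedged alternative, the Python sketch below counts only the per-test summary lines, assuming they follow the usual dcdiag shape of "<target> passed test <name>" or "<target> failed test <name>". The function name and the sample text are illustrative and are not part of the original script.

```python
import re
from collections import Counter

# Matches dcdiag summary lines such as
#   "......................... DC1 passed test Replications"
# The exact line shape is an assumption based on typical dcdiag output.
SUMMARY = re.compile(r"\.+\s+(\S+)\s+(passed|failed) test (\S+)", re.IGNORECASE)


def summarize_dcdiag(text):
    """Return (counts, failures) for the summary lines found in text."""
    counts = Counter()
    failures = []
    for line in text.splitlines():
        match = SUMMARY.search(line)
        if not match:
            continue
        target, outcome, test = match.groups()
        outcome = outcome.lower()
        counts[outcome] += 1
        if outcome == "failed":
            failures.append((target, test))
    return counts, failures


# Hypothetical sample, not output from the asker's environment.
sample = """\
      Starting test: Replications
         ......................... DC1 passed test Replications
      Starting test: Services
         ......................... DC1 failed test Services
"""

counts, failures = summarize_dcdiag(sample)
print(counts["passed"], "passed,", counts["failed"], "failed")
for target, test in failures:
    print("FAILED:", target, test)
```

To use it on the real log, read ad_diag.log into a string (for example with open(path, errors="replace").read()) and pass that to summarize_dcdiag.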



As for Exchange, I usually run either the Best Practices Analyzer (Start>>Run>>ExBpa>>ENTER) or, from Exchange PowerShell, Get-ServerHealth -server YOUREXSERVER | format-table -autosize > C:\ServerHealth.log (this drops the log onto the C drive).
YOUREXSERVER is the name of your mail server.
Avatar of pramod1

ASKER

dcdiag /test:replications :
Running partition tests on : ForestDnsZones

   Running partition tests on : DomainDnsZones

   Running partition tests on : Schema

  Running partition tests on : Configuration

Does it show more results? Or should I also run dcdiag /e /c /v? Is it relevant?
Avatar of pramod1

ASKER

He didn't post any further result. Should I have expected more from dcdiag /test:replications?
What are you trying to achieve? Depending on what you are testing for it could or could not be relevant.
If you merely want to know whether the partition & connection tests are OK, then running the replication test should be enough.
In addition to the dcdiag /test:replications I would run the repadmin /showrepl command.
This is provided that you are checking AD Replication health etc.

The dcdiag /test:replications output usually looks like this (I blacked out some info not relevant to the results):

User generated image
The dcdiag /e /c /v run gives a more detailed status of your domain controllers' health.
But neither of these options is very relevant to Exchange.
Could you explain your issue or goal, please?
Avatar of pramod1

ASKER

There was an RPC failure one day on one of the DAG servers and we could not find the root cause, so we suspect it lost contact with its preferred DC.
Below is the error:
A server-side administrative operation has failed. The Microsoft Exchange Replication service may not be running on server . Specific RPC error message: Error 0x6d9 (There are no more endpoints available from the endpoint mapper) from cli_GetCopyStatusEx2
OK, in this case there is no need to run dcdiag or replmon, as this is an Exchange-side issue.
Possibly the service was not running. Check Services to see whether Microsoft Exchange Replication is running, and check the System event log for Event IDs 7024, 7035, and 7036.

In addition, running the Exchange Best Practices Analyzer might show what could be improved on the server.
But I would review the event logs first. Beyond that, the Exchange PowerShell command Get-ServerHealth -server SERVERNAME | format-table -autosize > C:\ServerHealth.log might give you a lot more indication of potential issues.

Another possibility is that the system was patched and rebooted.
An alert can sometimes trigger if the service had not started yet when the DB tried to mount. I would look for a reboot around the time this error occurred. It might have an innocent explanation such as a post-patching restart.
Avatar of pramod1

ASKER

Get-ServerHealth -server SERVERNAME | format-table -autosize > C:\ServerHealth.log
For Get-ServerHealth it says it is not recognized as a cmdlet. I am running this on an Exchange 2010 DAG.
Right, my bad. That indeed does not work on older Exchange versions. I had assumed it was Exchange 2013 or later.
Check the event logs if you have not done so already. In addition, run the Exchange Best Practices Analyzer, which might reveal some bottlenecks.
Avatar of pramod1

ASKER

no we never patched that server
Avatar of pramod1

ASKER

I don't see any events except 7035, which passed. Could it be related to an I/O error?
Avatar of pramod1

ASKER

it was 1 week before
Avatar of pramod1

ASKER

there is another hub server where winrm went down
Ouch. You probably do want to patch it. There are a gazillion high-risk vulnerabilities out there that you are susceptible to without the patches.
In addition, there are separate Cumulative Updates for Exchange that improve stability, reliability & performance besides fixing bugs.
The latest is CU18 here
Another important one is the May 2017 Rollup to protect against the Petya Ransomware.
Security patching might not be relevant to your issue right now, but CU18 might be applicable, as some improvements throughout the rollups affect the areas reported by the error. For now, check the event logs and report back if they show any reboots or service shutdowns.
Also, when did you last reboot, given that you do not patch it?

And which service is reported by Event ID 7035, please?
Avatar of pramod1

ASKER

it passed
Surprised. It is not often that all Best Practice checks pass. Must be fairly well maintained :)
So the system has not rebooted recently, there are no event IDs showing problems with the Exchange services, and the best practice scan is clean.
In this case the only thing really left is to monitor for recurrences and patch the server. The error is server-side, so unless the server lost its connection to the DAG (which you might be able to check through cluster management), there is not much to go on. Short of others having more suggestions to add, I would write it off as a hiccup and keep monitoring.
Avatar of pramod1

ASKER

I see that one day before that, the hub server lost its connection to the mail gateway due to a networking issue. Can this be the cause of RPC and WinRM going down?
If RPC goes down, WinRM won't respond. But for RPC to go down, one would expect an error event in either the System or the Application log for the time it was unavailable. Sometimes the cause is an antivirus scan where not all Exchange components have been properly excluded. Other typical causes are reboots, network congestion, and the server running out of resources. But with the event logs pretty clean, from my understanding, it will be nearly impossible to trace the cause, which leaves us with a lot of guesswork.

You might want to supply some more data, e.g.:
Are you running on VMware or Hyper-V, and if so, which version?
What version and patch level of Windows are you running?
Does Exchange 2010 have any service packs applied? (Assuming not, as it was never patched.)
Did any other servers report issues during the same time frame?
Is this a repeat issue or a one-off?
Avatar of pramod1

ASKER

rpc went down on mailbox server on same day and winrm went down on another hub transport server on same day
Avatar of pramod1

ASKER

Exchange Server 2010

Microsoft Corporation

Version: 14.03.0361.001
Avatar of pramod1

ASKER

one time issue, VMware version 5.5
Avatar of pramod1

ASKER

The Microsoft Exchange Replication service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 5000 milliseconds: Restart the service. (7031) it says on mailbox server
OK, that's something. What other events occurred around that time frame? Anything leading up to it? There might be informational events or other warnings & alerts that provide some insight.
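
One way to narrow down "around that time frame" is to filter exported events to a window around the failure. The Python sketch below assumes the events were exported from Event Viewer and loaded as dicts with a time field. The timestamps are made up for illustration, and the event IDs are simply ones mentioned in this thread.

```python
from datetime import datetime, timedelta


def events_near(events, failure_time, minutes=30):
    """Return events within +/- minutes of failure_time, oldest first."""
    window = timedelta(minutes=minutes)
    nearby = [e for e in events if abs(e["time"] - failure_time) <= window]
    return sorted(nearby, key=lambda e: e["time"])


# Hypothetical records; the times do not come from the asker's logs.
events = [
    {"time": datetime(2000, 1, 1, 8, 0), "id": 7036, "log": "System"},
    {"time": datetime(2000, 1, 1, 9, 50), "id": 7035, "log": "System"},
    {"time": datetime(2000, 1, 1, 10, 0), "id": 7031, "log": "System"},
    {"time": datetime(2000, 1, 1, 12, 30), "id": 7036, "log": "System"},
]

failure = datetime(2000, 1, 1, 10, 0)  # assumed time of the RPC failure
for event in events_near(events, failure, minutes=30):
    print(event["time"].isoformat(), event["log"], event["id"])
```

Widening the minutes argument is a cheap way to look for slower build-ups, such as resource warnings that start well before a service terminates.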
Avatar of pramod1

ASKER

Resource exhaustion detector: low virtual memory; the search indexer stopped unexpectedly; "The Microsoft Search (Exchange) service terminated unexpectedly. It has done this 7 time(s)." These were logged just before the RPC failure.
Avatar of pramod1

ASKER

The following programs consumed the most virtual memory: store.exe (13700) consumed 38865256448 bytes, svchost.exe (2000) consumed 1129340928 bytes, and w3wp.exe (10944) consumed 848666624 bytes.
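
To put the byte counts in that event into perspective, the Python sketch below parses the message quoted above and converts each figure to GiB. The regular expression assumes the "name (PID) consumed N bytes" wording shown in the message, and the function name is illustrative. On these numbers, store.exe comes out at roughly 36 GiB of virtual memory.

```python
import re

# Event text as quoted in the post above.
MESSAGE = (
    "The following programs consumed the most virtual memory: "
    "store.exe (13700) consumed 38865256448 bytes, "
    "svchost.exe (2000) consumed 1129340928 bytes, "
    "and w3wp.exe (10944) consumed 848666624 bytes."
)

# Assumed wording: "<process> (<pid>) consumed <n> bytes".
ENTRY = re.compile(r"(\S+) \((\d+)\) consumed (\d+) bytes")


def top_consumers(message):
    """Return [(process, pid, size_in_gib)] parsed from the event text."""
    return [
        (name, int(pid), int(size) / 2**30)
        for name, pid, size in ENTRY.findall(message)
    ]


for name, pid, gib in top_consumers(MESSAGE):
    print(f"{name} (PID {pid}): {gib:.2f} GiB")
```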
ASKER CERTIFIED SOLUTION
Avatar of Maclean
Maclean
Flag of New Zealand image
Avatar of pramod1

ASKER

So can this issue be caused by memory? Is that why RPC terminated?
Avatar of pramod1

ASKER

Where can I check for the CU18 update?
Avatar of pramod1

ASKER

Resource exhaustion detector: low virtual memory; the search indexer stopped unexpectedly; "The Microsoft Search (Exchange) service terminated unexpectedly. It has done this 7 time(s)." These were logged just before the RPC failure.

Any reasons why the resource exhaustion detector events occurred?
CU18 can be obtained here. It will potentially take a few hours to apply.

To check your current version, start Exchange PowerShell and enter the command Get-Command ExSetup | ForEach {$_.FileVersionInfo}. The results will give you a build number, which you can compare here to see what patch level you are on.
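
Build strings are often zero-padded (the version posted earlier in this thread is 14.03.0361.001), so they are easier to compare against a published build table once they are normalized. The Python sketch below does that. The SP3 base build used for comparison is an assumption to confirm against Microsoft's build number list, and the function name is illustrative.

```python
def parse_build(version):
    """Turn '14.03.0361.001' into (14, 3, 361, 1) so builds compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


# Assumed base build of Exchange 2010 SP3; confirm against Microsoft's build table.
SP3_BASE = parse_build("14.3.123.4")

# Version string posted earlier in this thread.
reported = parse_build("14.03.0361.001")

print("Normalized build:", ".".join(str(part) for part in reported))
print("At or above the assumed SP3 base build:", reported >= SP3_BASE)
```

Tuples compare element by element, so the check works without any string padding tricks.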

And yes, not having enough resources to run everything Exchange requires can result in RPC errors.
Applying CU18 might lower memory consumption and relieve some of the pressure, as there have been various improvements to Exchange since its release. Applying Exchange 2010 Service Pack 3, if not done yet, is also recommended (plus all critical and security patches, for security reasons).

Before making any changes, always make sure you have a proper backup, of course.
As for memory: if you still have issues after patching, you can try increasing virtual memory first. But the performance will not be as good as increasing physical system memory (or virtually assigned memory if using VMware/Hyper-V/Acropolis).
In addition, make sure that your antivirus is set up with the proper exclusions for Exchange 2010.

Exclusion references can be located on the below url.

https://technet.microsoft.com/en-us/library/bb332342(v=exchg.141).aspx

Hope all this helps.
Good luck.
Avatar of pramod1

ASKER

But is there any specific reason why the resource exhaustion detector events occurred repeatedly on one day and then, after rebooting, didn't show up again?
If the system does not reboot frequently, or if it was handling a large bulk email, it might have had trouble keeping up.
It all depends on what the system was doing at that point in time. But with the event in the past, the cause won't be obvious, as you generally need to catch the issue as it occurs.