DonWilder
How to start troubleshooting sudden slowdown of all W2003 network services to 100+ XP PRO and Vista Business clients?

Windows Server 2003 network with 80+ XP Pro and 30 Vista Business clients. The entire network is suddenly slow: client logon is slow, remote administrator login to the DNS and DHCP server (BDC) is briefly accepted and then thrown off, and access to printers and folders is slow. I have restarted the Internet broadband modem, the SonicWall, the managed switches, and two servers (the domain controller and the backup domain controller).
bpanowtv

Have you checked the Windows server's event log to see if there are any errors? Is your system infected with a virus?
Hazem KUNNANA
This might be caused by:
- A virus in one or more of your hosts, trying to spread over the network.
- A switch problem (overheating or hanging for some reason)
- Specifically check the machines that run the following services for performance, viruses, and network connections: DNS, Active Directory
Have you enabled the universal group membership caching option in AD?
Scan the system for worms and viruses, which get installed in the background and consume memory and processor time.
Check one system first, and you will be able to fix the others the same way.
Go to the command prompt of your DC and let's see what DCdiag /v brings up.

Sounds like you may have a DNS error.


You also might want to run a network analyzer utility such as Wireshark (http://www.wireshark.org/) to check what is going on in your LAN.
DonWilder (ASKER)
ChiefIT,
Running DCdiag from the command line gets a "Not recognized as internal or external command, operable program or batch file."  A search of the server's C drive found dcdiag.exe, so I ran it directly as the .exe.  It runs in a black window like the command line, seems to run all the optional tests in dcdiag scrolling down the screen, completes the tests, and closes before I can mark or copy the results.  dcdiag still gives "not recognized...etc." when run from the command line.

Any ideas for keeping the results on screen long enough to mark and copy?  Or for getting it to run from a command-line prompt?

Thanks,

Don
Reinstall the Support Tools and run it from the command prompt.

Yes, as ChiefIT and Awinish suggested, re-download and reinstall the Support Tools. Then, from a command line (Run ---> type cmd), cd to the folder "Program Files\Support Tools" and run: dcdiag.exe /v. Copy and paste your results here.

P.S.: And don't forget to replace your real domain name with a fake one ;p
I have logged on to the PDC and BDC to check the Event Logs.  I can see parts of the problem, but I don't know how to restore or configure DNS, which I assume would then restore and sort out RPC and DHCP.  Currently it is not possible to ping either server from a client PC.  I believe DNS is active on both servers, PDC and BDC.  The most recent errors are:
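If the window keeps closing before you can copy anything, one option is to capture the output to a file instead. Below is a minimal Python sketch of that idea, assuming Python is available on a machine that can run dcdiag. The install path is taken from the "Program Files\Support Tools" folder mentioned above and may differ on your server; `redact` and `run_and_save` are hypothetical helper names, and "EXAMPLE" is just a placeholder for the fake domain name.

```python
import re
import subprocess
import sys
from pathlib import Path

# Assumed location, based on the "Program Files\Support Tools" folder named in the thread.
DCDIAG = r"C:\Program Files\Support Tools\dcdiag.exe"


def redact(text, real_domain, fake_domain="EXAMPLE"):
    """Replace whole-word occurrences of the real domain name, ignoring case."""
    pattern = r"\b" + re.escape(real_domain) + r"\b"
    return re.sub(pattern, lambda match: fake_domain, text, flags=re.IGNORECASE)


def run_and_save(command, out_path, real_domain, fake_domain="EXAMPLE"):
    """Run a command, redact the domain name in its output, and save it to a file."""
    result = subprocess.run(command, capture_output=True, text=True)
    cleaned = redact(result.stdout + result.stderr, real_domain, fake_domain)
    Path(out_path).write_text(cleaned)
    return cleaned


# Example call (not executed here):
# run_and_save([DCDIAG, "/v"], "dcdiag_results.txt", real_domain="LAS")
```

The saved text file can then be opened in Notepad and pasted into the thread. The redaction only catches the exact name you pass in, so it is still worth reading the file before posting.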
 
"The session setup to the Windows NT or Windows 2000 Domain Controller \\server200a for the domain LAS is not responsive.  The current RPC call from Netlogon on \\LAS-Server to \\server2003a has been cancelled."
"This computer was not able to set up a secure session with a domain controller in domain LAS due to the following:
The RPC server is unavailable.  This may lead to authentication problems.  Be sure this computer is connected to the network."
"Active Directory attempted to perform a remote procedure call (RPC) to the following server. The call timed out and was cancelled."
Correction.  When the servers are pinged, the response is inconsistent, e.g.  Ping serverA: returned; response timed out; response timed out.  Ping serverB: response timed out; returned; returned.  All connections on the network for services (files, printers, Internet) are similarly inconsistent.  Some links connect, some don't.  Printers print but take a long time to respond, and sometimes print one page again and again without showing any files in the print queue.  I have to keep hitting the "cancel job" button or finally unplug the printer to stop it.
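To put a number on how inconsistent the replies are, the pings can be repeated and tallied. This is a rough Python sketch under my own assumptions: `ping_once`, `summarize`, and `probe` are hypothetical helper names, and it relies on the system `ping` command's exit code, which on Windows can report success even for a "Destination host unreachable" reply, so treat the figures as approximate.

```python
import platform
import subprocess


def ping_once(host):
    """Send one echo request; True if the ping command exits successfully."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host], capture_output=True, timeout=5
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def summarize(results):
    """Return (replies, timeouts, loss_percent) for a list of True/False results."""
    replies = sum(1 for ok in results if ok)
    timeouts = len(results) - replies
    loss = 100.0 * timeouts / len(results) if results else 0.0
    return replies, timeouts, loss


def probe(host, attempts=20, ping=ping_once):
    """Ping a host several times and summarize the outcome."""
    return summarize([ping(host) for _ in range(attempts)])
```

Running `probe("serverA")` from a few different clients and comparing the loss percentages can hint at whether the trouble follows a particular switch or cable rather than the servers themselves.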
ASKER CERTIFIED SOLUTION
Awinish
This solution is only available to members.
I saw the exact same thing on one of my lans today.

RPC service errors, the DC can't be found, can't replicate, problems with printers and DNS, and when you open up MMC, DNS has lost all of its records and is X-d out.

Tell me if this is what you are seeing?
Thanks for the possibilities.  I will check them out and get back to you later today or tomorrow.
ChiefIT,
That seems to be what I am seeing.  I checked the DNS servers yesterday afternoon.  One is red x-d out.  I'll check the records on the other later today.  I also seem to have lost netlogon for client PCs.  I may have lost local logon on one server.  It was too late yesterday to follow that up.  I will use local logon on the other server later today.
I had to catch a plane to a new site, after seeing this and attempting to fix it.

You appear to have the same problem I did at one site.

If you click on that bad DNS server snap-in, it says that the forest no longer exists and asks whether you want to delete it. I selected yes, because I am in a single domain and don't wish to join a forest. That appeared to fix the issue, but I didn't have a heck of a lot of time to monitor it. I also saw lots of MMC problems, where the MMC console wouldn't open. Instead I had to use the DNS snap-in only.

I wonder if a DNS bug is going around???

When you bring up your domain as the first DC in the forest, it will create two MSDCS file folders. Let's go over what those DNS records are. The records in the MSDCS file folder are SRV records, (or SeRVice records). These point the way to the domain's services, like the authentication server, and replication partners. So, no wonder netlogons are jacked up, huh?? Remember I said there were two MSDCS file folders?? One holds the SRV records and will be located as its own forward lookup zone, the second holds a single delegation record and will be within your domain's forward lookup zone.

Well, delegation records are not automatically updated. So, after a while, the delegation records will gray out.

It looks like this:
https://www.experts-exchange.com/questions/24349599/URGENT-MSDCS-records-registering-directly-under-FWD-lookup-zone-not-under-FQDN-name-space.html

After following the advice on that thread by deleting both MSDCS file folders, DNS worked well for about a year. This makes me believe the fix on that thread wasn't the problem.

But, a year after that fix, all of a sudden, I had problems like you did, where one server was X-d out and the domain was having problems with replication, intermittent communications, and domain logons.

There were a couple things I did and it appeared to resolve the issue. I didn't have time to evaluate if this was a solid fix.

I figured I had intermittent comms. So, I looked at the NIC's duplex settings and found out that my 1Gb NIC actually had a 100Mb connection. So, I set the duplex settings to 1000Mb Full auto, and waited for it to come up as a 1Gb connection.

Then, in the DNS snap-in, I tried to access the bad server to fix DNS. It came up with an error that said, "XXXforest no longer exists, do you want to delete it." I said yes.

Those things seemed to work. Like I said I had to catch a plane and couldn't evaluate the progress.

I am going to request additional assistance on this one, because I don't have a solid solution but have seen your problem. Wait one.
From one of the DCs, open up the Default Domain Controller group policy.

Find this entry:

Computer Configuration>Windows Settings>Security Settings>Local Policies>User Rights Assignment: Enable computer and user accounts to be trusted for delegation

Make absolutely sure that Administrators is there.  If it isn't, then add it by manually typing the group name in there (you can't browse to it because it's a local group and it will prepend the server name).  Once added, refresh the policy on the DCs.

If that entry is correct, then please run ADSIedit.msc, and in the Domain container, expand the OU=Domain Controllers object, then right-click the server with issues and select Properties.  Find and post the values for the following 2 attributes:

sAMAccountType
userAccountControl

A screenshot of your DNS FLZ would be nice to see too.

NM
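For reading the two attribute values once you have them, here is a small Python sketch that decodes the bit flags. The flag values are the documented Active Directory constants; the helper names `decode_uac` and `looks_like_dc` are my own, and the "expected" values are the usual defaults for a domain controller account, not something confirmed for this particular network.

```python
# Subset of the documented userAccountControl bit flags.
UAC_FLAGS = {
    0x0002: "ACCOUNTDISABLE",
    0x0200: "NORMAL_ACCOUNT",
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x80000: "TRUSTED_FOR_DELEGATION",
}

# sAMAccountType value for a computer (machine) account.
SAM_MACHINE_ACCOUNT = 0x30000001  # 805306369


def decode_uac(value):
    """Return the names of the known flags set in a userAccountControl value."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


def looks_like_dc(uac, sam_account_type):
    """True if the values match the usual defaults for a domain controller account."""
    flags = decode_uac(uac)
    return (
        sam_account_type == SAM_MACHINE_ACCOUNT
        and "SERVER_TRUST_ACCOUNT" in flags
        and "TRUSTED_FOR_DELEGATION" in flags
    )
```

A domain controller normally shows userAccountControl 532480 (0x82000), which decodes to SERVER_TRUST_ACCOUNT plus TRUSTED_FOR_DELEGATION. A value of 4096 would mean the account is flagged as an ordinary workstation or member server instead.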


I have run dcdiag and adsiedit.msc, got the values (I wrote them down and will send them later), and got a screenshot of the DNS FLZ, which contains a Host (A) list of all computer names and IPs on the network.  I am not able to send them, as the network is down and I have to save to a USB drive; I haven't found the USB slots on the older Dell 700 tower server.  The other server has an external HD connected to its USB port.  I'll try to get copies tomorrow.
Go to the command prompt and type,

Net stop netlogon
Net start netlogon

Let's see if your MSDCS DNS records come back.
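To check whether the records came back without clicking through the DNS console, the SRV names can be queried directly. This is a sketch only: the record names are the standard ones Netlogon registers for locating domain controllers, while `dc_locator_records`, `lookup_srv`, and the domain "example.local" are hypothetical placeholders for your own DNS domain name.

```python
import subprocess


def dc_locator_records(dns_domain):
    """Build the standard DC locator SRV record names for a DNS domain."""
    domain = dns_domain.strip(".").lower()
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.pdc._msdcs.{domain}",
    ]


def lookup_srv(name):
    """Query one SRV record with the system nslookup tool and return its raw output."""
    result = subprocess.run(
        ["nslookup", "-type=SRV", name], capture_output=True, text=True, timeout=15
    )
    return result.stdout


# Example call (not executed here):
# for record in dc_locator_records("example.local"):
#     print(lookup_srv(record))
```

If the output lists your DCs as SRV targets after Netlogon restarts, the records have been re-registered. If nslookup reports a non-existent domain, they are still missing.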
Sorry to leave people hanging.  There wasn't an accepted solution, but there was a lot of help in eliminating possible problems, including running dcdiag, downloading and reinstalling the Support Tools to get a usable dcdiag and other tools, and eliminating the NIC cards as a possible problem.  kkunnana suggested a switch problem.

The network admin and assistants came on Monday morning.  Because disconnecting and reconnecting the 4 switches did not result in them resetting themselves to 1, 2, 3, 4 order (they were stuck on 1s and 0s inconsistently), the tech figured the problem was probably in switch 1, so he reconnected them in reverse order: switch 4 first, switch 3 second, switch 2 third.  They arranged themselves in the correct order, indicating a problem with switch 1.  It also happened that the servers and the Internet were connected to the network via switch 1.

They then tested the switch ports by pulling all the Ethernet cables on switch 1, taking the other 3 known-good switches off the network, pinging the domain controller (I think; pinging something, anyway), and adding one connection at a time to see what wasn't responding.  They narrowed it to one switch connection and traced it back to the patch panel, which identified the drop in a classroom.  The classroom drop fed a long Ethernet cable running up through the ceiling to the kitchen about 70' away.  The kitchen PC checked out OK.  Repairing the end of the cable (moving the wire made the connection go on and off) and moving it from the wall drop to another switch in the room cleared the cable to the kitchen.  So there is a wire problem somewhere on the cable from the wall drop to the patch panel, or on the cable from the patch panel to switch 1.

We added a known-good switch to bypass the possible problems in switch 1.  The network has been OK with a couple of exceptions (again, some wire connections), and we have not yet had time to try to put things back together the way they were with the original switch configuration.
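For anyone who wants the one-connection-at-a-time test written down as a procedure, here is a rough Python sketch of the logic as I understood it. It is only an illustration: `find_bad_links` is a hypothetical name, the port labels are made up, and the `network_ok` check stands in for whatever the tech was actually pinging.

```python
def find_bad_links(links, network_ok):
    """Connect links one at a time and return those that break the health check.

    links: labels for the cables or ports to reconnect (hypothetical names).
    network_ok: callback that receives the currently connected links and returns
                True if the reference host still answers reliably.
    """
    connected = []
    bad = []
    for link in links:
        connected.append(link)  # plug in the next cable
        if not network_ok(connected):
            bad.append(link)  # this cable made things worse
            connected.pop()  # pull it again before continuing
    return bad
```

The point of pulling the bad cable before continuing is that every later cable is then tested against a working baseline, so more than one faulty connection can be found in a single pass.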

I don't know how to divvy up points, but those who responded to my question provided valuable information and education to me.  Please advise on what to do re points.

I would keep the question open until things are finally rewired.  I was the observer of the above process so there may be some errors in the process explanation.  I will ask the techs to check it for corrections.  Thanks to everyone for responding.  It was a relief to know I wasn't alone.

I would be glad to answer or get one of the techs to answer any questions.  At least one of the techs is an experts-exchange member and was able to follow the discussion.
Divvying up points is entirely up to you.

I always like to see the accepted answer as the correct answer to the question. If that is in your post and not in the expert's posts, then accept multiple answers with your post being the accepted answer and the experts who provided good advice to you as assists.

 
The problem turned out to be a frayed wire causing an intermittent (on/off) connection at one drop port, plus a 24-port switch with a long Ethernet cable looped into two of its ports.