glenn22

asked on

Network freezes for a few seconds randomly

I've been having a strange issue on our network that has been causing a lot of problems for some time now. Throughout the work day there are what I can only describe as network "freezes" at random times. There does not appear to be any regular pattern to the freezes: sometimes there will be 3 in the space of 15 minutes, other times only 1 or 2 a day. These "freezes" only last for a few seconds, but that is long enough to cause internet connections to drop, webpages to fail to load, etc. We have a DSL connection AND a cable connection to the internet, and they are connected to our network in separate locations. Both connections experience these freezes, so it doesn't appear to be a modem issue. Also, these freezes do not occur on weekends, which leads me to believe it's something being used by our staff on weekdays. I've tried using Capsa 7.0 to analyze the network during these freezes, but it hasn't yielded anything useful: network traffic is low, utilization is at 3% or less, and no flooding is occurring.

Suggestions?
Benji_

Have you checked that it's not your ISP? Maybe set up some monitoring solution to check? Does it affect internal file transfers?
glenn22

ASKER

Yes, it does affect file transfers internally. And like I said, we have 2 ISP accounts (with 2 separate companies) and these freezes occur on both, so it's highly unlikely to be the ISP.
Sorry, I was assuming a VPN between 2 sites.

Are you using a Windows Server, or standalone PCs with Windows shares? Can you detail your network layout a little?
:)
Josef Pospisil
Does every PC on the network get this network freeze? Do you have an old PC somewhere where full / half duplex is set manually?

http://en.wikipedia.org/wiki/Duplex_mismatch

How did you monitor the network traffic? Did you use port mirroring? Maybe there is a network card gone crazy (defective)? I think that only a good analysis of the network traffic will help you get out of trouble.

Also, netstat -s and netstat -e could help you...
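
One way to act on the netstat -e suggestion is to save its output twice on a suspect PC, some minutes apart, and compare the Discards and Errors rows. Counters that keep climbing on one machine would be consistent with a duplex mismatch or a failing card. The sketch below is only an illustration: the function names are made up, and it assumes the English-language layout of `netstat -e` (a counter name followed by a Received and an optional Sent column).

```python
import re

# Counter rows look like: "Errors            0            0"
ROW = re.compile(r"^\s*([A-Za-z][A-Za-z\- ]*?)\s+(\d+)(?:\s+(\d+))?\s*$")


def parse_netstat_e(text):
    """Return {counter_name: (received, sent_or_None)} from `netstat -e` output."""
    counters = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            sent = int(m.group(3)) if m.group(3) is not None else None
            counters[m.group(1)] = (int(m.group(2)), sent)
    return counters


def error_growth(before, after, names=("Discards", "Errors")):
    """Difference (received, sent) between two parsed snapshots."""
    growth = {}
    for name in names:
        if name in before and name in after:
            growth[name] = tuple(
                (a or 0) - (b or 0) for a, b in zip(after[name], before[name])
            )
    return growth
```

Feeding it two saved outputs (for example from `netstat -e > before.txt` and `netstat -e > after.txt`) shows how much each counter grew in between.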
glenn22

ASKER

No VPN is in use. It is a Windows network with the majority of the PCs on Windows XP SP3, plus some Windows 7 PCs. The servers are all Windows 2008 Enterprise. The physical network is spread out over approximately 1km of land (connecting several buildings), with wireless (Cisco Aironet) connections where we cannot physically wire, and fibre connections where we can.

It is possible there is an old PC with a duplex mismatch, but I'm not sure how to track that down. I'm using port mirroring to monitor the port through which all network traffic enters our server stack and the firewall out to the internet.
Is anyone loading any big Exchange mailboxes?

Are you using Offline Files, with someone logging off and sending lots of data back to the server?

I've had this a couple of times. Antivirus on the server side can also cause this... It's a little bit of a needle in a haystack to eliminate. Port mirroring is the right way, but I'm thinking that if it were a duff network card it would be happening more frequently...
glenn22

ASKER

We do have ESET Antivirus remote administrator running on our server and each client connects to it every 10 minutes to update their status and download updates if necessary. How could this cause this sort of issue?

We have no exchange servers in use.

It's possible offline files are in use, but again, the port being mirrored is the one which all network traffic passes through to get to the server stack, and it shows very little traffic before, during and after the network freezing. Yeah, I know this is definitely a needle in a haystack, as I cannot cause the error to happen myself and it occurs at random times for such a short time period.
I'm currently on a site (while writing this) where the server has been crashing every 40 minutes to 1 hour. I have disabled ESET NOD4 and the issue has disappeared for the last 3 hours.

It seems that something they have updated is having a bit of an effect on servers. Try disabling the services on the server side for a short period (check the firewall is secure) and see if that helps any.

I can feel your frustration...
Does ping work while the network freezes?
How many of your servers are multihomed?
glenn22

ASKER

I will try disabling the ESET services and update later how it goes...
glenn22

ASKER

I tried running ping with the -t flag, Jelcin, and saw no interruption (response times remained the same, no timeouts) during the freezing times, but the freezing is very short term... so I'm not sure if I am getting accurate info there.
glenn22
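
Because ping -t sends only about one echo per second, a freeze of a few seconds can be hard to see in its output. A possible alternative is to time a TCP connection to an internal server several times per second and keep the slow or failed attempts. This is a minimal sketch, not a tested tool: the function names, the 0.2 second interval and the 0.5 second threshold are all assumptions chosen for illustration.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Time one TCP connect; return seconds, or None on failure or timeout."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def find_stalls(samples, threshold=0.5):
    """Keep the (timestamp, latency) samples that failed or exceeded threshold."""
    return [(t, lat) for t, lat in samples if lat is None or lat > threshold]


def monitor(host, port, interval=0.2, duration=60.0, probe=tcp_probe):
    """Probe every `interval` seconds for `duration` seconds; return all samples."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.time(), probe(host, port)))
        time.sleep(interval)
    return samples
```

For example, `find_stalls(monitor("fileserver", 445, duration=3600))` would probe a file server's SMB port for an hour (the host name here is a placeholder). The wall-clock timestamps of any stalls can then be matched against what users report.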

ASKER

No multihomed servers at all.
Make sure your servers are updated to Service Pack 2. There was a bug in SP1 that caused the same issue.
glenn22

ASKER

Update to this issue, I tried disabling the Anti-virus services (all of them including the remote administration service) and left it for a day... I still had 2 "freezes" during that time, so it doesn't seem to be the issue unfortunately.

As for Service Pack 2, I checked the servers and it does appear that one of the servers only has service pack 1, so I am updating it now, will check back in and let you all know how it goes.
glenn22

ASKER

I installed service pack 2 on the server which didn't have it, but today we are experiencing the same "freezes" again, so that didn't seem to fix the issue.
On the DC, go to the command prompt and type:

DCdiag /V

and

DCdiag /test:DNS

Provide the errors you might see.

Another command you might try is Netdiag /v

And look for errors on that.
glenn22

ASKER

@ ChiefIT

DCdiag /V returns all tests passed.

DCdiag /test:DNS returned "srs-local.local (our domain) failed test DNS".
Here's a snippet of the returned results:
         * rIDNextRID: 2157
         ......................... SR-DC-2 passed test RidManager
      Starting test: Services
         * Checking Service: EventSystem
         * Checking Service: RpcSs
         * Checking Service: NTDS
         * Checking Service: DnsCache
         * Checking Service: DFSR
         * Checking Service: IsmServ
         * Checking Service: kdc
         * Checking Service: SamSs
         * Checking Service: LanmanServer
         * Checking Service: LanmanWorkstation
         * Checking Service: w32time
         * Checking Service: NETLOGON
         ......................... SR-DC-2 passed test Services
      Starting test: SystemLog
         * The System Event log test
         Found no errors in "System" Event log in the last 60 minutes.
         ......................... SR-DC-2 passed test SystemLog
      Test omitted by user request: Topology
      Test omitted by user request: VerifyEnterpriseReferences
      Starting test: VerifyReferences
         The system object reference (serverReference)
         CN=SR-DC-2,OU=Domain Controllers,DC=srs-local,DC=local and backlink on
         CN=SR-DC-2,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=srs-local,DC=local
          are correct.
         The system object reference (serverReferenceBL)
         CN=SR-DC-2,CN=Domain System Volume (SYSVOL share),CN=File Replication Service,CN=System,DC=srs-local,DC=local
         and backlink on
         CN=NTDS Settings,CN=SR-DC-2,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=srs-local,DC=local
         are correct.
         ......................... SR-DC-2 passed test VerifyReferences
      Test omitted by user request: VerifyReplicas

      Test omitted by user request: DNS
      Test omitted by user request: DNS

   Running partition tests on : ForestDnsZones
      Starting test: CheckSDRefDom
         ......................... ForestDnsZones passed test CheckSDRefDom
      Starting test: CrossRefValidation
         ......................... ForestDnsZones passed test
         CrossRefValidation

   Running partition tests on : DomainDnsZones
      Starting test: CheckSDRefDom
         ......................... DomainDnsZones passed test CheckSDRefDom
      Starting test: CrossRefValidation
         ......................... DomainDnsZones passed test
         CrossRefValidation

   Running partition tests on : Schema
      Starting test: CheckSDRefDom
         ......................... Schema passed test CheckSDRefDom
      Starting test: CrossRefValidation
         ......................... Schema passed test CrossRefValidation

   Running partition tests on : Configuration
      Starting test: CheckSDRefDom
         ......................... Configuration passed test CheckSDRefDom
      Starting test: CrossRefValidation
         ......................... Configuration passed test CrossRefValidation

   Running partition tests on : srs-local
      Starting test: CheckSDRefDom
         ......................... srs-local passed test CheckSDRefDom
      Starting test: CrossRefValidation
         ......................... srs-local passed test CrossRefValidation

   Running enterprise tests on : srs-local.local
      Test omitted by user request: DNS
      Test omitted by user request: DNS
      Starting test: LocatorCheck
         GC Name: \\SR-DC-2.srs-local.local
         Locator Flags: 0xe00013fc
         PDC Name: \\SR-DC-1.srs-local.local
         Locator Flags: 0xe00013fd
         Time Server Name: \\SR-DC-2.srs-local.local
         Locator Flags: 0xe00013fc
         Preferred Time Server Name: \\SR-DC-2.srs-local.local
         Locator Flags: 0xe00013fc
         KDC Name: \\SR-DC-2.srs-local.local
         Locator Flags: 0xe00013fc
         ......................... srs-local.local passed test LocatorCheck
      Starting test: Intersite
         Skipping site Default-First-Site-Name, this site is outside the scope
         provided by the command line arguments provided.
         ......................... srs-local.local passed test Intersite

C:\Users\srsadmin>DCdiag /test:DNS

Directory Server Diagnosis

Performing initial setup:
   Trying to find home server...
   Home Server = SR-DC-2
   * Identified AD Forest.
   Done gathering initial info.

Doing initial required tests

   Testing server: Default-First-Site-Name\SR-DC-2
      Starting test: Connectivity
         ......................... SR-DC-2 passed test Connectivity

Doing primary tests

   Testing server: Default-First-Site-Name\SR-DC-2

      Starting test: DNS

         DNS Tests are running and not hung. Please wait a few minutes...
         ......................... SR-DC-2 passed test DNS

   Running partition tests on : ForestDnsZones

   Running partition tests on : DomainDnsZones

   Running partition tests on : Schema

   Running partition tests on : Configuration

   Running partition tests on : srs-local

   Running enterprise tests on : srs-local.local
      Starting test: DNS
         Test results for domain controllers:

            DC: SR-DC-2.srs-local.local
            Domain: srs-local.local


               TEST: Basic (Basc)
                  Warning: The AAAA record for this DC was not found

               TEST: Records registration (RReg)
                  Network Adapter
                  [00000006] Intel(R) PRO/1000 EB Network Connection with I/O Acceleration:

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.12:
                     SR-DC-2.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._udp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kpasswd._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.Default-First-Site-Name._sites.srs-local.local

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.12:
                     gc._msdcs.srs-local.local

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.250:
                     SR-DC-2.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._udp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kpasswd._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.Default-First-Site-Name._sites.srs-local.local

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.250:
                     gc._msdcs.srs-local.local

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.250:
                     SR-DC-2.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._udp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kpasswd._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.250:
                     _kerberos._tcp.Default-First-Site-Name._sites.srs-local.local

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.250:
                     gc._msdcs.srs-local.local

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.12:
                     SR-DC-2.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._udp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kpasswd._tcp.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs.srs-local.local

                     Warning:
                     Missing SRV record at DNS server 172.20.72.12:
                     _kerberos._tcp.Default-First-Site-Name._sites.srs-local.local

                     Warning:
                     Missing AAAA record at DNS server 172.20.72.12:
                     gc._msdcs.srs-local.local

               Error: Record registrations cannot be found for all the network
               adapters

         Summary of test results for DNS servers used by the above domain
         controllers:

            DNS server: 192.168.72.7 (sr-gateway.srs-local.local.)
               1 test failure on this DNS server
               PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.168.72.7
         Summary of DNS test results:

                                            Auth Basc Forw Del  Dyn  RReg Ext
            _________________________________________________________________
            Domain: srs-local.local
               SR-DC-2                      PASS WARN PASS PASS PASS FAIL n/a

         ......................... srs-local.local failed test DNS
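
The warnings in this output repeat for each DNS server, so it can help to collapse them into one set of missing names per server and record type. The following is a rough sketch of that idea; the function name is invented here, and it assumes only the "Missing <type> record at DNS server <ip>:" wording seen in the output above, with the record name on the following line or lines.

```python
import re
from collections import defaultdict

MISSING = re.compile(r"Missing (\w+) record at DNS server ([\d.]+):")


def summarize_missing(report):
    """Map (dns_server, record_type) -> set of missing record names."""
    missing = defaultdict(set)
    lines = report.splitlines()
    i = 0
    while i < len(lines):
        m = MISSING.search(lines[i])
        i += 1
        if not m:
            continue
        # The record name follows; console wrapping may split it over
        # several lines, so join everything up to the next blank line.
        parts = []
        while i < len(lines) and lines[i].strip():
            parts.append(lines[i].strip())
            i += 1
        missing[(m.group(2), m.group(1))].add("".join(parts))
    return dict(missing)
```

Run against a saved copy of the DCdiag /test:DNS output, this lists each distinct missing AAAA and SRV name once per DNS server instead of once per warning.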


OK, the freezes seem to come from some missing DNS records:

AAAA records are to IPv6 what Host (A) records are to IPv4 (I hope you followed that).

SRV records are SeRVice records in DNS...

These are the fixes you need to overcome:

You must re-register your SRV records, then enable IPv6 on the nodes that are not registering their AAAA records, and then initiate a registration of DNS records on the server for those nodes. I would fix the SRV records first, because these cover domain services like the DNS server service and the domain controller authentication services.

SRV records fix:

Go to these domain controllers/DNS servers:
172.20.72.250
172.20.72.12

First off, look in the DNS snap-in's forward lookup zone for any greyed-out folders. If there are any greyed-out folders for _msdcs, STOP right here and please report.

If there are no greyed-out folders, let's continue. Next (WITHOUT EVER LOGGING OFF) follow these instructions on both DCs (one at a time).

1) Go to the NIC properties and make sure IPv6 AND IPv4 are both enabled. Go to the advanced properties of IPv6 and IPv4 and select the DNS tab. On both IPv6 and IPv4 select these settings:
-Append primary and connection DNS suffix radio button (selected)
-Append parent suffixes of primary DNS suffix (checked)
-Register this connection's address in DNS (checked)

2) Go to the command prompt with elevated privileges and type these commands (in exact order):

IPconfig /flushDNS
IPconfig /registerDNS
Net Stop Netlogon
Net Start Netlogon
DCdiag /fix:DNS   <(I don't remember if this is a PIPE | or a colon :)

3) Go to the other server and do exactly the same.

Missing AAAA records:
Perform this on any client whose AAAA record is missing from the DNS server:

1) Go to the NIC properties and make sure IPv6 is enabled. Go to the advanced properties of IPv6 and select the DNS tab. In the IPv6 settings, select these attributes:
-Append primary and connection DNS suffix radio button (selected)
-Append parent suffixes of primary DNS suffix (checked)
-Register this connection's address in DNS (checked)

2) Go to the command prompt of the machine:
Type:
IPconfig /registerdns

ONCE DONE: let's check the health of DNS again.

At the DNS servers' command prompt (with elevated privileges), type:
DCDiag /test:DNS

Once the SRV records AND Host records are fixed in DNS, you should see a performance change for the better.
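
After re-registering, one way to spot-check the result is to query each DNS server directly for the SRV names that DCdiag listed as missing. The sketch below only builds the `nslookup -type=SRV` command lines; the helper names are made up for illustration, and the list contains only the six SRV names that appear in the DCdiag output earlier in this thread, not every record a domain controller registers.

```python
def expected_srv_records(domain, site):
    """SRV names that the DCdiag output in this thread reported as missing."""
    return [
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_kerberos._udp.{domain}",
        f"_kpasswd._tcp.{domain}",
        f"_kerberos._tcp.{site}._sites.dc._msdcs.{domain}",
        f"_kerberos._tcp.{site}._sites.{domain}",
    ]


def nslookup_commands(domain, site, dns_servers):
    """One nslookup command line per (DNS server, SRV name) pair."""
    return [
        f"nslookup -type=SRV {name} {server}"
        for server in dns_servers
        for name in expected_srv_records(domain, site)
    ]


if __name__ == "__main__":
    # Values taken from the DCdiag output posted above.
    for cmd in nslookup_commands(
        "srs-local.local",
        "Default-First-Site-Name",
        ["172.20.72.250", "172.20.72.12"],
    ):
        print(cmd)
```

Each printed line can be pasted into a command prompt; a name that still fails to resolve on one of the two servers points at the DC whose registration needs another look.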
glenn22

ASKER

@ ChiefIT

The _msdcs folder is greyed out for the local domain.
ASKER CERTIFIED SOLUTION
ChiefIT