Internet Goes Down Daily at the 5 pm Hour

Initially the Internet went down about once a month, toward the end of the month. Now it is happening more frequently, but still during the 5 pm hour. The most recent occurrence was today (10/3/11). It slows to a crawl by 5:15 pm, but the VOIP keeps functioning. Then around 6:45 pm it starts getting progressively better. The entire network becomes pretty much inoperable; it is not specific to one computer. Everything is affected, including VPN.

I have checked with the ISP - no outages. They did say that they would change our channel (WAN).
FW logs - dumped its logs but was previously OK.
Server logs - OK.

Dedicated Wireless T3 (3MB up/down)
SonicWALL TZ 170 STD OS - one OS release back.
Microsoft Small Business Server 2003 - all patched
MS Windows XP - all patched.

Any ideas...I'm lost on this one.
Blue Street TechLast KnightAsked:

MetallimirkCommented:
Are there any SQL DTS jobs that might be transferring large amounts of data (daily invoices, OLAP data cubes, etc.)? Perhaps it's something being run by a server. Alternatively, what is the Internet bandwidth usage during that hour? Are you maxing out? Most ISPs can provide a bandwidth usage report by hour.
Radhakrishnan RSenior Technical LeadCommented:
I hope you are using a firewall / proxy server to control your Internet access. Is there any schedule set during this time that blocks Internet access?
Also, it's worth checking your router / switch for any dropped packets during this period. Is there any scheduled backup / AV scan running on the network?
AntonInfCommented:
Do you have an antivirus server such as Trend? It could be broadcasting info and flooding your network...

It happened to me, but around 3pm. I then discovered it was the Trend Micro server version 7; once I rebooted the server the network went back to normal.

Just a thought..
Blue Street TechLast KnightAuthor Commented:
@ALL: thank you for the quick responses!

@Metallimirk: no database services are being used. I will check with the ISP about the usage.

@radhakrishnan2007: firewall yes, but no proxy server. There are only 5 users, 18 pcs. No schedules that block internet access. All maintenance is scheduled between 11p-4p. Unfortunately, the FW log had just dumped before the issue. I will check it tomorrow.
Blue Street TechLast KnightAuthor Commented:
@AntonInf: AV server is ESET business edition setup in a mirror configuration to eliminate bandwidth consumption. What type of diagnostics do you suggest I run.
Blue Street TechLast KnightAuthor Commented:
@radhakrishnan2007: typo maintenance window is from 11pm-4am.

@Metallimirk: ISP said they cannot provide bandwidth reports because our building is one of their hubs, therefore sharing the main switch...so bandwidth reports show the entire building rather than just our site.

I have to locate the root cause...any ideas on what troubleshooting steps I can do?
Radhakrishnan RSenior Technical LeadCommented:
OK. A few things you can monitor:
1) Try a continuous ping on your default gateway and see whether there are any drops when the issue occurs.
2) Restart your ISP modem, switches, and firewall and see whether it makes any difference.
3) This could be caused by a virus. Update your AV with the latest definitions and run a full scan on your server and workstations. If the workstations are also infected, a virus can randomly block network connectivity for the entire network.
4) Call your ISP and show them the status when the issue occurs; it could be an issue with their modem.

I assume you don't have a fault-tolerant (redundant) Internet connection. If you did, you could easily replicate the issue on it and rule causes in or out.

Try all of these and let us know if you need any help.
Blue Street TechLast KnightAuthor Commented:
@radhakrishnan2007: RE #2, I will try to get our ISP to reboot the switch. However, it is a unique situation where the ISP's hub is in the same location as our building, meaning we share the ISP's switch with all the tenants in the building, so it may be tricky for them to do.

RE #3, there are no viruses - we are very proactive with AV, however, I did notice 2 computers that were not part of the domain. I am investigating these devices...I suspect one is an iPad.

RE #4, ISP shows no issues on their end.
Blue Street TechLast KnightAuthor Commented:
I ran BPA and it said LANNIC was incorrect - it's pointing to a GUID that has no IP address. Could this cause such odd behavior? Should this be changed if we are not running Exchange on this box?
Blue Street TechLast KnightAuthor Commented:
@radhakrishnan2007: FYI: sent 10,000 packets... only 4 were lost.
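
For scale, 4 lost out of 10,000 is a 0.04% loss rate, which on its own would not explain a crawl. What matters is whether the losses cluster in the 5 pm window, so a timestamped probe is more telling than a single total. Below is a minimal Python sketch of that idea. The helper names (`loss_percent`, `ping_once`, `log_line`) are made up for illustration, and it assumes the system `ping` command is on the path; the exit code of `ping` is only a rough success signal.

```python
import platform
import subprocess
from datetime import datetime


def loss_percent(sent, lost):
    """Packet loss as a percentage of packets sent."""
    return 100 * lost / sent


def ping_once(host, timeout_ms=1000):
    """Send one echo request via the system ping command.

    Returns True when ping exits with status 0. This is an approximation:
    some ping implementations exit 0 for replies that are not real echoes.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_ms), host]
    else:
        cmd = ["ping", "-c", "1", host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def log_line(when, host, ok):
    """Format one probe result so it can be lined up against the clock."""
    status = "reply" if ok else "LOST"
    return f"{when:%Y-%m-%d %H:%M:%S} {host} {status}"
```

Calling `ping_once` on the gateway every second or so from a LAN workstation and appending `log_line(datetime.now(), host, ok)` to a file would give a record that can be compared with the 5:15 to 6:45 pm window.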
Radhakrishnan RSenior Technical LeadCommented:
When the issue occurs, can you disconnect the ISP cable, connect it to a single system, and see whether it works fine without any issue? If it does, then you can check your switch and firewall.
Blue Street TechLast KnightAuthor Commented:
Yes, that was my plan, except now it has not happened for the past 2 days...this is so bizarre.
ChiefITCommented:
Your wireless provider was right.

There is probably some electronic broadcast about 5:00 that is interfering with your wireless signal. This could be  a radar, or a cellular connection. So, changing the channel would be a great idea.

You can use a spectrum analyzer for RF, but that requires good knowledge of how to use it. This one is cheap but can do the trick if you are on the 2.4 GHz band:

http://www.solidsignal.com/pview.asp?p=airview2&d=Ubiquiti-AIRVIEW2-USB-2.4GHz-Spectrum-Analyzer-%28AIRVIEW2%29
Blue Street TechLast KnightAuthor Commented:
@ChiefIT: Just to clarify our network is primarily LAN based Ethernet, WLAN makes up 1 or 2 devices. So the ISP originally said they would change their channel on the WAN side because they provide us with wireless broadband not to be confused with WLAN.

They have decided not to change the channel now: since we enabled pinging on our firewall they are not seeing a problem, and since I enabled pinging the issue has not occurred again, which I believe to be completely coincidental.

It still makes no sense to me that, up until now, for days on end the Internet begins to slow down around 5 pm and then stops altogether for at least VPN and Internet traffic, yet remains intact for VOIP. I know QoS plays a role with VOIP, but it still does not explain why the VOIP seems unaffected by this issue.

No one here has enough know how to run a spectrum analyzer for RF. :(
Rob WilliamsCommented:
If you have a shared connection with other tenants it may be something over which you have no control
Blue Street TechLast KnightAuthor Commented:
We are on a dedicated wireless bonded T1 (2MB up/dn) w/ 5 IPs.

Our location is unique in that we are in the same building as our ISP's main Hub for the area so we have 5 dedicated ports (for our 5 IP addresses) on *their* switch within the building. From there our feed is supposed to go straight into our unit. The building management used to manage all the internet feeds but they have since relinquished controls back to the ISP - so we deal directly with them. That said, a few days ago, our building management gave us a bizarre phone call stating that our internet was going to be taken down due to emergency maintenance. This is suspicious to me because why and how could they bring us down if we are separate from them? Unless after the ISP switch building management has us going through one of their switches? Not sure if this info helps you better understand the situation.
ChiefITCommented:
Who is your internet service provider?
Blue Street TechLast KnightAuthor Commented:
TelePacific (http://www.telepacific.com) serving California & Nevada. It used to be Covad Wireless before TelePacific acquired them.
ChiefITCommented:
Yes, I am very familiar with Covad.. I did a lot of work for them when I worked for QWest Communications as a High Speed Data Tech.

This is the type of service that is provided to your building. It's called WiMax..

http://www.telepacific.com/offer/data-network/wireless-internet-access.asp

Telepac, also provides VoIP and teleconferencing, but usually puts that on a PSTN network using a PRI ISDN-type connection. This may explain the problem with your internet in comparison to NO VoIP issues. The reason for using an ISDN type connection for VoIP is because it's much easier to control back to a central office for QoS if this is a designated line.

These types of antennas, or cell towers as seen on that web site, sit on leased space on someone's office building. So, if there are ANY presumed outages, maintenance, or problems, the building manager would be contacted, and they may or may not contact you. My guess is that someone within your building called in a trouble ticket on the 5 PM issue, the ISP created a trouble ticket for the wireless provider to fix, and it was fixed. This also explains why your building manager knew about it.

Your D-Mark is your switch, in the phone closet. So, if issues go beyond that, The best you can do is contact your ISP and complain. You can't work on their gear.
ChiefITCommented:
Oh, forgot to mention, the 5pm slowness could be caused by almost anything. That's what the T1 is for. It's usually used to create bandwidth caps on people so that you don't run into a bandwidth hog on the broadband connection. Since this is an RF broadband connection, it could mean that too many users or shared bandwidth could flood the broadband if they are not capped at their service provided agreement. In your case, it sounds like a 3MbPS agreement cap.

A bandwidth hog could be someone performing a large data transaction, like a site-to-site file transfer at 5 PM, or video streaming of traffic by the Department of Transportation. If VoIP is on a QoS Ethernet link, it will take precedence over other data and not be affected as much. However, most video conferencing and VoIP with Covad used ISDN connections. And VoIP usually goes back to a Centrex IP-based telephone PBX instead of a localized IP-based phone PBX. That depends upon who your Internet Telephony Service Provider (ITSP) is. In some mid to large enterprises, the company is its own ITSP and owns its own IP-based phone system.
Blue Street TechLast KnightAuthor Commented:
Interesting...

The only service we get from TelePacific is the Fixed Wireless. The VOIP service is through RingCentral. I was assuming that was not going out on a different network...was I mistaken and does this change your assumptions? So are you saying that VOIP traffic is handled different from other types of traffic even if the ISP is not handling the VOIP service?

I checked w/TelePacific they report no issues with our line in the past or present. We are on a dedicated line so if someone else in the building did call to report a service outage: 1) it should not affect us, 2) all the other businesses close at 5 pm but we stay open until 6:30 pm, and 3) this has been occurring every day except this past Friday & Saturday.

I spoke with another engineer at TelePacific who said this was pretty typical service wise to have issues like this, but he was the only one – everyone there doesn’t sing the same tune, which makes it more difficult to bring to the boss.

When I had TelePacific look at the line when it was starting to go down the bandwidth was minimal.

The only other thing I can think of is to switch to good ole solid, reliable copper. A traditional T1 is more stable than fixed wireless. However, if I have not properly isolated the root cause, the switch will only create more issues in terms of frustration and driving up costs with my boss. Any other ideas?
ChiefITCommented:
Your ISP leases lines:
Your ISP is not literally an ISP. They lease lines from the CLEC or other service providers at cost to the CLEC. So, if you have a trouble ticket, I would imagine you would have a broadband-over-cellular type tech come out, like a Clearwire tech (now known as Clear). VoIP is probably handled separately from the Internet connection by using an ISDN or similar line back to the local phone company central office (probably CenturyLink, formerly known as QWest Communications). In other words, the CLEC and broadband/WiMax carriers are not typically TelePacific. Instead they lease lines from CenturyLink for PSTN, and probably Clear for broadband wireless.

Typical WiMax Speeds:
A typical setup would be 3G speed WiMax for internet, then ISDN for teleconferencing or VoIP. However, with your 3G WiMax, you can still use that for VoIP. You just have to get back to the ITSP (Internet Telephony Service Provider) that converts that IP based signal through an IP Based PBX and into PSTN plain old telephone service trunks.

EMI:
WiMax is a microwave-based station. This means it is line-of-sight communications, not omnidirectional like WiFi. It operates in the 2 GHz frequency bands, usually. So, things like RADARS (especially Doppler radars) and microwaves will interfere with it. It's my guess you probably have a news station that provides live Doppler video images at around 5:00 to show news. So, your signal is probably waxed by a meteorologist. Airports are another place where WiMax often doesn't work well. Another thing that could cause this problem is if a computer has the exact same IP as the gateway to the Internet. This would cause a problem with where the traffic is routed. So, check to make sure that there are NO computers with the same IP as your gateway. Since WiMax is microwave, it's usually a pretty solid signal. WiMax is within the licensed frequency bands (not like WiFi that uses cell phone frequencies).

Broadband versus T1:
Since this is WiMax Broadband, then you will probably have anywhere from a 2G-4G connection, with the antenna on the roof pointing to the internet service provider, and hooked up via a Switched network. The T1, is usually used for a PSTN network to channelize the signal. So, I am certain they are trying to tell you that the signal is "Like T1 speed", not literally using a T1.

-Cost wise, a T1 could be considerably more expensive.

Check for a conflicting IP, then look around your area for a radar dome, ships, airport radar, etc... Radars are the biggest predator to WiMax.
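
The conflicting-IP check suggested above can be done by comparing ARP tables over time: if the gateway's IP address shows up with more than one MAC address, two devices are claiming it. Here is a minimal Python sketch, assuming you save the text output of `arp -a` from a workstation a few times during the day. The function names are hypothetical, and any addresses you feed it are your own.

```python
import re
from collections import defaultdict

# Matches an "IP address  MAC address" row as printed by `arp -a`.
# Accepts both 00-11-22-33-44-55 and 00:11:22:33:44:55 styles.
ARP_ROW = re.compile(
    r"(\d{1,3}(?:\.\d{1,3}){3})\s+((?:[0-9A-Fa-f]{2}[-:]){5}[0-9A-Fa-f]{2})"
)


def macs_by_ip(snapshots):
    """Collect every MAC address seen for each IP across all snapshots."""
    seen = defaultdict(set)
    for text in snapshots:
        for ip, mac in ARP_ROW.findall(text):
            seen[ip].add(mac.lower().replace(":", "-"))
    return seen


def conflicts(snapshots):
    """Return only the IPs that were seen with more than one MAC address."""
    return {
        ip: sorted(macs)
        for ip, macs in macs_by_ip(snapshots).items()
        if len(macs) > 1
    }
```

An empty result from `conflicts` does not prove there is no duplicate (the second device may have been silent when the snapshots were taken), but a non-empty result for the gateway address would be a strong lead.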
Blue Street TechLast KnightAuthor Commented:
UPDATE: The incident occurred again today at 5:15p and lasted until 6:45p. I had both the CLEC (TelePacific) and SonicWALL on the phone during the incident.
I tested the connection bypassing our network, hooking a laptop straight into the feed, and the Internet was blazing fast on all the same sites that had been extremely sluggish (5 minutes to perform a speed test). Then I plugged the feed back into the SonicWALL and the issue occurred again. This is only one test, but off the cuff it at least points to something within the network.

RingCentral is a cloud based VOIP service. There are no additional lines laid in order to use their service you simply need an Internet connection – any internet connection. You can add IP phones but everything goes out the same pipe, just routed differently (ports 5060-5090, UDP and 16384-16482, UDP).
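
Since everything leaves through the same pipe, the only handle for telling VoIP apart from other traffic in a firewall log or packet capture is the port ranges quoted above. A small Python sketch of that classification follows. The function names and the `(port, bytes)` record shape are hypothetical, and the ranges are the ones stated in this post, not checked against RingCentral's documentation.

```python
# UDP port ranges quoted in this post for the VoIP service (inclusive).
VOIP_UDP_RANGES = [(5060, 5090), (16384, 16482)]


def is_voip_port(port):
    """True when a UDP port falls inside one of the quoted VoIP ranges."""
    return any(low <= port <= high for low, high in VOIP_UDP_RANGES)


def split_bytes(flows):
    """Sum byte counts for VoIP and non-VoIP UDP flows.

    flows is an iterable of (udp_port, byte_count) pairs, for example
    exported from a firewall log or a packet capture.
    """
    totals = {"voip": 0, "other": 0}
    for port, byte_count in flows:
        key = "voip" if is_voip_port(port) else "other"
        totals[key] += byte_count
    return totals
```

Comparing the two totals inside and outside the trouble window would show whether the firewall is treating the two classes of traffic differently.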

VOIP is one of the most sensitive types of data there is…therefore it makes no logical sense how or why the VOIP would remain up but everything else on the net would tank.

There are no outages with the CLEC (TelePacific) – in fact we are on their backhaul that holds 100-300 Mbps. Backhauls are typically very stable. The frequency is 5.8 GHz, UNII band frequency. This has only occurred during the last month and a half. We have had this carrier since February this year.

I still cannot figure out why the network functions perfectly all day and night except between 5:15p and 6:45p every day.

Bandwidth usage is very low during this time as well.

I have not used network analyzers before…what do you recommend that are easy to use and free?
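
Short of a full network analyzer, a timestamped latency log would at least show how sharply the 5:15p to 6:45p window differs from the rest of the day. The Python sketch below only does the summarizing step. It assumes you already have `(timestamp, latency in ms)` samples from some probe, and the names `in_window` and `summarize` are made up for illustration.

```python
from datetime import datetime, time
from statistics import median

# Trouble window as described in this thread.
WINDOW_START = time(17, 15)
WINDOW_END = time(18, 45)


def in_window(stamp):
    """True when a datetime falls inside the daily trouble window."""
    return WINDOW_START <= stamp.time() <= WINDOW_END


def summarize(samples):
    """Median latency inside and outside the window.

    samples is an iterable of (datetime, latency_ms) pairs.
    A side with no samples is reported as None.
    """
    samples = list(samples)
    inside = [ms for stamp, ms in samples if in_window(stamp)]
    outside = [ms for stamp, ms in samples if not in_window(stamp)]
    return {
        "inside": median(inside) if inside else None,
        "outside": median(outside) if outside else None,
    }
```

Running the same probe from a LAN PC, a WLAN PC, and the laptop plugged straight into the feed, then comparing the three summaries, would narrow down which segment the slowdown lives on.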
Blue Street TechLast KnightAuthor Commented:
@ALL: What else can I do aside from running BPA to ensure the server config is not the issue?
I have not used network analyzers before…what do you recommend that are easy to use and free?
 
ChiefITCommented:
I will answer that in just a moment:

This has been a thought on my mind since I started working on this thread. Some businesses have a policy to download updates from an internet server (like Microsoft Updates) at certain times. If these machines are imaged machines or there is a policy in place for all machines to logon and download the updates daily at 5pm (let's say), then you could tank a network. Imagine every computer on your network downloading updates at 5PM (after the normal work day).

Another thought is a common issue with Cisco networking devices (switches and routers). However, this particular problem would most likely be seen throughout the day. If the duplex settings of Cisco switches don't match other devices (including Sonic Wall), then traffic can be brought to a serious crawl. By plugging in directly, past Sonic Wall, you are able to bypass this issue. This could be a misconfiguration between your switches and Sonic Wall, such as duplex settings.

To answer your question, there is a great/free bandwidth monitor. It requires SNMP. I don't know if you have SNMP set up or practiced using it. You can find this on Solar Winds Web site and it's called "Real-time Network Analyzer"... On that same page, there is a Windows based SNMP enabler to allow easy enabling of SNMP:

http://www.solarwinds.com/products/solarwinds_free_tools/
Blue Street TechLast KnightAuthor Commented:
@ChiefIT: Thanks for your response. There are no services or updates running at 5p or even close to it. All updates are set up on a mirror server to avoid exactly that (bandwidth surges)...meaning the server pulls down all the updates then distributes them to the clients.

I could not find “Real-time Network Analyzer” but found “Real-Time NetFlow Analyzer”. Are these the same?

Not really familiar w/SNMP. Should I execute this: http://support.microsoft.com/kb/324263 - could only find it for Server 2003 not SBS 2003?
ChiefITCommented:
Real Time netflow is what I meant. My bad.


You could try to run Net Flow and see the results. But, I am pretty sure you need SNMP to communicate with networked devices. I ran Netflow off a client computer, not a server. I don't run network scanner software on servers because this can take up some resources to run that the server can't afford.
Blue Street TechLast KnightAuthor Commented:
I am downloading and running the Real Time netflow app now.

Is there anything else i can do to clear the server from being the issue besides running MS Best Practice Analyzer???
ChiefITCommented:
Not really:

You can check the server health using these command lines at the command prompt:

DCdiag /test:DNS
DCdiag /V

Look for anything that doesn't pass.

DCDiag is a part of the system support tools and is used to diagnose server issues. It's pretty accurate and very helpful if you can interpret the output.

If you have to download it from microsoft, make sure you have the right one (64 bit or 32 bit).
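
The verbose output is long, so it can help to pull out just the tests that did not pass. DCdiag closes each test with a summary line of the form `......................... <server> failed test <name>` (or `passed test`). Below is a minimal Python sketch that scans a saved report for the failures, assuming the output was redirected to a text file first (for example `dcdiag /v > report.txt`); the function name `failed_tests` is hypothetical.

```python
import re

# Summary line written at the end of each dcdiag test, e.g.
# "......................... S1 failed test Services"
SUMMARY = re.compile(r"\.{3,}\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)")


def failed_tests(report_text):
    """Return (server, test_name) pairs for every test reported as failed."""
    return [
        (server, test)
        for server, verdict, test in SUMMARY.findall(report_text)
        if verdict == "failed"
    ]
```

The list only says which tests failed; the lines just above each summary line in the report explain why.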
Blue Street TechLast KnightAuthor Commented:
DCdiag /test:DNS - passed.
DCdiag /V failed in some parts, see attached.

I have read that "IsmServ Service is stopped" can be dismissed in SBS.
"Failed test systemlog" was the other issue.
Thoughts?
DCdiag-V---Results.txt
ChiefITCommented:
The following problems indicate issues with your NIC drivers or bindings. I see DNS and Netbios and time services:

So, your problem is likely the server. These services need to be running.  


         * Checking Service: Dnscache
         * Checking Service: NtFrs
         * Checking Service: IsmServ
            IsmServ Service is stopped on [S1]
         * Checking Service: kdc
         * Checking Service: SamSs
         * Checking Service: LanmanServer
         * Checking Service: LanmanWorkstation
         * Checking Service: RpcSs
         * Checking Service: w32time
         * Checking Service: NETLOGON
         ......................... S1 failed test Services      
Blue Street TechLast KnightAuthor Commented:
The only service stopped is IsmServ, what did you mean by you see DNS and Netbios and time services? I thought IsmServ was only used in multiple AD environments. So I enabled it and everything passed the tests.

Now that we have found this what else should be checked? Do you think this was the root cause?

Thanks.
ChiefITCommented:
* Checking Service: Dnscache
         * Checking Service: NtFrs<<<<DNS related, FRS relies upon DNS to replicate
         * Checking Service: IsmServ
            IsmServ Service is stopped on [S1]
         * Checking Service: kdc<<<Kerberos relies upon DNS
         * Checking Service: SamSs
         * Checking Service: LanmanServer<<<Lan Manager relies upon Netbios
         * Checking Service: LanmanWorkstation<<<Lan Manager workstation relies upon Netbios
         * Checking Service: RpcSs<<<RPC locator relies upon Netbios
         * Checking Service: w32time<<<   Time server is a broadcast on port 123 from the PDCe
         * Checking Service: NETLOGON<<<Netlogon relies upon NETBIOS, You would think DNS, but it's Netbios

Three different non-routed Networking protocols are having problems according to your DCdiag.

The other errors you are seeing simply mean there are errors within your System Event logs that may/may not be addressed:
Blue Street TechLast KnightAuthor Commented:
I see. Thanks for your insight...I read that as DCDiag was checking those services but only found an issue with IsmServ.

Do you suspect this was the root cause?

I will post an update after 6:45 pm PST today to see if there was any improvement.
ChiefITCommented:
In other words, it's plausible that this service is important, but these protocols have their own way of communicating with each other (if you ask me).
Blue Street TechLast KnightAuthor Commented:
No change - still the same issue.
I put a laptop on the wireless, which bypasses the server and connects directly to the SonicWALL which handles the WLAN DHCP, but the DNS in the SonicWALL is first pointing back to the DNS server in the SBS box (so that WLAN users can gain access to the LAN resources), then it points to the ISPs DNS servers.

This, plus my test results in post http:#a36946603 would indicate to me that it is a server issue.

Any other ideas?
Alan HardistyCo-OwnerCommented:
Have you got DNS forwarders set on your DNS Server to forward DNS lookups to your ISP's DNS Servers?
Blue Street TechLast KnightAuthor Commented:
Hi Alan!
Yes i do.
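
With forwarders in the path, slow name resolution can look like a slow Internet connection, so it may be worth timing lookups during the trouble window and comparing them with a quiet hour. Here is a minimal Python sketch that uses the operating system's resolver through `socket.getaddrinfo`. The function name `time_lookup` and the injectable `resolver` parameter are illustrative choices, not part of any tool mentioned in this thread.

```python
import socket
import time


def time_lookup(name, resolver=socket.getaddrinfo):
    """Resolve a host name and report (succeeded, seconds_taken).

    Uses whatever DNS servers the machine is configured with, so run it
    on a LAN workstation to exercise the SBS DNS server and its forwarders.
    """
    start = time.perf_counter()
    try:
        resolver(name, None)
        ok = True
    except OSError:  # socket.gaierror is a subclass of OSError
        ok = False
    return ok, time.perf_counter() - start
```

Names that were looked up recently may be answered from a cache, so a fair comparison uses host names that have not been visited that day.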
ChiefITCommented:
What service pack do you have installed on the server. Service pack 1 had problems with networking.
Blue Street TechLast KnightAuthor Commented:
SP2
Blue Street TechLast KnightAuthor Commented:
SBS is fully patched. FW has the latest firmware.
Dr. KlahnPrincipal Software EngineerCommented:
I saw a similar situation many years ago in a DEC installation.  The hardware faulted in the evenings after 6 PM, but it all tested perfectly, and nobody had the slightest idea what was causing the problem.

One night as we sat speculating on extremely unlikely causes, the janitor came into the computer room, plugged his floor buffer into a protected socket ...

And the moral of the story is, sometimes it's a power problem, but it's not the power's fault.
sjklein42Commented:
@DrKlahn, haha.  Another DEC story: At the Maynard Mill headquarters (in 1976), one of our new hard drives (one of those washing machine units) in the 5th floor machine room frequently had a soft crash, always around 3pm.

 Eventually we realized it was caused by the big trucks that were banging into the bumpers on the 3rd floor loading dock.  They were actually shaking the building, and the hard drives got errors.  The fix was to rotate the drives 90 degrees so they would not be sensitive to the trucks hitting the building.  It worked.  The funny thing is, that a world-wide service bulletin went out announcing that if a hard drive (anywhere in the world) was getting a lot of soft crashes, the fix was to rotate it 90 degrees.  So all over the world, people were moving around their hard drives thinking it would fix their problem, but it was all because of the trucks hitting the loading dock in Building 5.
Dr. KlahnPrincipal Software EngineerCommented:
N.N. from Philip Morris used to tell a similar story at DECUS RSX Magic sessions about the Dempster Dumpsters at her site.  I believe it was RP06s, or possibly RM05s in that case ...
Blue Street TechLast KnightAuthor Commented:
@DrKlahn: Was your power issue causing the Internet to slow down and gradually get worse (on the downstream only) to the point of non-functionality, and then build back up to a functional state?

@sjklein42: These are some great stories! :)

@ALL: Help me proof this out logically.

It’s not likely the ISP – they show very small bandwidth consumption during the trouble period. A direct connect test also proved that the feed was blazing fast and then once plugged back into the network the issue was present.

It’s not likely the FW – I tested it on the wireless and it was slightly diminished from normal performance but nothing like the LAN computers performance. The FW handles WLAN DHCP and DNS is pointed first to the SBS server (LAN DHCP, DNS, DC & AD) then to the ISP DNS servers.
It’s not any of the Switches or Ports – bad ports would result in different ways such as maxing out concurrent connections and other data link errors but not this.

It’s not Cabling – again bad cabling would result in continuous issues not specific to a time each day.

This only leaves the SBS box. Both DNS query tests pass. Microsoft SBS Best Practice Analyzer shows no critical errors, only a few warnings related to Exchange; since we no longer use Exchange on this box and its services are disabled, that is most likely the cause of the warnings. DCdiag /test:DNS and DCdiag /V both pass. Outside of the 4:45p to 6:45p window it works like a charm.

What else do you suggest? Correct my logic in trying to pinpoint this. Please!
ChiefITCommented:
If DCdiag /v, and DCdiag /test:DNS are both good, then it has to be a switch to network configuration.

You could also look at Netdiag /v and look for errors.

Those tests are pretty concrete of a good server.

I also wanted to ask how close the time is on the network?

This server looks OK except those services you may need. So, something must be tanking the server about that time. It has to be a scheduled task, like an update or something like that that is busying out the NIC.
BawerCommented:

diverseit,
you said you have a wireless bonded T1 connection, so the ISP may have installed a T1 converter or a modem.
I faced the same thing once with an E1 link: the problem was with the converter or modem, which can't handle too much load and sometimes hangs.
Ask the ISP to replace it and check whether that helps.
:)
best of luck
Blue Street TechLast KnightAuthor Commented:
My mistake - poorly relayed info. We have a fixed wireless solution. More info here: http://www.telepacific.com/offer/data-network/wireless-internet-access.asp with overview here: http://www.telepacific.com/offer/data-network/wireless-demo.asp 

Because we are in the same location as their backhaul, we get an RJ45 cable as our Internet feed straight from their router and switch. They show no issues on their end when they run ping plotters etc.
Glen KnightCommented:
My first thoughts are also DNS related.

Can you post the results of DCDIAG, NETDIAG and IPCONFIG /ALL from the SBS server please.

Also IPCONFIG /ALL from a workstation.

Also, is the server connected directly to the sonicwall or is there another switch in between? Are you running SBS in dual NIC or single NIC mode?
Blue Street TechLast KnightAuthor Commented:
Hi demazter! Thanks for your response.

Attached are the server and workstation diags you requested. I also included a WLAN PC ipconfig /all as well.

There are two unmanaged switches between the server and the firewall. One is a DELL powerconnect 24-port the other is a random named switch…(not on-site so I don’t know the name.)

I believe it is running in single NIC mode. It only shows one Local Area Connection, but I have noticed four LANNICs present: one main LANNIC, and a second one that does not have a specified gateway (you will see it in the attached diags). The other two do not have any settings.

Server-DCDiag.txt
Server-NETDiag.txt
Server-IPConfig-all.txt
LAN-PC-IPConfig-all.txt
WLAN-PC-IPConfig-all.txt
Blue Street TechLast KnightAuthor Commented:
@ChiefIT: Sorry, I missed responding to your post. The time is spot on. Not sure if there is a time sync test but my workstation, phone and the server all match. There are no Time sync errors in the logs either.

I tried to look at as many PCs as possible (server included) for their associated scheduled tasks and none thus far run even close to the 4:45p time.
ChiefITCommented:
ON THE SERVER:

What's this NIC for?

Adapter : {7967A12E-0F88-4414-8ED7-040B8A4CFE01}

PPP adapter RAS Server (Dial In) Interface:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : WAN (PPP/SLIP) Interface
   Physical Address. . . . . . . . . : 00-53-45-00-00-00
   DHCP Enabled. . . . . . . . . . . : No
   IP Address. . . . . . . . . . . . : 192.168.0.118
   Subnet Mask . . . . . . . . . . . : 255.255.255.255
   Default Gateway . . . . . . . . . :
   NetBIOS over Tcpip. . . . . . . . : Disabled

MAKE SURE IT'S NOT HOSTING DHCP. THIS ADAPTER COULD HOSE UP YOUR DC, BECAUSE WITH IT ACTIVE YOUR DC IS MULTIHOMED.
----------------------------------------------------------------------------------------

Within the DC's DNS MMC snapin, let's get a snapshot of your DNS forwarders. If you are using the ISP's, they should look exactly like this:
216.237.6.36
207.47.112.186

________________________________

And why in the world would a WAN PC be using YOUR SERVER's DNS for its preferred DNS server?
        DNS Servers . . . . . . . . . . . : 192.168.0.1<<<<<<<<<<<< Your DNS server
                                                       216.237.6.36<<<<<<<<<<< ISP's DNS
                                                       207.47.112.186<<<<<<<<< ISP's DNS

****Why are clients outside your router picking up your server as a preferred DNS server????

Blue Street TechLast KnightAuthor Commented:
@ChiefIT: It says it’s from the “PPP adapter RAS Server (Dial In) Interface”…does the SBS VPN use a separate NIC?

How can I make sure that NIC is not hosting DHCP?

Yes, the forwarders are exactly the ISP's – nothing else (see attached). For the WLAN, DHCP & DNS are managed by the firewall (SonicWALL) on a 172.16.31.xxx subnet. So for the WLAN to share resources with the LAN, I have the DNS server of the LAN (192.168.0.1) specified as the first server on the firewall, followed by the ISP's (see attached).

This is for the WLAN not WAN.

Server-DNS-Forwarders.JPG
Firewall-DNS.JPG
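
To keep LAN, WLAN, and WAN straight, it can help to check which network a given address from the ipconfig output actually belongs to. The sketch below uses Python's standard `ipaddress` module. The /24 masks are an assumption (the thread gives only 192.168.0.x and 172.16.31.xxx, not the masks), and `zone_of` is a hypothetical helper name.

```python
import ipaddress

# Assumed /24 masks; the thread states the address ranges but not the masks.
LAN = ipaddress.ip_network("192.168.0.0/24")    # wired LAN behind the SBS box
WLAN = ipaddress.ip_network("172.16.31.0/24")   # wireless zone on the firewall


def zone_of(addr):
    """Name the zone an IPv4 address belongs to, per the assumptions above."""
    ip = ipaddress.ip_address(addr)
    if ip in LAN:
        return "LAN"
    if ip in WLAN:
        return "WLAN"
    return "outside (WAN/Internet)"
```

Under these assumptions `zone_of("192.168.0.1")` is the LAN DNS server, any 172.16.31.x client is WLAN, and the ISP resolvers such as 216.237.6.36 fall outside both.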
ChiefITCommented:
Properly multihoming a DC:

Preventing your VPN adapter from trying to provide DHCP:

Please read entire thread.

http://www.experts-exchange.com/OS/Microsoft_Operating_Systems/Server/2003_Server/Q_23806816.html
____________________________________________________________________
ON SONIC WALL:
I still think I would remove YOUR DNS server from the WAN's available DNS servers. In fact, I wouldn't provide DHCP to the WAN side at all. Your ISP should. How are you getting DHCP enabled on the WAN side? That should be controlled by the ISP providing it to you.

ON THE LAN SIDE:
On a NAT firewall/router, you don't want to provide DHCP from the Sonic Wall to your clients. The reason for this is that the server hosts the SRV records for the domain within DNS. The SRV records are used for domain services, like AD authentication and file replication. However, if your router provides DHCP, it will also usually try to host DNS. The router will NOT host the SRV records, and domain services can suffer.

ON THE WAN SIDE:
You can "obtain" an IP from your ISP for the Sonic Wall. The key word is "obtain". This will dynamically update your IP if the ISP changes their IP address. This is separate from your router "providing" DHCP. The ISP has given you five or six public IPs. They are the ones providing DHCP on that subnet..... If you also try to supply it, you will probably cause yourself problems.

COMMENTS:
So, when you physically plug into the WAN side, are you really getting an IP from the ISP, or from the Sonic Wall? You should be getting it from the ISP: one for your Sonic Wall, and the others that they provide for WAN computers. If you are getting it from the Sonic Wall, there will be a rogue DHCP server on that subnet. The rogue DHCP server will be your Sonic Wall.
Blue Street TechLast KnightAuthor Commented:
@ChiefIT:
I need to make sure we are clear on terminology to alleviate confusion. You keep referring to “WAN” (Wide Area Network) but I am referring to “WLAN” (wireless local area network). SonicWALL sees the LAN & WLAN as two *separate* networks. I am not providing anything to the WAN (wide area network) for DNS or DHCP.

On the SonicWALL:
If I remove it then WLAN (wireless) clients will not be able to gain access to LAN resources (applications, printers, etc.).  I don’t even think they would be able to authenticate via AD – they would be on a completely separate network.

On the LAN Side:
Correct, it has to be one or the other: either the SBS provides DHCP or the firewall does and I understand it is a best practice for SBS to handle DHCP & DNS, which it does for the LAN exclusively.

On the WAN Side:
This is not a cable or dynamic IP environment - we have a dedicated offering with a pool of 10 static IP addresses – they don’t rotate. On SonicWALL with NAT enabled you cannot obtain; you must specify all IP settings. Since the IPs are static, no leases take place between the firewall and our ISP. They instructed us how to set it up, which we followed with a tech on-site. He then verified everything in his scope.

Comments:
If you are referring to when I was doing a direct test “plugging into the WAN feed”, yes, that is straight to the RJ45 cable that they provide us, which normally hooks into the WAN port of our firewall (SonicWALL). You then have to configure the laptop to plug in directly by specifying all the WAN settings from the SonicWALL (WAN gateway, subnet mask, IP address & DNS info) – the WAN feed does not assign anything to you. Furthermore, if you just plugged directly into it, you would not be able to access the Internet without providing the settings mentioned above in the NIC properties. Again, the SonicWALL is only providing DHCP & DNS info to WLAN (wireless) clients. All LAN clients receive their DHCP & DNS info from the SBS server.
ChiefITCommented:
So, this looks like you opened up a gateway back into your LAN from the ISP's side of the sonic wall. Do you think this could be a DDOS attack on your systems?

Don't forget to make sure the VPN adapter has DHCP disabled.
Blue Street TechLast KnightAuthor Commented:
I just temporarily bypassed the SBS server and the Firewall to plug in a laptop to test the connection directly during the issue period. It was only temporary for testing purposes to clear the ISP as being an issue.

If it is a DDOS attack, wouldn’t the firewall thwart that, especially with all the gateway services running AV, AS, Intrusion Prevention? We did get compromised weeks ago due to a VNC exposure and an RDP access rule that was enabled by mistake. IPs from China, Iran and Russia showed tons of failed password attempts on the server in the logs. After eradicating all VNC instances and deleting the erroneous access rule, the threats were contained at the firewall properly. Could this play into it though as another attempt?

I think you are on to something with this VPN adapter. It’s almost like a ghost NIC. I went through your post (http:/Q_23806816.html#a22695042 and http:/Q_23806816.html#a22701090) and was unable to find anything to correct. All the values were already correct.

What are your thoughts on inputting 0’s for the DhcpIPAddress & DhcpSubnetMask keys in the other LAN Adapter located here: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{GUID}?

I attached a screenshot of all the listed GUIDs. The red circled GUID is the correct LAN adapter, the one highlighted in yellow is the GUID you noticed in the diags referenced in your comment here (http:#a36965601). I believe the server only has one physical NIC onboard. The others must be virtual?
Blue Street TechLast KnightAuthor Commented:
AArg. Forgot the attachment.
Server-LAN-Adapters.JPG
Blue Street TechLast KnightAuthor Commented:
From my reading I found that this virtual adapter is used for VPN access and is normal.

I have been full circle now and can't believe that we have not pinpointed this yet!

Any other ideas?
ChiefITCommented:
I am still concerned about the services that were stopped when we ran DCdiag /v... Within the event logs of the server, what are we looking at for errors? We might want to review those again, Google them, and possibly find out why these services are stopped....

----------
About the only other suggestion is to start using a bandwidth/network flow analyzer on the whole LAN to figure out what's talking and what's not. This looks like a NIC flood. The flooded NIC would show up on this. However, this would require a lot of configuration management. Most of them use SNMP to pass message traffic between the monitor and other nodes on the network. Sonic Wall may have a means to communicate with it and provide a network report on what the chatty node is....

Something like this:
http://www.manageengine.com/products/netflow/bandwidth-monitoring.html
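
To give a rough idea of what such an analyzer does with the flow data once it has it, here is a minimal Python sketch that totals bytes per source address and ranks the chattiest nodes. The record layout (source IP, destination IP, byte count) and the sample addresses are made up for illustration; this is not output from the Sonic Wall or from any real tool.

```python
from collections import Counter

def top_talkers(flows, n=3):
    """Return the n source addresses that sent the most bytes.

    flows: iterable of (src_ip, dst_ip, byte_count) tuples.
    This tuple layout is a hypothetical one chosen for the example.
    """
    totals = Counter()
    for src, _dst, nbytes in flows:
        totals[src] += nbytes
    return totals.most_common(n)

# Made-up sample records, for illustration only.
sample_flows = [
    ("192.168.0.118", "192.168.0.1", 1_200),
    ("192.168.0.25", "203.0.113.10", 950_000),
    ("192.168.0.25", "203.0.113.11", 870_000),
    ("192.168.0.40", "192.168.0.1", 4_800),
]

for ip, total in top_talkers(sample_flows):
    print(f"{ip:15} {total:>10} bytes")
```

The node at the top of the list during the slow period would be the first one to look at.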



Blue Street TechLast KnightAuthor Commented:
Hi ChiefIT, since our findings in the DCdiag /v, I enabled that failed service and now all passed. See http:#a36964970. The only failure now is the systemlog.

I will look into the bandwidth monitoring. I will post results as they come in.

Would a NIC flood be triggered precisely at 4:45p each day? Thanks!
ChiefITCommented:
Those errors just indicate that there are errors within the event logs. They could be "out of date" errors. If you empty the event logs, those errors will go away.

Would a NIC flood trigger at specific points of the day? The answer is maybe. It depends upon the services that are required or the data transferred on the NIC. In any case, you will see what's going on through the network at that given time. This will be the quickest way to figure out what's really going on...
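
One way to confirm the time-of-day pattern before digging further is to log a simple measurement (ping latency, for example) and average it by hour. The sketch below does only the averaging step in Python. The timestamp format and the sample readings are assumptions invented for the example, not measurements from this network.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def mean_by_hour(samples):
    """Average readings per hour of day.

    samples: iterable of (timestamp, value) pairs, where timestamp is a
    'YYYY-MM-DD HH:MM' string (an assumed format for this sketch).
    """
    buckets = defaultdict(list)
    for stamp, value in samples:
        hour = datetime.strptime(stamp, "%Y-%m-%d %H:%M").hour
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}

# Invented latency readings in milliseconds, for illustration only.
samples = [
    ("2011-10-03 16:30", 35), ("2011-10-03 17:15", 900),
    ("2011-10-03 17:45", 1100), ("2011-10-03 19:00", 40),
    ("2011-10-04 16:30", 30), ("2011-10-04 17:15", 950),
]

averages = mean_by_hour(samples)
worst_hour = max(averages, key=averages.get)
print(f"Worst hour of day: {worst_hour}:00")
```

If the same hour comes out worst day after day, that points at something scheduled rather than random load.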

Blue Street TechLast KnightAuthor Commented:
ChiefIT: Ok. Thanks about the errors. I have checked the network at the problem time period and everything looks low as far as bandwidth & packet size transmissions are concerned. I haven't used network analyzer tools so if you are willing to walk me through one it would really help. I have NetworkActiv PIAFCTM 2.2 installed but don’t really know what the results mean. I also installed NetFlow Analyzer but it seems to only work with Cisco…we use SonicWALL. Can you walk me through either NetworkActiv or NetFlow so that I can run these and post results? Thanks.
Blue Street TechLast KnightAuthor Commented:
Also, in NetFlow Analyzer it shows "No device is currently exporting NetFlow / sFlow packets to NetFlow Analyzer. Listening for NetFlow / sFlow Packets at Port 9996" in the Dashboards tab.
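
A quick way to check whether anything is reaching that port at all is a throwaway UDP listener. The Python sketch below is only a rough check: it assumes the exporter sends NetFlow v5/v9-style datagrams, whose headers begin with a 16-bit version and a 16-bit record count (sFlow uses a different header). The function names are my own, and NetFlow Analyzer would have to be stopped first so the port is free.

```python
import socket
import struct

def parse_header(datagram):
    """Return (version, record_count) from the first 4 bytes of a datagram.

    Assumes a NetFlow v5/v9-style header; sFlow datagrams are laid out differently.
    """
    if len(datagram) < 4:
        raise ValueError("datagram too short to hold a NetFlow header")
    return struct.unpack("!HH", datagram[:4])

def wait_for_export(port=9996, timeout=30.0, host="0.0.0.0"):
    """Listen once on a UDP port; return (sender_ip, version, count) or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        sock.settimeout(timeout)
        try:
            data, (sender, _sender_port) = sock.recvfrom(65535)
        except socket.timeout:
            return None
    version, count = parse_header(data)
    return sender, version, count
```

Calling `wait_for_export()` and getting `None` back after the timeout would suggest that no device is exporting to this machine on port 9996, which matches what the dashboard is reporting.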
ChiefITCommented:
I believe this requires SNMP (Simple Network Management Protocol) in order to communicate with the sensors... Sonic Wall should allow you to enable SNMP, and NetFlow should show which computers and servers are passing through the router once it is communicating with the router.
Blue Street TechLast KnightAuthor Commented:
It's not working. SNMP is enabled on the server but there is no option that I can find for the SonicWALL.
ChiefITCommented:
It's been a while since I configured network management and monitoring solutions on a LAN/WAN. We might have to revert to the administrator's guidebook in order to make sure we are configuring it right...

http://www.solarwinds.com/documentation/NetFlow/docs/NetFlowAdministratorGuide.pdf
Blue Street TechLast KnightAuthor Commented:
This is not going anywhere and it’s been 1 month now. I am going to delete this question unless someone feels otherwise. I am going to have to solicit outside help. I appreciate all the suggestions and time. Thank you all again.
Jeffrey Kane - TechSoEasyPrincipal ConsultantCommented:
I'm coming into this a bit late, but thought I'd offer my thoughts after reading through the entire thread.

I realize that you all are trying to look at ANYTHING which may be causing this problem, but to me it makes no sense to focus on anything other than network traffic during the specific time period of the problem.

The easiest way to do this is to run a simple tool from  Windows Sysinternals, TCPVIEW http://technet.microsoft.com/en-us/sysinternals/bb897437

Put this on any machine experiencing slowness and run it during the slow period.

You can couple this with Sysinternals Process Monitor to investigate where the traffic is being initiated:
http://technet.microsoft.com/en-us/sysinternals/bb896645

Both tools are simple to implement and use.  

Jeff
TechSoEasy
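
If you want a record you can save and compare between the normal and slow periods, the output of `netstat -ano` can be summarized in a similar spirit. This is a minimal Python sketch, not a replacement for TCPView: it counts established TCP connections per process ID from pasted netstat text. The sample text below is invented for illustration, and the function name is my own.

```python
from collections import Counter

def established_by_pid(netstat_text):
    """Count ESTABLISHED TCP connections per PID in `netstat -ano` output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected TCP row: Proto, Local Address, Foreign Address, State, PID
        if len(fields) == 5 and fields[0] == "TCP" and fields[3] == "ESTABLISHED":
            counts[int(fields[4])] += 1
    return counts

# Invented sample in the netstat -ano layout, for illustration only.
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    192.168.0.118:1041     192.168.0.1:445        ESTABLISHED     4
  TCP    192.168.0.118:1187     203.0.113.10:80        ESTABLISHED     2360
  TCP    192.168.0.118:1188     203.0.113.10:80        ESTABLISHED     2360
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       952
"""

for pid, count in established_by_pid(SAMPLE).most_common():
    print(f"PID {pid}: {count} established connection(s)")
```

The PID can then be matched to a process name in Task Manager or Process Monitor.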
Blue Street TechLast KnightAuthor Commented:
@TechSoEasy: Thanks for your input. Here is a brief summary of where the issue is at currently.

I had SonicWALL L2 clear the firewall from being root cause plus it was replaced with a new one and then re-cleared.

The ISP says it’s not them; however, they changed us to a different port and noticed significant latency improvements overall.

I now have a case open with MSFT to clear the server/workstations from root-cause, which they are in the process of. MSFT downloaded Microsoft Network Monitor 3.4 (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=4865). MSFT said it will capture everything so I think we are ok with the monitoring now unless you think that your recommendations cover a different, non-overlapping scope. Let me know.  
Jeffrey Kane - TechSoEasyPrincipal ConsultantCommented:
The MSNM is definitely the right tool... I was just trying to simplify the approach as much as possible.  

Sounds like you are in good hands.

Jeff
TechSoEasy
Blue Street TechLast KnightAuthor Commented:
@ALL: I may have found the root cause. I will post at the end of this week to make sure it is the final solution.
Blue Street TechLast KnightAuthor Commented:
@ALL: I really appreciate all your comments and suggestions and thought you would all want to know the final resolution since there was so much effort put forth on this one.

I can't explain this, but disabling Java under Services and Startup within msconfig resolved the issue.

Backstory: It was a fluke that I found it. One of the troubleshooting steps from MSFT was to disable all services and startup items and test it. After the test, I re-applied everything and rebooted, but I figured I would leave Java off since I did not think we were using it. That was it!
