notacomputergeek asked:

Lose internet connection about every night

For the last couple months, we've been losing internet connection at our small business about every night. The LAN seems fine. The main components are: AT&T Uverse ISP, a Sonicwall TZ 105, a Dell 24 port 100Mb switch with two Gb ports, and a Windows Server 2012.

Rebooting the AT&T modem fixes it, but it may take two or three reboots sometimes. Once it is rebooted, the connection will be fine all day until the next night sometime. It seems as though it started shortly after AT&T switched us from regular service to Uverse service, but that just may be coincidence. AT&T says their lines appear good.

I've tried replacing the AT&T modem (with the help of a technician who verified the modem settings, since we have a router behind the modem), different Ethernet cables, and the router (was a Linksys, now the Sonicwall). The only modifications to the router are a static IP, DHCP enabled, a VPN tunnel to a remote office, and several open ports, such as RDP.

My next thoughts were to switch to NIC2 on the server or replace the switch.

Not sure where to look next or even what time the connection goes down to see if there is a conflict somewhere with a nightly service.

Thanks for your help.
Emmanuel Adebayo

Hi,

Since this occurs every night and the connection works throughout the day, I would advise contacting your ISP to do some thorough testing of your line and settings. It is not a coincidence. The reason I suggest contacting the ISP is that the same thing happened to me when my ISP upgraded their infrastructure to 4G.
Also, check the router log to see whether any task is scheduled to run at that time. If the router has a logging capability, I'd suggest turning it on and seeing what happens.

Regards
notacomputergeek (ASKER)

Thanks. Friday, no one was in the office and I checked the connection remotely (RDP and logmein) and it wasn't working. I got an e-mail from someone at 10:03a and they said it was working remotely (logmein to their desktop).

This morning (Sunday), I tried accessing it remotely with no luck. I tried up until shortly before 10a with no success. I tried a couple minutes after 10a and it was working. I doubt anyone was in the office rebooting the modem.

So now one theory is that there is something limiting connection from sometime at night until 10a. I'll call the ISP to see if they can tell when we lose connection at night. Since someone is usually at the office before 10a, the connection is down and they reboot the modem. If they leave it alone, maybe it would come back on at 10a? But if access is denied during certain times, then why would it start working after a modem reboot.

Another theory would be that there is possibly an inactivity setting that if there is no activity it shuts down. I've tried as late as 11:30p during the day and it's still accessible remotely.

I also tried to configure Direct Access and Remote Desktop on the server a couple months ago without success, so it's not set up. Is it possible there are some settings in these services that are causing the problem? I could see if it was problems remotely accessing the server (trying to get in), but not accessing the internet while at the office (trying to get out).
Hi notacomputergeek,

I can only bank on ISPs to do two things well: a) never admit when they are wrong; and b) never admit when they are wrong!

They are typically always going to show a "good connection". It's unfortunate, but you will have to prove them wrong. Sometimes it's fully their fault; other times it's inadvertent, due to ignorance - I've seen ISPs not even know how much bandwidth they were providing. It all depends on the provider. Personally, I can't stand AT&T.

Try an MTU change on the SonicWALL. Reducing the MTU size can help eliminate some connectivity problems occurring at the protocol level. Here is an article that explains how to get the correct MTU value: https://www.experts-exchange.com/A_12615.html
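
The linked article is not reproduced in this thread. As a rough sketch of the arithmetic usually involved, assuming the common approach of pinging with the don't-fragment flag set and finding the largest payload that gets through: the MTU is that payload plus 28 bytes (20 for the IPv4 header without options and 8 for the ICMP header). The function names here are made up for illustration.

```python
# Assumption: IPv4 header without options (20 bytes) plus ICMP echo header (8 bytes).
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def mtu_from_ping_payload(largest_unfragmented_payload: int) -> int:
    """Convert the largest ping payload that passes unfragmented into an MTU value."""
    return largest_unfragmented_payload + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES


def ping_payload_for_mtu(mtu: int) -> int:
    """Inverse: the ping payload size to test when checking a candidate MTU."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
```

For example, a standard 1500-byte Ethernet MTU corresponds to a 1472-byte ping payload, and a 1440-byte payload corresponds to an MTU of 1468.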

I'd do a direct-connect test too. Take a laptop in toward the end of the day or at close of business and plug it directly into the modem, bypassing your network. Then monitor it and see if the same event occurs. This will eliminate many possible culprits.

I'd set up some IP monitoring and server monitoring to see if anything trips, for good measure. Every time I have seen this type of thing occur (where everything works properly until one specific point daily), it's been due to the ISP, but you have to plan and act as though it is on your side regardless to fully flush this out.

Are you exporting the logs via email or capturing them via ViewPoint or Analyzer? If so, review them. Make sure you have all Categories selected for the Logs and that they are capturing in Debug mode.
notacomputergeek wrote: "So now one theory is that there is something limiting connection from sometime at night until 10a. I'll call the ISP to see if they can tell when we lose connection at night. Since someone is usually at the office before 10a, the connection is down and they reboot the modem. If they leave it alone, maybe it would come back on at 10a? But if access is denied during certain times, then why would it start working after a modem reboot."
Typically, the ISP would need ping enabled on the gateway and would do some sort of ping plotting to truly determine what is going on, at least on their side of things. But if you get an L1 help desk person, their tools are limited, and again you can have a scenario where they say it's good but, directly connected to the modem, there is clearly nothing coming through. Push this up to L2 or L3 support - those guys should have a better idea of what is going on (on their side).

notacomputergeek wrote: "Another theory would be that there is possibly an inactivity setting that if there is no activity it shuts down. I've tried as late as 11:30p during the day and it's still accessible remotely."
This wouldn't be the case. Even a normal standalone PC has a fair amount of traffic going out almost continually for many different reasons. Now add Windows Server and the rest of your network, and you will not see literally 0 Kbps going out.

notacomputergeek wrote: "I also tried to configure Direct Access and Remote Desktop on the server a couple months ago without success, so it's not set up. Is it possible there are some settings in these services that are causing the problem? I could see if it was problems remotely accessing the server (trying to get in), but not accessing the internet while at the office (trying to get out)."
To set up RDP you have to port forward 3389 on the SonicWALL. Have you done that? As a security best practice I would recommend not doing so. If you need remote access, set up a VPN and then RDP to the resource you need access to. RDP (3389) is a widely known attack vector.
Changed MTU Size from 1500 to 1468 (0% packet loss). Other settings already allowing fragmentation.

Initial view of log shows several "Unhandled link-local or multicast IPv6 packet dropped" and occasionally an "IP Spoof blocked" or "Assigned IP address".

The last thing in the log file before it started working this morning around 10a appeared to be the site to site VPN tunnel doing something - renegotiating?  (see jpg)
SonicwallLogs.jpg
Are you using IPv6 addressing?

IP spoof log messages are caused when the SonicWALL sees an IP address on one segment that it believes belongs on another segment. For instance, an IP spoof  will be logged if the SonicWALL sees an IP address on the LAN that it believes belongs on the WAN.

IP Spoof messages are generally indicative of malicious attempts to access a network, but they can also result from bad network or VPN routes. The log message shows the packet was detected and dropped.

The following are some of the factors responsible for IP Spoof messages:

Misconfigured node on the LAN.
The most common cause of IP spoofs is a misconfigured node on the LAN. All LAN nodes must have an IP address that is in the same subnet as the SonicWALL's LAN IP address. If a SonicWALL interface is in the 192.168.168.0/24 subnet and a node with an IP of, say, 192.168.0.1 is present, the SonicWALL will drop the traffic from the node as IP Spoof.

Physical Connectivity
Another common cause would be a loop in the physical configuration of the SonicWALL and the devices connected to it. For instance, if a switch behind the SonicWALL is connected both to the X0 (LAN) and another interface (X2,X3) of the SonicWALL, it can cause IP Spoof messages if the switch does not have VLANs configured, or has them configured improperly.

Additional LAN Subnet
Another cause of IP spoof messages is the existence of additional subnets on the LAN. In a standard setup, the SonicWALL will only recognize the subnet of its LAN IP address as being valid. If there are additional subnets connected to the LAN, in the SonicWALL you must create a route policy for those networks.

E.g. if the SonicWALL X0 (LAN) is configured in the 192.168.168.0/24 subnet and a host or hosts with IP address in 192.168.200.0/24 subnet tries to go online, the SonicWALL will drop the packet as IP Spoof.

If the network is behind a router, configure Static Routes.

To configure additional subnets behind the SonicWALL without a router, configure secondary subnets with static ARP, which allows multiple subnets to be connected to a single physical interface.

Multiple Network Interface Cards (NICs)
A host with multiple NICs configured with IP addresses on different subnets. One NIC (NIC A) is connected to the X0 and the other (NIC B) to a router. At times traffic meant to go out through NIC B may try to go out through the SonicWALL. When this happens it will be dropped by SonicWALL.

This could also happen over a VPN tunnel when a GVC user is connected to the SonicWALL and has a Wireless LAN (WLAN) adapter which tries to pass, more often than not, UDP port 137, 138, 139 which are Microsoft NetBIOS broadcast traffic. The workaround to this would be to temporarily disable the WLAN adapter.

Packets from additional NIC with APIPA address (169.254.x.x)
Hosts with multiple NICs could also pose problems when one of the NICs has an automatic private IP address (APIPA). These NICs could try to pass traffic through the SonicWALL with the MAC address of the adapter connected to the SonicWALL.

Workaround is to disable these adapters or ensure that a valid IP address is configured on them.

Virtual (e.g. VMware) interfaces / adapters
Nodes with Virtual Machines connected to virtual adapters with an IP address not in the same subnet as the host physical adapter may also cause IP Spoof when the virtual adapters try to access the internet through the SonicWALL. Workaround is to disable the virtual adapters or create a route policy on the SonicWALL for those networks.
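
Several of the causes listed above come down to a source address that is outside the firewall's LAN subnet, or that is an APIPA (169.254.x.x) address. A minimal sketch of that check using Python's standard `ipaddress` module follows. The function name and return labels are made up for illustration, and this is not how the SonicWALL itself decides what to drop.

```python
import ipaddress


def classify_source(host_ip: str, lan_subnet: str) -> str:
    """Label a source address relative to the firewall's LAN subnet.

    Returns "apipa" for 169.254.0.0/16 addresses, "on-subnet" when the
    address belongs to lan_subnet, and "off-subnet" otherwise.
    """
    address = ipaddress.ip_address(host_ip)
    network = ipaddress.ip_network(lan_subnet)
    if address.version == 4 and address.is_link_local:
        return "apipa"
    if address in network:
        return "on-subnet"
    return "off-subnet"
```

With the examples from the list, `classify_source("192.168.0.1", "192.168.168.0/24")` and `classify_source("192.168.200.10", "192.168.168.0/24")` both come back as "off-subnet", which are the kinds of nodes that would show up as IP Spoof in the log.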

Let me know how it goes!
vo1ty

Let's play with theory here for a moment.

As the connection is stable during the day and only gets lost at night, I would suggest you also bear in mind that you might have been hacked on the router side. Just for argument's sake, check your bandwidth usage for the days or months since this started.

I presume this theory would only hold if the ports were hacked and/or you have an uncapped service, in which case you wouldn't have noticed, and the culprit could actually clear or restart the equipment to wipe the evidence.

Like I said, it's only a theory and something you might consider checking out!

Open RDP ports like 3389 are never a good idea.

Regards
Checked this morning at 6:50a and connection was still up.

diverseit: Not using IPv6 that I know of. I had read in the past it was not a good idea to turn it off on a Windows Server 2012.

vo1ty: Virus scans are clean on the server and all computers. The router never shows that it gets rebooted and I'm not sure a virus could get to the modem and reboot it.

I will look into using another port for RDP.
OK, then follow my instruction above (http:#a39705435) to source the IP Spoof issue.

Changing the MTU looks like it has remedied this thus far.

Honestly, your best bet is to set up a VPN and RDP thereafter. Changing ports isn't the answer, as one can still sniff out the traffic.
Down this morning.

It appears I can't get into the other office's router remotely either, maybe somehow the remote router/modem (other end of the VPN tunnel) is causing a problem. I could easily shut down the site-to-site VPN tunnel for a few days. The remote office is a single computer using AT&T UVerse service.
Hi notacomputergeek

My verdict is that the modem goes idle and drops the line because of inactivity during the night, and then is stable during the day when there is activity keeping the line from going idle.

This could be a setting that you can change on the modem!

Do you know the public IP of your modem, and can you ping it when it goes down?

Regards
The router is set to reply to pings right now, but can't ping it either.

If it's a time-out issue, why wouldn't it wake up when the first user tries to access the internet and I guess a further question would be why would this be the default on a new modem?

I am using WSUS on the server and each client is configured to look to the server for updates, so I could see why there wouldn't be computers accessing the internet at 3a.
Hmmm...working now. I'll check to see if someone rebooted the modem. I'm not at the office.
Yes, someone was in the office and rebooted the modem.
Uverse equipment is cheesy. It must be a router, as modems do not offer site-to-site (s2s) functionality. SonicWALL offers "keep tunnel active"; make sure it's configured to do so. I'd get some real business-grade equipment in the other office if possible as well.
VPN tunnel showed it was up, but when I went to renegotiate, it was down. Couldn't ping the remote router or get to its interface. For now, I've disabled VPN. Will check with the remote user to see if they have internet.

Hopefully, eliminating variables will find the answer.

I don't have the model # of the AT&T UVerse modem, but does anyone know if there is a "keep alive" setting in them to look for?
diverseit: Enable keep alive has been set on the VPN tunnel.

What happened to the good old days when AT&T modems had a bridge mode?

Over the last year or so, I've tried to direct clients away from AT&T. There are better values in the major cities.
Did the keep alive setting help you in this problem?
notacomputergeek wrote: "What happened to the good old days when AT&T modems had a bridge mode?"
Haha, I agree! In an effort to protect the average neophyte home user, ISPs thought they could help with the security dilemma by providing crappier routers/quasi-firewalls instead. The ISPs should still give the user the option. The cable provider where I live actually charges an additional monthly fee for such a request (having a plain old modem). #ISPs=unbelievable

How are things going with this now?
Is your VPN connection set up as demand-dial or as persistent? Try setting it to persistent, or forcing it to redial the connection every minute!

Regards
Down again this morning. Keep alive has always been on. I don't think the tunnel is causing it now that I've disabled both ends.

Yesterday morning after rebooting modem, the users said the internet was slower than normal, but got better in the afternoon.

This morning, had someone in that office test LAN and all is fine. They can get to router and server, just not internet.

I'll schedule to go to that office tomorrow morning, hope it's not working, and call AT&T. I know Level 1 will tell me reboot the modem and everything will work fine, but I'll insist they work with me before rebooting it. Hopefully, they'll have a record of when we lose connection at night.

Is there a way I can set up a recurring ping command to the WAN IP from outside the office and dump it to a file to see what time it fails?

I'm also not seeing anything in the server logs that is giving me any clues.
I'd have the modem swapped out. Force their (ATT) hand and get it to L2 or L3.

Set up monitoring - this is the answer to your recurring ping command question. Use Nagios (http://www.nagios.org/), Monitis (http://www.monitis.com/) or something to that effect so that you will be notified when the system starts "going down". We use Monitis and it works great. It will show you downtime and latency issues, all recorded from the web.
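
If a hosted service is not an option, the recurring ping asked about above can also be sketched in a few lines of Python run from a machine outside the office. This is only an illustration, not a replacement for proper monitoring: the function names are made up, the ping flags assume Windows or Linux (macOS interprets `-W` differently), and on Windows a "destination unreachable" reply can still produce a zero exit code.

```python
import platform
import subprocess
import time
from datetime import datetime


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one echo request with the system ping; True if the exit code is 0."""
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # Linux-style flags; -W is the reply timeout in seconds there.
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def find_transitions(samples):
    """Given (timestamp, is_up) samples in time order, return each state change."""
    transitions = []
    previous = None
    for stamp, is_up in samples:
        if previous is not None and is_up != previous:
            transitions.append((stamp, "UP" if is_up else "DOWN"))
        previous = is_up
    return transitions


def monitor(host: str, log_path: str, interval_s: int = 60) -> None:
    """Append one timestamped UP/DOWN line per interval until interrupted."""
    while True:
        stamp = datetime.now().isoformat(timespec="seconds")
        state = "UP" if ping_once(host) else "DOWN"
        with open(log_path, "a") as log:
            log.write(f"{stamp} {state}\n")
        time.sleep(interval_s)
```

Calling `monitor("203.0.113.10", "wan_ping.log")` (a placeholder address, to be replaced with the office WAN IP) leaves a file whose first DOWN line after a run of UP lines marks when the connection dropped.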

Also, I'd run a packet capture to see what is going on. You can do so remotely either on the SonicWALL appliance or from the LAN via MSFT Network Capture (http://www.microsoft.com/en-ie/download/details.aspx?id=4865).
AT&T has already swapped the modem out once. I've set up monitoring with a trial version of Monitis.
OK, good. Packet capture should give us some more insight as to what is causing this. Monitis is great too.

Keep me posted.
Update - it was down again today and looking at Monitis, it went down about 2:02a.

I went to the office and the Broadband light on the modem was flashing red and I couldn't log into its interface with a directly connected laptop. I unplugged the DSL cable from the modem, but did not reboot the modem. Plugged it back in and could get to the GUI interface, but even that was sporadic. Disconnected the router, so all I had was laptop to modem to internet, and called AT&T.

The gods were shining on me as the L1 technician was actually pretty good (more like L2) and he could detect some problem with the line. He could see our history of modem reboots and dispatched a technician.

Lucky again, the on-site technician was also pretty good. He isolated a problem in the phone patch panel where there were multiple wires connecting the DSL service. He said with the old modems, they didn't have a problem with this, but the new UVerse modems are not as forgiving. He rewired the DSL connection to two punch down locations and was then getting a much better signal. He said previously, the DSL line may have been used for phone only, then DSL causing someone to run multiple wires within the patch panel. I don't know much about phone wiring, so I may be quoting him slightly wrong.

He said he's seeing more of this kind of problem, since newbie techs setting up a new UVerse modem will likely plug it in, get a signal, and go away, without really testing much or retracing the wiring.

Hooked everything back up and we're back in business. Only time will tell if this fixed it. Will give an update in the morning.

The on-site technician did say that AT&T tests all their modems at night and the testing combined with some faulty wiring likely caused multiple errors within the modem and it would shut down. He said normally the modem should reboot and be fine. Since it didn't fix itself, us rebooting it in the morning would zero out the error counters and work fine for the day. All sounds logical now, but we'll see how it works in the morning.
Great! The important thing is that we have isolated it to being an ISP issue, whether a) a wiring fault, b) a service issue, or c) a modem issue, which removes everything else as a culprit.

Thanks for the update and keep us posted!
Down again this morning and the changes the AT&T guy did yesterday took out our line 4 of the phones. :-(

It went down at 12:18a, came back up at 12:23a, went back down at 12:24a, and was down until 10a when someone rebooted the modem.

I called the AT&T tech who worked on it and he never came out today.

I can't work on it this weekend, so I'm going to physically disconnect the LAN side, so all that is up this weekend is the AT&T modem and the router (so I can ping it). That should definitely tell me if something on the LAN side may be causing it.

Very frustrating.
That's disappointing.

I'd change ISPs!

I don't know that it is necessary to disconnect the network because both you and ATT saw the modem to be the culprit. I guess it wouldn't hurt though.
It's Up this morning. I wonder now if I actually had two problems. Attached is the monitis pinging from last night. You can see it was off until about 10a, then on, then we turned the LAN off in the evening. It has worked occasionally overnight in the past though, so two more days like this should be enough.

I keep coming back to the IPv6 errors. One thing I noticed yesterday when I did an ipconfig /all was that the default gateway listed the ipv6 address (fe80...) of the server first, then the ipv4 address (192.168.1.1) of the router second. I've seen other Windows 2012  servers with ::1 first, then 192.168.1.1. I'm not too familiar with ipv6 yet, so not sure if this is ok.

I also noticed three additional tunnels when doing ipconfig /all. I remember one was 6TO4, but can't remember the others.

I believe ipv6 was needed for DirectAccess or Remote Desktop when I was trying to set it up and it may have changed some setting that didn't get undone.

I wonder if I should delete these extra tunnels and uninstall ipv6 completely.
You forgot to attach the Monitis image.

In IPv6, ::1 is the loopback address same as localhost or in IPv4 127.0.0.1.

6TO4 is an Internet transition mechanism for migrating from IPv4 to IPv6, a system that allows IPv6 packets to be transmitted over an IPv4 network (generally the IPv4 Internet) without the need to configure explicit tunnels. Special relay servers are also in place that allow 6to4 networks to communicate with native IPv6 networks.

6to4 is especially relevant during the initial phases of deployment to full, native IPv6 connectivity, since IPv6 is not required on nodes between the host and the destination. However, it is intended only as a transition mechanism and is not meant to be used permanently.

Furthermore, 6to4 does not facilitate interoperation between IPv4-only hosts and IPv6-only hosts. 6to4 is simply a transparent mechanism used as a transport layer between IPv6 nodes.

Due to the high levels of misconfigured hosts and poor performance observed, an advisory about how 6to4 should be deployed was published in August 2011. (REF http://tools.ietf.org/html/rfc6343)
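
To tell these address types apart when reading ipconfig /all output, here is a small sketch using Python's standard `ipaddress` module. The function name and labels are made up for illustration; the 6to4 check relies on the 2002::/16 prefix, which embeds the host's IPv4 address in the next 32 bits.

```python
import ipaddress


def describe_ipv6(address: str) -> str:
    """Give a short label for an IPv6 address: loopback, link-local, 6to4, or other."""
    ip = ipaddress.IPv6Address(address)
    if ip.is_loopback:
        return "loopback"  # ::1, the IPv6 counterpart of 127.0.0.1
    if ip.is_link_local:
        return "link-local"  # fe80::/10, valid only on the local segment
    if ip.sixtofour is not None:
        # 2002::/16 addresses carry the embedded IPv4 address
        return f"6to4 (embeds IPv4 {ip.sixtofour})"
    return "other"
```

So `describe_ipv6("::1")` gives "loopback", and an fe80... gateway entry such as `describe_ipv6("fe80::1")` gives "link-local", meaning it can only be a neighbour on the same LAN segment.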

First check in device manager to see whether 6to4 adapter shows up more than once or not. Are you getting any error codes next to it like code 10 or code 31?
Run the Network troubleshooter to verify that the adapter is updated.

See this: http://windows.microsoft.com/en-US/windows7/How-do-I-fix-network-adapter-problems It deals with Server 2008 R2 but discusses 6to4 if you have multiple copies.
It's up again this morning, so now I'm thinking I still have a problem on the LAN side. I've attached the monitor view I intended to attach yesterday.

I may need to bring someone in that knows more than I do about analyzing UVerse modem and Sonicwall log files, and networking to solve this.

On the LAN side, it's either a faulty main switch, a bad NIC somewhere, or the server configuration. Pretty broad possibilities. I don't think its the router, since I replaced it and it's working fine with the current minimal configuration.
monitisPing.jpg
First off, set up your monitoring from two locations at a 1-minute monitoring frequency with 1000 ms timeouts. Set up the alerts to trigger only when both locations fail; that way you can be sure when there is a true outage and not a routing mishap, which occurs more often than you think.
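
The alert rule described here (only count an outage when every monitoring location fails) can be sketched as below. The 1000 ms timeout comes from the suggestion above; the function names and location labels are hypothetical and have nothing to do with how Monitis implements its alerts.

```python
from typing import Dict, Optional

TIMEOUT_MS = 1000  # per-check timeout suggested above


def check_failed(latency_ms: Optional[float]) -> bool:
    """A check fails when there was no reply (None) or the reply exceeded the timeout."""
    return latency_ms is None or latency_ms > TIMEOUT_MS


def confirmed_outage(latencies_by_location: Dict[str, Optional[float]]) -> bool:
    """True only when there is at least one location and every location failed."""
    if not latencies_by_location:
        return False
    return all(check_failed(latency) for latency in latencies_by_location.values())
```

If one location times out while another still gets a 40 ms reply, `confirmed_outage` returns False, which is the "routing mishap" case that should not page anyone.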

Maybe you do need an expert onsite to sort this out, but I don't understand what you are talking about here:
notacomputergeek wrote: "On the LAN side, it's either a faulty main switch, a bad NIC somewhere, or the server configuration. Pretty broad possibilities. I don't think its the router, since I replaced it and it's working fine with the current minimal configuration."
You did a direct-connect test, correct, and it failed! You also removed the LAN from the network and it failed! Am I incorrect about this? These two methods both remove the firewall and the network from the equation. Additionally, a bad NIC would not fail on such a schedule; the failures would be continuous. Server configuration... not sure how that could play into this, since we removed it from the equation and the failures still persisted.
Everything OK this morning and user plugged LAN back in.

Doesn't appear trial version will let me do timeouts and I normally have 3 sites monitoring.

I'm just suggesting these could be the main culprits. I removed the LAN this weekend and it performed fine. Looking back on it, when I had only the laptop connected to the modem, I did not change the DHCP&Subnet settings in the modem for no router behind modem, so the modem may not have been configured properly and the technician didn't seem to know what the settings should be. However, he did detect some problem with the line using his equipment and that's when he rewired it.

Tend to agree with you regarding a bad nic that we should see more problems, specifically on the LAN.

I have an extra main switch I'm going to put in this afternoon - can't hurt and won't take long.

It's acting like some sort of error threshold is reached at night and just shuts down the modem.

I'll dig into the modem logs tomorrow morning if it fails tonight. I looked in there last week, but I didn't see enough history.
The modem should be dumb. That being the case, I don't see anything within the network being able to shut it down.

Pay for the monitoring...it's so cheap...it will be the least cost that you end up throwing at this issue.

OK, this is different from what I thought regarding the testing results. I thought you had pinpointed this.

We need to get a packet capture during the issue. That is the only way to really see what is going on.

We also need to truly perform a process of elimination here, which is what any consultant you hire is going to do besides the packet capture.

Caveat: I'm assuming this is a small network given that you have a TZ 105 in there, roughly 5-15 users?

First off, is the issue even known to occur over the weekend? If it is, then this weekend remove everything from the network and re-configure the modem properly so you can connect to the Internet from that laptop. Set up ping in the modem if you can. Also set up a remote session (logmein or the like) to further test it remotely. This will be our baseline. If this works without issue, then we can rule out the ISP service and the ISP modem (at least temporarily, due to the lack of load).

The next test is to attach the SonicWALL with a laptop or the LAN attached. Set up DHCP on the SonicWALL and test with a laptop or the like.

Finally, bring in the server and do the same, allowing enough time to transpire for the issue to occur again.

I have had two similar issues before, both odd & both occurring on some sort of loose schedule where the Internet would go down. In one case it was a janitor plugging into a shared circuit which was overloading the circuit and then causing a power outage where the modem would not come back online correctly. The other case was never pinpointed, we spent thousands of dollars trying to source the issue - got everyone in on it  MSFT, SonicWALL, DELL all L2 & L3 engineers plus our own expert staff and some outside folks - but changing the ISP corrected the issue.
Yes, small office, about 15 users.

Yes, it occurs on weekend too.

Modem and router only devices online last weekend and never had a problem.

With everything online again, it went down last night starting at 11:47p. Up and down a couple times until 11:59p, then went dead completely until I rebooted modem this morning. When I got there this morning, I could not ping or log into the AT&T modem with a laptop connected directly to it until the reboot. All lights on front of modem were green, but it acts like it shuts down completely.

Regarding Packet Capture. On the TZ 105, since I won't be there at night, I can start it before I go to bed. What settings should be set to get it to capture all night and dump to a file?

Also, is there a way on the TZ 105 to see bandwidth usage by hour/user, etc.?
Hmmm...I have seen firmware bugs in modems cause instability where simply replacing them does nothing since the issue is reliant on the firmware version used rather than the hardware.

Regarding bandwidth usage by hour/user, you can do this if you have Analyzer reporting licensed and set up (there is a 30-day trial of it if you don't). Otherwise, these are not comparable but may prove somewhat beneficial: take a look at System > Diagnostics > Diagnostic Tool, select User Monitor and Web Server Monitor, and set both for the last 30 days. You can also take a look at Log > Reports > Report View and select Bandwidth Usage by IP Address. Note that for this to work you must have clicked Start Data Collection beforehand; otherwise it will start capturing only from the point in time when you enable it. Also, the other tools in Diagnostics and the Reports I mentioned above are more point-in-time or real-time trackers and are therefore contingent on the SonicWALL being up. In other words, when you reboot the SonicWALL, much of this data will be trashed as well. The Analyzer is really the core and central focus for true real-time and archival reporting.

You'd have to specify an FTP server if you want this automated so the logs would dump into that otherwise you'd have to manually export them as (here are your options): Libpcap, HTML, Text, or App Data.

Here's how to configure the FTP server:
These settings provide a way to configure automatic logging of the capture buffer to an external FTP server. When the buffer fills up, the packets are transferred to the FTP server. The capture continues without interruption.

NOTE: If you configure automatic logging, this supersedes the setting for wrapping the buffer when full.

With automatic FTP logging, the capture buffer is effectively wrapped when full, but you also retain all the data rather than overwriting it each time the buffer wraps.

Here's how:

1. Login to the SonicWALL Management GUI.
2. Navigate to the System > Packet Capture page.
3. Under Packet Capture, click Configure.
4. In the Packet Capture Configuration window, click the Logging tab.
5. In the FTP Server IP Address box, type the IP address of the FTP server. For example, type 192.168.168.2. Make sure that the FTP server IP address is reachable by the SonicWALL appliance. An IP address that is reachable only via a VPN tunnel is not supported.
6. In the Login ID box, type the login name that the SonicWALL appliance should use to connect to the FTP server.
7. In the Password box, type the password that the SonicWALL appliance should use to connect to the FTP server.
8. In the Directory Path box, type the directory location for the transferred files. The files are written to this location relative to the default FTP root directory. For example, if the root directory of the FTP server is "FTP" and a sub-folder named "Capture Files" is created, enter "Capture Files" in the Directory Path.
9. To enable automatic transfer of the capture file to the FTP server when the buffer is full, select the Log To FTP Server Automatically checkbox. Files are transferred in both libpcap and HTML format.
10. To enable transfer of the file in HTML format as well as libpcap format, select the Log HTML File Along With .cap File (FTP) checkbox.

For libpcap format, files are named “packet-log--<>.cap”, where the <> contains a run number and date including hour, month, day, and year. For example, packet-log--3-22-08292006.cap. For HTML format, file names are in the form: “packet-log_h-<>.html”. An example of an HTML file name is: packet-log_h-3-22-08292006.html.

Here's how to Test:

When the settings are saved, start a packet capture by clicking on the Start or Start Capture button. If the FTP settings are correct you will see a green light under FTP Logging active.
I'll see if I can set it up tonight. I did some digging on the internet and a lot of people have problems with these NVG510 modems, but no concrete solutions. I've seen several nighttime problems suggesting the SN Margin gets too low (say less than 6) and it's possibly caused by some sort of interference or if you're at the end of the "loop". It's suggested to have AT&T throttle back the speed from 12 to 6 to see if it makes a positive difference.

However, this doesn't seem to explain why running all last weekend with just the modem/router saw no problems.
Went down first at 11:23p last night, then for good at 11:26p. I have the packets captured, but not sure what to look for. Any suggestions? Maybe what's in red?

SN Margin this morning after rebooting the modem is 7.1 Down and 8.6 Up.
Line Attenuation is 19.5 Down and 12.3 Up.
Output power is 18.6 Down and 12.1 Up.
Only errors thus far are 13 FEC Errors Down.

Thanks.
I've looked through the packets and don't see an obvious issue. What's odd to me is that the packets before the outage look like the packets after the outage, even at 7:45a when I stopped the capture. So it looks like packets are flowing in/out of X0/X1, but I can't get to the modem interface or ping the modem from my directly connected laptop during the outage until I reboot the modem.

I see a lot of traffic to/from Microsoft, since they use Office 365 w/Outlook. All IPs I checked seem to be normal activity.

There are a few errors that keep popping up:

"I don't have a clue what would be at 192.168.1.255 and it happens with several 192.168.1.x IPs"

Header Values:
 Bytes captured: 253, Actual Bytes on the wire: 253
Packet Info(Time:12/17/2013 23:13:01.768):
 in:X0*(interface), out:--, DROPPED, Drop Code: 51(Broadcast traffic not handled.), Module Id: 25(network), (Ref.Id: _7048_iboemfCspbedbtuQbdlfu), 0:0)
Ethernet Header
 Ether Type: IP(0x800), Src=[00:0d:60:ef:de:e7], Dst=[ff:ff:ff:ff:ff:ff]
IP Packet Header
 IP Type: UDP(0x11), Src=[192.168.1.142], Dst=[192.168.1.255]
UDP Packet Header
 Src=[138], Dst=[138], Checksum=0x80ff, Message Length=219 bytes
Application Header
 NETBIOS SMB:
Value:[1]

"192.168.1.200 is our phone system and this occurs with 3 different Dst IPs"

Header Values:
 Bytes captured: 148, Actual Bytes on the wire: 148
Packet Info(Time:12/17/2013 23:13:02.912):
 in:X0*(interface), out:--, DROPPED, Drop Code: 52(Multicast forwarding not configured), Module Id: 25(network), (Ref.Id: _7073_iboemfNvmujdbtuQbdlfu), 0:0)
Ethernet Header
 Ether Type: IP(0x800), Src=[b4:0e:dc:b5:97:9c], Dst=[01:00:5e:14:13:32]
IP Packet Header
 IP Type: UDP(0x11), Src=[192.168.1.200], Dst=[239.20.19.50]
UDP Packet Header
 Src=[5588], Dst=[6254], Checksum=0xa734, Message Length=114 bytes
Application Header
 Not Known:
Value:[0]
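
For sifting through a long text export made of records like the two above, a small script can tally how often each drop code shows up per source and destination. This is only a sketch: the function name `tally_drops` is made up, and the regular expressions assume every record looks like the ones pasted here (a "Packet Info(" line, then a line with "Drop Code:", then an "IP Type:" line).

```python
import re
from collections import Counter

# Patterns assume the export layout seen in the records pasted in this thread.
DROP_RE = re.compile(r"Drop Code: (\d+)\(([^)]*)\)")
IP_RE = re.compile(r"IP Type: \S+, Src=\[([\d.]+)\], Dst=\[([\d.]+)\]")


def tally_drops(text):
    """Count dropped packets by (drop code, reason, source IP, destination IP)."""
    counts = Counter()
    pending = None  # drop code of the record currently being read, if any
    for line in text.splitlines():
        if "Packet Info(" in line:
            pending = None  # a new record starts here
            continue
        drop = DROP_RE.search(line)
        if drop:
            pending = (int(drop.group(1)), drop.group(2))
            continue
        ip = IP_RE.search(line)
        if ip and pending is not None:
            counts[(pending[0], pending[1], ip.group(1), ip.group(2))] += 1
            pending = None
    return counts
```

Calling `tally_drops(open("capture.txt").read()).most_common(10)` (with whatever the export file is actually called) would list the ten most frequent drops, which is quicker than reading every record by eye.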

I didn't check every line in every file, but also saw at least one occasion that Src=192.168.1.105 tried to access Dst=255.255.255.255, so not sure what that is either.
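
For what it's worth, none of these destinations is a single machine. Assuming the LAN is 192.168.1.0/24 (an assumption, the subnet mask isn't stated in this thread), 192.168.1.255 is that subnet's broadcast address, 255.255.255.255 is the limited broadcast address, and 239.20.19.50 falls in the IPv4 multicast range. The sketch below checks this with Python's standard `ipaddress` module; the function names are made up for illustration.

```python
import ipaddress


def classify_destination(dst, lan="192.168.1.0/24"):
    """Label a destination IP. The default LAN is an assumed /24."""
    addr = ipaddress.ip_address(dst)
    net = ipaddress.ip_network(lan)
    if addr == ipaddress.ip_address("255.255.255.255"):
        return "limited broadcast"
    if addr == net.broadcast_address:
        return "subnet broadcast"
    if addr.is_multicast:
        return "multicast"
    return "unicast"


def multicast_mac(group):
    """Map an IPv4 multicast group to its Ethernet MAC address.

    The MAC is 01:00:5e followed by the low 23 bits of the group address.
    """
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(addr) & 0x7FFFFF
    octets = [0x01, 0x00, 0x5E, (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{octet:02x}" for octet in octets)
```

`multicast_mac("239.20.19.50")` gives `01:00:5e:14:13:32`, the same Dst MAC shown in the phone system's packet, so that drop looks like ordinary multicast that the firewall simply isn't set up to forward.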

The only ARP drops I saw were the laptop directly connected to the modem, because it was not considered on the same subnet.

So, why would things look like they were working in the router logs, but the modem is "down"?

I'm also curious why my constant pings from outside to the WAN IP do not show up at all in the router logs.
The other thing I've read about these modems is DNS issues. Just curious why AT&T gave us 68.94.156.1 and 68.94.157.1 for DNS1 and DNS2, but the modem connects to 99.99.99.x?

I'm beginning to wonder if AT&T can supply a different modem to businesses that don't need the phone portion?
ASKER CERTIFIED SOLUTION
Blue Street Tech
This solution is only available to members.
Because I've read so many problems with the NVG510, I wonder if AT&T has a different model I could try. Yes, the phone jack in the back of the modem - we don't use it. These devices are used for home/SMB where they may run their phone through it as well.
There is no "bridge mode" check box like on the old DSL modems, so you have to configure it through the DHCP&Subnets menu on the modem.
Turned server off last night. Internet still went down at 11:37p. Interestingly, it showed blips of success during the night. I'll post chart soon.

I'm going to reboot the modem after work today to see if it stays on all night or still goes down about the same times as before. Is it time of day or duration?
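
One way to tell time of day from duration is to log every ping with a timestamp and then look at when each run of failures starts and how long the modem had been up before the first one. A rough Python sketch follows; it assumes the ping results are already available as (timestamp, success) pairs in time order, and both function names are made up for illustration.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Group consecutive failed pings into (start, end) windows.

    samples: (datetime, bool) pairs in time order, True meaning the ping succeeded.
    end is the time of the first successful ping after the failures,
    or None if the log ends while the link is still down.
    """
    windows = []
    start = None
    for when, ok in samples:
        if not ok and start is None:
            start = when
        elif ok and start is not None:
            windows.append((start, when))
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def uptime_before_first_outage(reboot_time, windows):
    """Time between the modem reboot and the first outage, or None if no outage."""
    if not windows:
        return None
    return windows[0][0] - reboot_time
```

If the first outage lands near the same clock time no matter when the modem was rebooted, that points to time of day. If the gap returned by `uptime_before_first_outage` stays roughly constant instead, that points to duration.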
Here are the pings from last night. I've only shown one location here, but two locations give the same results.
monitis.png
I'm still having some issues, but not as extreme after implementing the "bridge mode" changes suggested in these articles. I already changed some of these settings according to AT&T phone support, but made several others after reading these:
http://www.ron-berman.com/2011/11/24/motorola-nvg510-help-page-for-att-u-verse-users/ (see question 6)
http://broadband.custhelp.com/app/answers/detail/a_id/21979/kw/nvg510%20bridge%20mode/session/L3RpbWUvMTMzNzYyMjQyOS9zaWQvZk1pRSpHWWs%3D

Currently, the disruption is still mostly at night, but not every night. Also, after these changes, it seems to always come right back on. Before, it would go off and the modem would have to be rebooted to regain service.

The 30 Day attachment shows a difference after I made changes to the modem on the 20th. The dip on the 13th and 14th is when I disconnected everything except the modem and router.

It's unlikely we'll change ISPs, since of the four main carriers in the area, two of them would cost too much to run service to this building and the other one I don't recommend.

I'll be closing this question by tomorrow.
Last30DayPing.jpg
Here's the last 24 hours. See how it goes up and down during the night, then it's ok without anyone touching anything.
Last24HourPing.jpg
Still some issues going on, but I appreciate all the help. It's definitely better after changing multiple settings. Next step may be to see if I can use a different modem model or call someone in to assist.