• Status: Solved
  • Priority: Medium
  • Security: Public

SonicWALL NSA 2400 X0 interface stops responding to LAN but can be reached from the outside.

So I have a SonicWALL NSA 2400 at the edge of my network. At no consistent time, the entire LAN that runs through X0 goes completely dead and we can't even ping the gateway from the inside. The furthest I can reach is the switch just before the firewall (x.x.1.250) and everything else on the local network. This outage kills all access from the outside world, cutting off the websites we host, Exchange, security systems, Citrix, and RDP. Up until a couple of days ago, it would go out but then restore itself after a few minutes. Now our third-party help desk actually has to restart the SonicWALL just so we regain access... temporarily. Currently, it's kicked me out and it's not coming back up on its own.

From what they're telling us, even though none of our services are reachable, they are still able to reach our SonicWALL from the outside (VPNs survive, but nothing else?). Furthermore, apparently they can even hit our internal gateway, but not the first switch right after the firewall (x.x.1.250). Yet, I still can't log into anything hosted internally nor can I log into my office PC through TeamViewer.

They're suggesting that this is loop behavior on the switch's part, but I didn't rewire anything in the server room. Then again, I just started working here and these guys haven't had an IT team in 15 years or so, so who knows what they have plugged into what. How would I even confirm that or locate the loop? The firewall logs show no link failure of any kind, so failover never takes place.

I think the SonicWALL's firmware is currently 5.8.1.13-?? I can't find out because I can't get into the network anymore; only the third-party help desk can. Wireshark on my (x.x.1.250) hasn't given me much to work with either, except showing me a bunch of DNS requests falling flat when the outage occurs. Once I regain access to the firewall, the logs don't show anything, and neither the connection monitor nor the core monitors show anything abnormal.

I'm so confused, it's ridiculous.

Is it the switches? the firewall?

Thanks,
Shawn Mooney
1 Solution
 
gheistCommented:
As a first effort I'd set up syslog from the firewall (a Windows product for this is called Kiwi) and capture its agony messages, plus at least the messages from directly connected switches and UPSes.
Could it be you have an IP conflict, e.g. somebody brings in a PC from home with the same DHCP IP as your gateway? You need arpwatch to record that.

It gets complicated... Maybe get an all-encompassing monitoring system like Nagios?
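
The IP-conflict check suggested above can be approximated without arpwatch if ARP observations are already being collected some other way. The following is only a minimal sketch of the idea (flag any IP address that shows up with more than one MAC address); it is not how arpwatch itself works. The function name `find_ip_conflicts`, the `(ip, mac)` input format, and the sample addresses are all assumptions made for illustration.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted MACs} for every IP seen with more than one MAC.

    `observations` is an iterable of (ip, mac) pairs, for example parsed
    from periodic ARP table dumps or a packet capture. That input format
    is an assumption of this sketch.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so the same MAC is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


if __name__ == "__main__":
    # Hypothetical sample data (documentation address range, made-up MACs).
    sample = [
        ("192.0.2.1", "00:17:C5:AA:BB:01"),   # the gateway
        ("192.0.2.50", "00:1A:2B:3C:4D:5E"),  # an ordinary client
        ("192.0.2.1", "3C:97:0E:11:22:33"),   # a second device claiming the gateway IP
    ]
    for ip, macs in find_ip_conflicts(sample).items():
        print(f"{ip} is claimed by: {', '.join(macs)}")
```

A conflict on the gateway address would be worth chasing first, since that is the address every X0 client depends on.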
 
carlmdCommented:
I would try some simple things first. Either replace the first switch or change the port that the Sonicwall is plugged into. Be sure to power the switch off/on if you change the port.

When the LAN is dead, plug a laptop or PC directly into the LAN port of the Sonicwall and see if it is alive that way. Then we know for sure whether it is the Sonicwall or something after it on the LAN.
 
gheistCommented:
Check which logs are easiest to collect. Maybe there is bad cabling to firewall or something like that.
 
Shawn MooneyAuthor Commented:
Thanks for the replies, everyone! So I took gheist's advice and set up that Syslog system with Kiwi (placing the server on my workstation), and some interesting things came of it. To get a range of time to look in, I used the notification emails I get from SCOM when our FL office servers fail heartbeats and when our gateway stops responding to pings. We had a couple of outages yesterday. The first strange thing I noticed is that even though I wasn't able to ping the firewall anymore, my computer was still receiving the Syslog transmission. Another thing I noticed is that a few minutes before every outage, one of our DNS servers ends up with a lot of "Closed Connection"s. Once that happens, addresses from the WAN (X1) and Client Network (X4) are still able to open connections with some of our servers, but no client on the inside (X0) is making any connections out to the WAN, although the servers do appear to be making outbound connections. I'm assuming that traffic between two X0 nodes (like client to server) never crosses the firewall, since none of it gets logged whether we're in an outage or not.
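
Using ping failures to bracket the time range, as described above, can be automated once the ping results are logged with timestamps. The sketch below collapses such a log into outage windows. The function name `outage_windows`, the `(timestamp, reachable)` sample format, and the example date are assumptions for illustration; nothing here comes from SCOM or SonicOS.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Collapse (timestamp, reachable) ping samples into (start, end) windows.

    `samples` must be sorted by time. A window starts at the first failed
    ping and ends at the first successful ping after it. A window that is
    still open when the data ends is closed at the last sample seen.
    """
    windows = []
    start = None
    last_ts = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                    # first failure opens a window
        elif ok and start is not None:
            windows.append((start, ts))   # first success closes it
            start = None
        last_ts = ts
    if start is not None:
        windows.append((start, last_ts))
    return windows


if __name__ == "__main__":
    # Hypothetical 30-second ping samples; the date is arbitrary.
    t0 = datetime(2000, 1, 1, 8, 0, 0)
    results = [True, True, False, False, False, True]
    samples = [(t0 + timedelta(seconds=30 * i), ok) for i, ok in enumerate(results)]
    for start, end in outage_windows(samples):
        print(f"outage from {start:%H:%M:%S} to {end:%H:%M:%S} ({end - start})")
```

The resulting windows give exact ranges to search in the syslog output, rather than relying on when a notification email happened to arrive.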
 
gheistCommented:
Smells like the firewall's state table fills up and it stops allowing new connections. I don't know how that translates into SonicWALL's terms, but there should be an option to raise the network memory or number of connections, or at least to reduce idle connection timeouts (proceed very carefully with the latter; it closes database connections, among others).
 
carlmdCommented:
 
Shawn MooneyAuthor Commented:
Hey everyone. Thanks again for the responses. Alright, so after letting the Syslog server run for a while, the only thing I'm getting from it is huge strings of "Connection Closed" when these outages occur, but nothing that tells me why, since nothing seems to be consistent (e.g. which computers are getting dropped). Then comes a persistent window of about 3.5 minutes where nothing from X0 is making any new connections.

carlmd, I looked into the solution proposed in your link. However, after running an ARP capture filter, the only requests getting dropped are all egress (externally bound?). It seems to keep good track of Verizon's router with a static entry. I even added one for our core switch in X0 this morning just to be extra careful about it all. It doesn't seem to drop any ARP packets from the outside (ingress?). Maybe I'm misunderstanding the information... idk.

Right now I have local mirroring into X5, which runs into another computer with Wireshark running. I'm hoping that if/when there's another outage, patiently combing through that capture will reveal something.

I'm beginning to wonder whether it has something to do with the fact that my two hosted LANs are running at 1000 Mbps (auto-negotiated on the SonicWALL, but hard-coded on the switch) while the WAN (FiOS) is auto-negotiated at only 100 Mbps. Is it possible that it's bottlenecking to the point that the SonicWALL just shuts off X0 until it catches up? Our client LAN (X3) doesn't see nearly as much traffic as X0 and, in turn, is still able to reach the internet somehow. Then again, this is all speculation. I just can't seem to figure this out...

The last recorded disconnects for today alone were at 12:04, 1:05, 4:00, 4:06 and 8:01 AM. None so far since the 8:30 start time, and the only things I did were start that mirror and add that ARP entry for the first switch in X0. We do have an M86 Web Filtering device that practically kills all surfing during most hours of the day, which is probably why we don't experience as many outages then? Not sure.
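
One way to make the "Connection Closed" bursts easier to line up against outage times is to count matching syslog lines per minute. This is a rough sketch only: it assumes each exported line begins with a `YYYY-MM-DD HH:MM:SS` timestamp, which may not match the actual Kiwi export format, and the function names and sample lines are hypothetical.

```python
from collections import Counter


def closed_per_minute(lines, needle="Connection Closed"):
    """Count lines containing `needle`, bucketed by minute.

    Assumes every line starts with a 'YYYY-MM-DD HH:MM:SS' timestamp, so
    the first 16 characters identify the minute. Adjust the slice if the
    real export uses a different layout.
    """
    counts = Counter()
    wanted = needle.lower()
    for line in lines:
        if wanted in line.lower():
            counts[line[:16]] += 1  # 'YYYY-MM-DD HH:MM'
    return counts


def bursts(counts, threshold):
    """Return the minutes, in order, with at least `threshold` matching lines."""
    return sorted(minute for minute, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Hypothetical log lines, only to show the expected shape of the input.
    sample = [
        "2000-01-01 04:00:05 fw Connection Closed src=x.x.1.20",
        "2000-01-01 04:00:07 fw Connection Closed src=x.x.1.31",
        "2000-01-01 04:00:09 fw Connection Opened src=x.x.1.31",
        "2000-01-01 04:01:12 fw Connection Closed src=x.x.1.44",
    ]
    per_minute = closed_per_minute(sample)
    print(bursts(per_minute, threshold=2))
```

A threshold well above the normal background rate would need to be picked by looking at a quiet period first.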
 
carlmdCommented:
Since the cause of the problem is not apparent, I would try lowering both the Sonicwall LAN port and the switch port to 100 Mbps, hard-coded on both sides. I can't imagine why this would be the issue, but it never hurts to rule it out.
 
Shawn MooneyAuthor Commented:
So I've tried hard-coding the link speed and duplex on the SonicWall and still suffered multiple outages throughout the weekend, and even once today. It's pretty frustrating because the SonicWall log doesn't say anything consistent from one outage to the next. I'm still able to use my laptop to reach the Internet via X4 (Client LAN), remote into my PC at home, and log into my own firewall externally, but I still can't get through the X0 interface. It's like it just turns itself off for 3-5 minutes and doesn't even tell you why. Wireshark is just waaay too much information and none of it seems obviously irregular. I was thinking it's the ISP and that whole ARP issue, but then why am I able to get in from the outside? I just have no idea at this point. I feel like I've tried everything except taking out the whole firewall altogether... and that's not even an option.
 
carlmdCommented:
Let's back up a minute. Did this ever work successfully with this Sonicwall? If so, any idea what might have changed?

Did you try replacing the first switch on the lan that connects to the Sonicwall?

Did you try plugging a laptop directly into the Sonicwall lan port when the network is failing, to see if that works?
 
Shawn MooneyAuthor Commented:
Hey carlmd. I'm waiting on the new management switch to come in, and I've got a laptop downstairs right now waiting for the next outage. I'm going to try plugging it directly into X0 as soon as I see my system tray icons start to freak out. I was actually looking into the switch's settings this morning, and in the middle of changing something, it completely cut me off. I had to go into the server room, unplug it, and reboot it just to get it visible again. Once that failure occurred, I was getting the exact same emails I would during any outage. With that said, I'm hoping it's just the root switch acting up. I noticed whoever set the switch up tied its IP address to a VLAN interface instead of just port G1. Attempting to switch it back is where everything locked up. I'm not sure whether that had anything to do with its regular outages, but that'll be one less thing. There were also some "received errors" on port G1, but I don't know what they were about. The counters have been reset post-reboot though. Thanks again.
 
Shawn MooneyAuthor Commented:
So we had another outage today, and after plugging a laptop directly into X0, I found that it still won't respond to pings until a few minutes have passed. That leads me to believe something inside the firewall is going wrong, correct?
 
gheistCommented:
Correct. Report it to SonicWALL support.
 
carlmdCommented:
I wouldn't have much faith that Sonicwall support is going to solve this problem for you. Too complicated or confusing for them.

Just to clarify, when you plugged the laptop directly into X0, you say it won't respond to pings. Do you mean pinging out from the laptop to something on the internet, or incoming?

Sorry to ask again, but did this ever work correctly?
 
Shawn MooneyAuthor Commented:
@carlmd: I meant pinging the default gateway/IP address bound to that interface.

Update: We upgraded the firmware to 5.8.1.15-71o and we're still experiencing these issues, but now the behavior is different. Instead of happening at random points throughout the day, it's now only happening after hours and sometimes once per lunch shift. The times don't line up with our backup schedule at all; however, they are starting to correlate with the "Time Profiles" I set up within our M86 Web Filter. It's set up to block pretty much anything on certain user profiles between 8:30 AM and 5:00 PM, with a half-hour window during lunch (12:30-1:00 PM). Perhaps this is a multi-issue scenario, but nonetheless, X0 still cuts off on us.

We were able to open a case with Dell, and they're currently awaiting packet captures started right before or during an outage. Other than that, still investigating...
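
The suspected correlation between outage times and the web filter's Time Profiles could be checked more rigorously than by eye. The sketch below measures how close each outage start is to the nearest profile boundary. The four boundaries come from the schedule described above (8:30 AM, 12:30 PM, 1:00 PM, 5:00 PM); the function names, the 10-minute tolerance, and the example times are assumptions for illustration, not measured data.

```python
from datetime import time

# Profile boundaries taken from the schedule described in the post.
BOUNDARIES = [time(8, 30), time(12, 30), time(13, 0), time(17, 0)]


def minutes_of_day(t):
    """Minutes since midnight for a datetime.time value."""
    return t.hour * 60 + t.minute


def nearest_boundary_gap(t, boundaries=BOUNDARIES):
    """Minutes between `t` and the closest boundary, wrapping around midnight."""
    m = minutes_of_day(t)
    gaps = []
    for b in boundaries:
        d = abs(m - minutes_of_day(b))
        gaps.append(min(d, 1440 - d))  # 1440 minutes in a day
    return min(gaps)


def near_boundary(times, tolerance=10, boundaries=BOUNDARIES):
    """Return the outage times that fall within `tolerance` minutes of a boundary."""
    return [t for t in times if nearest_boundary_gap(t, boundaries) <= tolerance]


if __name__ == "__main__":
    # Hypothetical outage start times, only to demonstrate the call.
    outages = [time(12, 33), time(17, 4), time(3, 15)]
    for t in outages:
        print(t.strftime("%H:%M"), "->", nearest_boundary_gap(t), "min from a boundary")
    print("near a boundary:", [t.strftime("%H:%M") for t in near_boundary(outages)])
```

If most real outage starts land within the tolerance, that would support the filter-schedule theory; if they are scattered, the correlation is probably coincidental.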
 
Shawn MooneyAuthor Commented:
Update: We went even further into firmware releases and installed 5.9.0.4-127o. We haven't had any random disconnects over the weekend so we'll see how this all pans out throughout the week.
 
carlmdCommented:
Did Dell suggest this as a solution?
 
Shawn MooneyAuthor Commented:
Yes, they're the ones that suggested the 5.9.0.4-127o firmware release. Apparently, 5.8.1.15-71o release wasn't addressing our issues fully. The firmware has been in place since 7/25 and X0 hasn't gone down thus far. If nothing falls face first by the end of the week, I'll deem that particular release as the solution.
 
gheistCommented:
I am very happy you found common language with DELL support.
 
Shawn MooneyAuthor Commented:
Update: So far so good. It's been a solid week since upgrading our firmware to 5.9.0.4-127o on our SonicWall NSA 2400 and there haven't been any disconnects. The X0 LAN interface seems to be holding up just fine. All the remote Citrix users, as well as the internal internet users, are happy now that they aren't losing their work spontaneously.
 
Shawn MooneyAuthor Commented:
Disconnects sourcing from the X0 interface have ceased since the firmware update.