I have a client with an HQ and 4 remote sites (Branch VPN) that is experiencing a seemingly random inability to route past the gateway.
HQ: SonicWALL TZ210 Network: 192.168.0.0/23
Site1: SonicWALL TZ215 Network: 192.168.56.0/24
Site2: SonicWALL TZ215 Network: 192.168.75.0/24
Site3: SonicWALL TZ215 Network: 192.168.100.0/24
Site4: SonicWALL TZ215 Network: 192.168.125.0/24
The symptoms of the problem are:
Seemingly random systems at HQ in the upper range of IPs (192.168.1.0-253) suddenly and intermittently cannot route past the gateway (192.168.1.254).
No systems in the lower range of the HQ network are affected (192.168.0.1-254).
No systems on any of the VPNs are affected.
What I've observed is that the problem always exists, but its impact/severity is greatly increased when one particular VPN (Site2) is enabled. When that VPN is up, several systems stop being able to route past the gateway within a couple of minutes. When it's down, fewer systems are affected, and usually only for a few minutes at a time.
What makes it stranger is that it's not always the same systems affected, but they are always within the same IP range. If it's 192.168.1.x, it is open season. Anything in the 192.168.0.x range on the LAN is not affected.
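To make the split concrete, here is a minimal sketch using Python's standard ipaddress module. It divides the HQ /23 into its two /24 halves and reports which half a given address falls in. The sample host addresses are hypothetical and only illustrate the pattern described above; the network and gateway are the ones from this post.

```python
import ipaddress

# HQ LAN and gateway as given in the post.
HQ_NET = ipaddress.ip_network("192.168.0.0/23")
GATEWAY = ipaddress.ip_address("192.168.1.254")

# The two /24 halves of the /23: 192.168.0.0/24 and 192.168.1.0/24.
LOWER_HALF, UPPER_HALF = HQ_NET.subnets(new_prefix=24)


def classify(host: str) -> str:
    """Return which half of the HQ /23 a host falls in, or 'outside'."""
    addr = ipaddress.ip_address(host)
    if addr in UPPER_HALF:
        return "upper"
    if addr in LOWER_HALF:
        return "lower"
    return "outside"


# Hypothetical sample addresses, not taken from the actual network.
for host in ("192.168.0.25", "192.168.1.25", "192.168.75.10"):
    print(host, "->", classify(host))

print("gateway", GATEWAY, "->", classify(str(GATEWAY)))
```

Note that the gateway itself sits in the upper half, the same /24 as every affected system.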
I should also add that we have 3 SonicPoints deployed at HQ and none of them are affected. I also ruled out a switch problem by enabling an interface on the SonicWALL and plugging a 'problem' system into it directly.
I suspect one of two things or a combination thereof:
1. An IP range conflict with Site2. Even though Site2 is now 192.168.75.0/24, it was previously (before the VPN) a 192.168.1.0/24 network. I have never been to this site physically, but the users claim there are 'a lot of boxes with blinky lights' connected to the LAN. I'm wondering if having NetBIOS enabled for the VPNs is somehow generating this conflict, and whether the gradual up and down (rolling outages) comes from routing tables being dynamically updated on the SonicWALL at HQ. I know this site has a wireless access point of some kind (it could be a foreign router), and it could be hooked up to the same LAN.
The only issue with this scenario is that while the problem is drastically reduced when this VPN is disabled, it's still present. One thought was that the other VPNs could be contributing to the issue (the same idea, with foreign network devices connected), just not to the same extent as this particular endpoint.
2. The SonicWALL TZ210 has memory/CPU issues causing corruption in the routing tables.
The issue I have with this is that the remote endpoints that apparently have no impact are actually larger networks with more devices connected. Memory/CPU and connection counts would conceivably be more affected by turning those VPNs off and on, but they're not.
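For what it's worth, the address arithmetic behind suspicion 1 can be checked on paper. The sketch below, again using Python's standard ipaddress module, compares each site's current network and Site2's old 192.168.1.0/24 range against the HQ /23. The variable names are my own; the networks are the ones listed above.

```python
import ipaddress

HQ = ipaddress.ip_network("192.168.0.0/23")

# Current remote networks, as listed at the top of the post.
SITES = {
    "Site1": ipaddress.ip_network("192.168.56.0/24"),
    "Site2": ipaddress.ip_network("192.168.75.0/24"),
    "Site3": ipaddress.ip_network("192.168.100.0/24"),
    "Site4": ipaddress.ip_network("192.168.125.0/24"),
}

# Site2's addressing before the VPN was set up, per the post.
LEGACY_SITE2 = ipaddress.ip_network("192.168.1.0/24")

for name, net in SITES.items():
    print(f"{name} {net} overlaps HQ: {net.overlaps(HQ)}")

print(f"Legacy Site2 {LEGACY_SITE2} overlaps HQ: {LEGACY_SITE2.overlaps(HQ)}")
print(f"Legacy Site2 is entirely inside HQ: {LEGACY_SITE2.subnet_of(HQ)}")
```

None of the current site networks collide with HQ, but the old Site2 range lies entirely inside the upper half of the HQ /23, which is exactly where the affected systems are. That only shows the ranges collide on paper; it doesn't prove that a leftover device at Site2 is actually the cause.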
Anyway, I'm getting to the point of doing a factory reset and recreating the entire configuration, but I want to avoid that at all costs because it would mean downtime (and a considerable chunk of my time) with no guarantee of success.
Anyone have any ideas?