ARP Storm Taking Down Default Gateway

We have been experiencing a problem in our local network where the default gateway is being taken down due to what appears to be an ARP storm.

Originally, the default gateway was set to a Cisco 2851 router that routed traffic between several VLANs and had its default route set to our Cisco ASA router. Both the 2851 and the ASA were connected to a Cisco 2560 switch.

When the outage occurred, we lost all routing from the 2851, although we could still access it via Telnet. Clearing the ARP cache would instantly bring all functionality back. We saw a large number of ARP requests coming in (thousands per minute), and routing would go back down within about 15 minutes.

To test, we changed the default gateway (set by DHCP) to the ASA router. We experienced the same behaviour of ARP traffic and it would take down the internal interface of the ASA. Clearing ARP instantly brought all functionality back.

We also tried setting up a temporary internet gateway using a Cradlepoint router hooked to a Verizon aircard. It was connected through an intermediate HP switch that was connected to the 2560 switch. After an hour or so, the Cradlepoint was overwhelmed and also went down.

A little more information: We experienced this behaviour two days in a row. Communication inside the same subnets worked fine. Routing would go down around 9:30 AM each day, and everything would settle down and become stable around 4:30 PM.

We think the problem originates from a laptop: it only starts when the employee arrives at work and stops when the employee leaves with the laptop.

Is there any other likely cause of this problem? If it is a laptop, what is the best way to handle it? We can wait until it starts happening again on Monday and disconnect switches and ports until we identify the culprit. However, I'd like to prevent any more downtime.

Thanks in advance.
Asked by: HunterIT
mcsween (Sr. Network Administrator) commented:
I would enable storm control on all access ports on all switches. When the culprit starts flooding, the switch should shut down the port. Either the user will call to complain that they lost connectivity, or you can check the switches to see which port has been shut down and is showing as err-disabled (show interface status).

To enable storm control, add the following to every access interface. You can use the interface range command to configure several ports at once. Do not set this on your trunks.
storm-control broadcast level 20.00 5.00
storm-control multicast level 50.00 30.00
storm-control action shutdown
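
As a rough sketch of the interface range approach, assuming a 24-port switch whose access ports are FastEthernet0/1 through 0/24 (a hypothetical range; substitute your own port names and leave the trunks out), the session could look like this:

```
configure terminal
 ! Hypothetical access-port range; exclude trunk and uplink ports
 interface range FastEthernet0/1 - 24
  storm-control broadcast level 20.00 5.00
  storm-control multicast level 50.00 30.00
  storm-control action shutdown
 end
! Once the storm starts, check which port tripped
show storm-control broadcast
show interfaces status err-disabled
```

A port shut down this way goes into the err-disabled state and stays down until it is re-enabled with shutdown followed by no shutdown, unless errdisable recovery has been configured.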


More information on storm control here
http://www.cisco.com/en/US/docs/switches/lan/catalyst2950/software/release/12.1_22ea/SCG/swtrafc.html#wp1229873
 
rauenpc commented:
Dynamic ARP inspection combined with DHCP snooping will mitigate this scenario, although it can be a lot of work to implement depending on your environment. The bottom line is that you need to find the source of the ARPs and remove it. I would guess a faulty NIC or, more likely, a virus.
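
For reference, a minimal sketch of that combination on a Catalyst access switch, assuming the user VLAN is 10 and the uplink toward the DHCP server is GigabitEthernet0/1 (both hypothetical; whether these features are available depends on the platform and software image):

```
configure terminal
 ! Build the DHCP binding table that ARP inspection validates against
 ip dhcp snooping
 ip dhcp snooping vlan 10
 ! Validate ARP packets on the same VLAN against those bindings
 ip arp inspection vlan 10
 ! Hypothetical uplink toward the DHCP server and other switches
 interface GigabitEthernet0/1
  ip dhcp snooping trust
  ip arp inspection trust
 end
! Verify the bindings and see how many ARP packets were dropped
show ip dhcp snooping binding
show ip arp inspection statistics vlan 10
```

Two things to watch for: hosts with static addresses have no snooping binding, so they need an ARP ACL or a static binding before inspection is turned on, and clients that obtained their lease before snooping was enabled will not appear in the table until they renew.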