fallonsupport

ARP Flood Problem

I have customers using PPPoE to connect to our network. For the past few days there have been instances where they are unable to contact the PPPoE server (Redback SMS 1800) and/or get very slow performance all around. Ping times to a local box are over 1400 ms. The distribution router these customers route through had originally peaked at 98 percent CPU usage. After applying filters to block port 135, ICMP, etc. (the new viruses), the CPU usage is at an acceptable level.

The PPPoE customers' computers appear to be getting an IP address in the 169.254.0.0 range (which they would until they connect with their Winpoet software). What I am seeing with a sniffer is an IP address, for instance 169.254.25.155, scanning with ARP requests sequentially through the entire 169.254.0.0/16 network space. After it gets through the entire block, I don't see any more ARP requests from that IP. I suspect there may be a vicious circle. Our vendor's engineers believe we have a broadcast problem in our DSL Superline DSLAMs, and that this is the reason the machines cannot contact the Redback for a valid IP address. Is it because of this ARP traffic?

While this is going on, ARP accounts for around 50 to 75 percent of total traffic; otherwise it is nominal.
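
For anyone trying to confirm the same pattern from a capture, here is a minimal Python sketch of how the sequential sweep could be flagged. It assumes the ARP requests have already been exported from the sniffer as (sender IP, target IP) pairs in capture order. The function name `find_arp_sweepers` and the `min_run` threshold are hypothetical, chosen for illustration only.

```python
import ipaddress

# Link-local (APIPA) block that the sweeping hosts are walking through.
APIPA = ipaddress.ip_network("169.254.0.0/16")


def find_arp_sweepers(arp_requests, min_run=50):
    """Return {sender: longest run of consecutive APIPA targets}.

    arp_requests: iterable of (sender_ip, target_ip) strings, in capture order.
    min_run: hypothetical threshold; shorter runs are treated as normal traffic.
    """
    last_target = {}  # sender -> last target address seen, as an integer
    current_run = {}  # sender -> length of the run currently in progress
    longest_run = {}  # sender -> longest run observed so far

    for sender, target in arp_requests:
        target_addr = ipaddress.ip_address(target)
        if target_addr not in APIPA:
            continue  # only requests aimed at 169.254.0.0/16 are of interest
        target_int = int(target_addr)
        previous = last_target.get(sender)
        if previous is not None and target_int == previous + 1:
            current_run[sender] += 1  # next address in sequence
        else:
            current_run[sender] = 1  # sequence broken, start a new run
        last_target[sender] = target_int
        longest_run[sender] = max(longest_run.get(sender, 0), current_run[sender])

    return {s: n for s, n in longest_run.items() if n >= min_run}
```

A host that walks the block in order shows up with a long run, while hosts that only resolve a few scattered neighbors stay below the threshold.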

Any ideas?
ASKER CERTIFIED SOLUTION
Les Moore
fallonsupport

ASKER

I've already used access lists to combat those viruses. That was what I initially thought the problem was. On one router we saw 350 million matches on the ICMP block in 3 days. Both our edge and distribution routers are performing at an acceptable level as far as load etc. Other customers who connect via ADSL and other flavors of DSL ride the same infrastructure upstream of the particular equipment having problems, and they don't have any issues, so it doesn't appear to be a network-wide problem.
>On one router in 3 days we saw 350 million matches to the icmp block.
That is proof positive that there are systems infected with these worms. Trust me on that one.
Identify the offending IP addresses one-by-one and get them cleaned...

We know that there are machines with virus problems. We are getting them cleaned as we can contact customers (ISP operation here), but we have effectively blocked the effects of the ping problem via access lists. Router CPU usage is averaging 35 percent on one router and 50 to 60 percent on the other. Some customers have no problems with connections and can access the internet with no trouble; speeds are outstanding. Only customers on this particular equipment (Lucent Superline DSLAMs) are having problems. What we have seen via sniffer is what I mentioned above: a machine will have a 169.254.x.x address and that machine is ARPing the entire 169.254.0.0/16 network. This produces a large share of ARP traffic, percentage-wise. I can turn the customer's circuit off and the ARPs from that machine naturally stop.

So on a network-wide basis there isn't a congestion problem, in my opinion. It's just these particular boxes. The only evidence I see of something amiss is the above ARP behavior.
Any machine with an address in the 169.254.x.x range is using Microsoft's default APIPA range, which means it is not getting an IP address via DHCP. Where is their DHCP server in relation to the users?

http://support.microsoft.com/support/kb/articles/Q220/8/74.ASP&NoWebContent=1
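
As a quick illustration of the APIPA check, a minimal Python sketch using the standard `ipaddress` module follows. The helper name `is_apipa` and the sample addresses are hypothetical (only 169.254.25.155 comes from this thread).

```python
import ipaddress

# Microsoft's APIPA range is the IPv4 link-local block.
APIPA = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(addr):
    """True if addr (a dotted-quad string) falls inside 169.254.0.0/16."""
    return ipaddress.ip_address(addr) in APIPA


# Hypothetical sample: one address seen on the sniffer, one ordinary address.
for addr in ["169.254.25.155", "10.1.2.3"]:
    status = "self-assigned (no lease)" if is_apipa(addr) else "not APIPA"
    print(addr, status)
```

A host showing up in the first group has configured itself rather than receiving an address from a server.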
They receive their IP address from the Redback SMS 1800. They connect with software called Winpoet and receive a pool address from the SMS. They traverse a switch that is directly connected to a Fast Ethernet port on the Redback.


DSLAM ------- Switch ------- SMS 1800 ------- Default route out second fast port ------- 3640 router ------- 7206 router ------- Internet

lrmoore

The problem may be an offshoot of the Blaster virus. We believe that infected machines may be spending so much effort trying to find other machines to infect that they are unable to contact the SMS for a valid IP address. We went to a customer's premises this morning and they did in fact have the virus. Seem plausible?
Absolutely plausible. The denial of service effects of the virus are keeping those workstations from getting IP addresses.
What kind of switch? Is it a Cisco? Is portfast enabled?
The switch is an HP 2512. We are working up a CD with a system cleaner and no-brainer operation, and advising those who cannot connect to come get it and clean up their systems. I'll let you know how things come out, but I believe you have the answer.
I have had a lot of port scans going through the full range. I did a WHOIS lookup and got http://www.akamai.com/. According to the web site, "Akamai tracks worldwide spread of Blaster worm on the Internet", yet if the scans are coming from them they are causing me more problems than the virus!
Are you still working on this? Can you close out this question?

Thanks!