Solved

Intel Switch "Out of Pools" error

Posted on 2006-05-01
Last Modified: 2008-02-01
Hi,

We have some ancient Intel 510T switches that used to work great but are now having some problems.  The whole switch will randomly start dropping packets, and at the same time it reports a "Receive Discards - Out of Pools" error on only one particular port (one of the busiest ports, because the load balancer is plugged in there).  The switch's manual says these errors "shows there are no memory pools left because there are so many frames stored."  The next line says: "Significance: The switch tries to cause collisions to increase the number of frames rejected; this gives the pools time to empty."

Since the timing of the packet loss corresponds both to our load balancer log (which shows it getting disconnected from the firewall, likely as a result of packet loss) and to the switch increasing its "Out of Pools Receive Discards" count, I presume they are related... However, what does that error really mean, what may have caused it, and how do I fix it?  Does it mean I am reaching the limit of this switch and will have to upgrade?  Would changing the switch mode from "cut-through" to "store and forward" help?  I don't know whether "out of pools" is standard networking-speak with common solutions out there... especially since the manual's explanation sounded kind of vague to me.  Any advice will be greatly appreciated.

By the way, the port in question normally only puts out ~120 total packets per second, and when the error occurs the switch is usually not under load.  We have load-tested it to about 1800-2200 packets per second doing several large downloads at the same time, so it would seem to me that the problem is not throughput related... I mean, 120 packets per second should be tofu for these switches, no?
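As a rough sanity check on the numbers above, here is a small Python sketch of the worst-case link utilization. It assumes every packet is a maximum-size 1518-byte Ethernet frame and a 100 Mbps port; the function name and constants are made up for illustration, and preamble and inter-frame gap are ignored.

```python
# Worst-case utilization estimate, assuming every packet is a
# maximum-size standard Ethernet frame (1518 bytes). Preamble and
# inter-frame gap are ignored, so this is only a rough upper bound.
MAX_FRAME_BYTES = 1518
LINK_BPS = 100_000_000  # assumed 100 Mbps port


def worst_case_utilization(pps, frame_bytes=MAX_FRAME_BYTES, link_bps=LINK_BPS):
    """Return the fraction of the link used at `pps` packets per second."""
    return pps * frame_bytes * 8 / link_bps


if __name__ == "__main__":
    # 120 pps is the normal rate on the port; 2200 pps is the load-test peak.
    for pps in (120, 2200):
        print(f"{pps} pps -> {worst_case_utilization(pps):.2%} of the link")
```

Even under these pessimistic assumptions, 120 packets per second stays under 2% of a 100 Mbps port, which is consistent with the suspicion that raw throughput is not the cause.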

Thx.



Wallace
Question by:WallaceLau
7 Comments
 
LVL 27

Expert Comment

by:pseudocyber
ID: 16582354
Any chance of being able to call support on the switch?  What about known bugs and/or firmware upgrades?

Sounds like it might be some kind of DoS attack or something.  How about throwing a sniffer/packet capture on it to see what the traffic is?

From the sound of it, store and forward might actually make things worse.  Although, I suppose it's something to try.

I thought I heard somewhere that Intel actually didn't make switches - that they're rebranded something else.  I'm not sure what they might be.  
 
LVL 18

Expert Comment

by:Sam Panwar
ID: 16582362
Hi,

The 510T switch has an internal bandwidth of 800MB. This means that if you load the switch with 15x100MB, the internal bus will be overloaded (more bandwidth is forwarded into the switch than the switch can handle). The Out of Pools counter only increments if, for example, you are transmitting 100MB to a 10MB destination port. In that case the internal bus is not overloaded, but the pool buffer fills up. Note: this only happens if flow control is disabled.
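The fast-port-to-slow-port case can be sketched with a toy simulation. This is only an illustration, not a model of the 510T's real buffering: the 512 KB pool size, the time step, and the function name are all assumptions, and flow control is assumed to be off.

```python
def simulate_pool(ingress_bps, egress_bps, pool_bytes, duration_s, step_s=0.001):
    """Toy model of a shared buffer pool with no flow control.

    Returns (bytes still queued, bytes discarded) after `duration_s` seconds.
    """
    queued = 0.0
    discarded = 0.0
    for _ in range(int(round(duration_s / step_s))):
        # Frames arrive from the fast port.
        queued += ingress_bps / 8 * step_s
        # The slow port drains what it can in this time step.
        queued -= min(queued, egress_bps / 8 * step_s)
        # Anything that does not fit in the pool is discarded.
        if queued > pool_bytes:
            discarded += queued - pool_bytes
            queued = pool_bytes
    return queued, discarded


if __name__ == "__main__":
    POOL_BYTES = 512 * 1024  # hypothetical pool size, not taken from the 510T manual
    queued, discarded = simulate_pool(100_000_000, 10_000_000, POOL_BYTES, 1.0)
    print(f"queued={queued:.0f} bytes, discarded={discarded:.0f} bytes")
```

With a sustained 100-to-10 mismatch the pool fills within a fraction of a second and every further frame is dropped, while the reverse direction never queues anything. It says nothing about what happens when flow control is enabled.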

http://www.intel.com/support/express/switches/10/sb/cs-014375.htm

customer support
see http://support.intel.com/support/9089.htm 
 

Author Comment

by:WallaceLau
ID: 16584649
Thanks for the quick responses.

pseudocyber:  We have not ruled out DoS either. In fact, on April 2nd we were under a fairly massive DDoS attack and our co-lo facility turned off ICMP on all of our IP addresses.  The packet drop issue started happening about a week or two after the attack.  I also agree that store-and-forward might be worse, as it will likely require more memory ("Out of Pools" sounds like an out-of-memory error).  However, the switch is so old that I don't even know if Intel still supports it.  I do know the latest firmware was released years ago (the switches all have the latest firmware already).  Regarding sniffers, I don't even know if the switch supports forwarding all traffic to a specific port to be sniffed... at least I haven't found out how yet.  Since we won't install any software (including sniffers) on production equipment, I can't sniff locally on the load balancer either.  The only alternative is to schedule downtime, unplug something, and route it through a dumb network hub so I can plug a sniffer in there...


Abs_jaipur,

Thanks for the technical notes and Intel support phone numbers. If I can't resolve it here I will give them a call (again, not sure if they still support it).  Regarding bandwidth, the switch is only used to serve web traffic and we only pay for 1 Mbps (one Mbps) of bandwidth (although it is burstable to 100 Mbps).  Since all machines plugged into it are servers which do not generate traffic by themselves (all traffic is incoming requests), I don't think it is bandwidth related.  As I said, the packet count during the "packet loss" error period is only about 130 packets per second on the questionable port, and the rest of the ports have even less traffic.  The switch is also not showing oversized packets on its error counter.  Also, flow control is enabled on all ports, and they are negotiated at 100 Mbps full-duplex.  Strange?

At this point I think the question is, can some kind of DoS attack cause the switch to behave like that?

Thx.



Wallace
 
LVL 18

Assisted Solution

by:Sam Panwar
Sam Panwar earned 500 total points
ID: 16592239
Hi WallaceLau,

Yes, a DoS attack may be causing your problem. The best approach is to contact Intel, because their support is very good and they should easily understand your problem.

Have a nice day.
 

Author Comment

by:WallaceLau
ID: 16596451
Turns out it may have very little to do with DoS... I brought in a little hub and moved all the public connections off the switch, so there is now nothing plugged into VLAN #1 (public side).  VLAN #2 remains unchanged (Web DMZ).  The switch is still going nuts once in a while, reporting the "Out of Pools" error when it is only serving internal traffic.  So the theory of someone sending a malformed or oversized packet to mess with the switch no longer applies.  (Unless those packets got through the firewall and still got forwarded to the switch... I can't imagine the PIX didn't catch them though.)

I guess I needed to start sniffing the port that is reporting error...



W.
 
LVL 27

Accepted Solution

by:
pseudocyber earned 500 total points
ID: 16597350
>>I guess I needed to start sniffing the port that is reporting error...

Sounds like you need to start getting quotes on a new switch ... :)
 

Author Comment

by:WallaceLau
ID: 16809810
We still have no idea what is going on, but the problem seems to have gone away after we replaced the Intel switch with a Cisco 2924XL.  Oh well.



W.