tkalchev asked:

Routing problem

Hi everybody,

I have the following network configuration :

1. Cisco router, which is exporting for me the following networks :
212.95.120.16/255.255.255.240 over 212.95.120.3
212.95.120.192/255.255.255.192 over 212.95.120.4
212.95.120.32/255.255.255.224 over 212.95.120.5
The IP address of the router is 212.95.120.1, and the network is 212.95.120.0/255.255.255.240

I need to protect these 3 networks, so I carved a smaller subnet out of each of them as follows:
212.95.120.16/255.255.255.240 -> 212.95.120.16/255.255.255.248
212.95.120.192/255.255.255.192 -> 212.95.120.192/255.255.255.224
212.95.120.32/255.255.255.224 -> 212.95.120.32/255.255.255.240
2. A RedHat 7.2 firewall with 6 NICs (I am using 4 of them for the routing) :
eth1 - 212.95.120.5 with netmask 255.255.255.240
eth1:0 - 212.95.120.3 with netmask 255.255.255.240
eth1:1 - 212.95.120.4 with netmask 255.255.255.240
eth3 - 212.95.120.17/255.255.255.248
eth4 - 212.95.120.193/255.255.255.224
eth5 - 212.95.120.33/255.255.255.240
So eth1, eth1:0 and eth1:1 are connected to the Cisco router and receive all the traffic for the exported networks.
eth3 is the gateway for the first network, eth4 for the second and eth5 for the third.
At the moment I have 2 database servers connected to the first network, with IP addresses 212.95.120.18 and 212.95.120.19 and netmask 255.255.255.248.
The routing behaves strangely: it is normally very slow, but if I ping some of the servers it becomes normal. What is my mistake?
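
The subnet split described in the question can be sanity-checked with Python's standard `ipaddress` module. This is only a sketch: the names `PAIRS` and `check` are made up for the illustration, and the addresses are copied from the post above.

```python
import ipaddress

# (network exported by the router, smaller subnet placed behind the firewall)
PAIRS = [
    ("212.95.120.16/255.255.255.240", "212.95.120.16/255.255.255.248"),
    ("212.95.120.192/255.255.255.192", "212.95.120.192/255.255.255.224"),
    ("212.95.120.32/255.255.255.224", "212.95.120.32/255.255.255.240"),
]


def check(pairs):
    """Return one summary dict per (exported, protected) pair."""
    report = []
    for exported_s, protected_s in pairs:
        exported = ipaddress.ip_network(exported_s)
        protected = ipaddress.ip_network(protected_s)
        hosts = list(protected.hosts())  # usable addresses, without network/broadcast
        report.append({
            "exported": str(exported),
            "protected": str(protected),
            "inside": protected.subnet_of(exported),
            "first_host": str(hosts[0]),
            "last_host": str(hosts[-1]),
        })
    return report


for row in check(PAIRS):
    print(row)
```

For the first network this reports 212.95.120.16/29 with usable hosts .17 to .22, so the gateway address .17 and the two database servers .18 and .19 all fit inside it.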
tkalchev (ASKER):

Here is the routing table on the firewall (for now only the 1st network is active):

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
212.95.120.16   *               255.255.255.248 U     0      0        0 eth3
212.95.120.16   212.95.120.17   255.255.255.248 UG    0      0        0 eth3
192.168.1.0     *               255.255.255.240 U     0      0        0 eth0
212.95.120.0    *               255.255.255.240 U     0      0        0 eth1
212.95.120.32   *               255.255.255.240 U     0      0        0 eth5
212.95.120.192  212.95.120.193  255.255.255.224 UG    0      0        0 eth4
212.95.120.192  *               255.255.255.224 U     0      0        0 eth4
192.168.0.0     *               255.255.255.0   U     0      0        0 eth2
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         212.95.120.1    0.0.0.0         UG    0      0        0 eth1
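
To see which entry of a table like this handles a given destination, a longest-prefix match can be sketched in a few lines of Python. The routes below are transcribed from the listing above into CIDR form; `ROUTES` and `lookup` are hypothetical names for this illustration, and the real kernel lookup also considers metrics and other details that are ignored here.

```python
import ipaddress

# (destination network, gateway or None for directly connected, interface)
ROUTES = [
    ("212.95.120.16/29", None, "eth3"),
    ("212.95.120.16/29", "212.95.120.17", "eth3"),
    ("192.168.1.0/28", None, "eth0"),
    ("212.95.120.0/28", None, "eth1"),
    ("212.95.120.32/28", None, "eth5"),
    ("212.95.120.192/27", "212.95.120.193", "eth4"),
    ("212.95.120.192/27", None, "eth4"),
    ("192.168.0.0/24", None, "eth2"),
    ("127.0.0.0/8", None, "lo"),
    ("0.0.0.0/0", "212.95.120.1", "eth1"),
]


def lookup(dst):
    """Return (network, gateway, interface) of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    best = None
    for net_s, gateway, iface in ROUTES:
        net = ipaddress.ip_network(net_s)
        # A strictly longer prefix wins; on a tie the earlier entry is kept.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, iface)
    return best


print(lookup("212.95.120.18"))   # one of the database servers
print(lookup("217.80.207.130"))  # an outside host, falls through to the default route
```

Note that the table lists 212.95.120.16/29 and 212.95.120.192/27 twice each, once directly connected and once via the firewall's own address on that interface.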
And these are my firewall settings:

:forward ACCEPT
:output ACCEPT

-A input -s 0/0 -d 0/0 10000 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 0/0 -i lo -j ACCEPT
-A input -s 0/0 -d 0/0 -i eth2 -j ACCEPT
-A input -s 141.1.1.12 53 -d 0/0 -p udp -j ACCEPT
-A input -s 217.5.115.7 53 -d 0/0 -p udp -j ACCEPT
-A input -s 194.25.2.129 53 -d 0/0 -p udp -j ACCEPT

-A input -s 0/0 -d 212.95.120.16/29 3050 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.16/29 80 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.16/29 443 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.16/29 21 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.16/29 22 -p tcp -y -j ACCEPT
-A input -s 212.95.120.16/29 -d 0/0 -p tcp -j ACCEPT
-A input -s 212.95.120.16/29 -d 0/0 -p udp -j ACCEPT

-A input -s 0/0 -d 212.95.120.192/27 80 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.192/27 443 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.192/27 21 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.192/27 22 -p tcp -y -j ACCEPT
-A input -s 212.95.120.192/27 -d 0/0 -p tcp -j ACCEPT
-A input -s 212.95.120.192/27 -d 0/0 -p udp -j ACCEPT

-A input -s 0/0 -d 212.95.120.32/28 80 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.32/28 443 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.32/28 21 -p tcp -y -j ACCEPT
-A input -s 0/0 -d 212.95.120.32/28 22 -p tcp -y -j ACCEPT
-A input -s 212.95.120.32/28 -d 0/0 -p tcp -j ACCEPT
-A input -s 212.95.120.32/28 -d 0/0 -p udp -j ACCEPT


-A input -s 0/0 -d 0/0 -p tcp -y -j REJECT
-A input -s 0/0 -d 0/0 -p udp -j REJECT

-A forward -s 0/0 -d 212.95.120.16/29 80 -p tcp  -j ACCEPT
-A forward -s 0/0 -d 212.95.120.16/29 443 -p tcp -j ACCEPT
-A forward -s 0/0 -d 212.95.120.16/29 21 -p tcp -j ACCEPT
-A forward -s 0/0 -d 212.95.120.16/29 22 -p tcp -j ACCEPT
-A forward -s 212.95.120.16/29 -d 0/0 -p tcp -j ACCEPT
-A forward -s 212.95.120.16/29 -d 0/0 -p udp -j ACCEPT
The--Captain:

Having problems finding a problem w/ the above - are all your subnets on *physically* separate segments (different hubs/switches)?

BTW, I don't think you need the -y option if you're using iptables w/ conntrack.

Cheers,
-Jon
Yes, of course they are on separate segments :)

pjb1008:

Have you tcpdumped this yet? (Use -n to stop it doing DNS lookups during the dump.) That's what is most likely to reveal the problem.

What are you doing that demonstrates the slow behaviour?
Which machines are you pinging from and to?

Also, what are you doing with the other physical interfaces? There are configurable limits for the size of the arp table in /proc/sys/net/ipv4/neigh/default/gc_thresh[123]. Exceed the limits and it'll overflow - you'll start doing unnecessary arp requests to repopulate the table.

There's also a potential problem with ICMP redirects if you have a stub network on one side of your firewall and multiple gateways on the other. When a client in the stub network sends a packet to the default gateway, the gateway may send an ICMP redirect back to the client telling it to use a different gateway; the client updates its routing table, which is useless since what actually needed updating was the routing table on the firewall. When you ping from the firewall itself, the same happens, but this time the firewall's routing table is (correctly) updated. The improved performance this yields lasts until the redirect expires from the routing table. The solution is to install a routing daemon on the firewall, or to program it with a _complete_ static routing table, so that the redirects are never emitted in the first place.

The above are just stabs in the dark; until you monitor what's happening on the network, you are unlikely to spot the problem.
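
The ARP table limits mentioned above can be compared with the current number of entries using a short script. This is a rough sketch, assuming a Linux box where `/proc/net/arp` and the `gc_thresh` files are readable; the function names and the wording of the status labels are invented for the example.

```python
from pathlib import Path

THRESH_NAMES = ("gc_thresh1", "gc_thresh2", "gc_thresh3")


def count_arp_entries(arp_text):
    """Count entries in the text of /proc/net/arp (the first line is a header)."""
    lines = [line for line in arp_text.splitlines() if line.strip()]
    return max(len(lines) - 1, 0)


def classify(entries, thresholds):
    """Compare the entry count with the three neighbour-table thresholds."""
    t1, t2, t3 = (thresholds[name] for name in THRESH_NAMES)
    if entries >= t3:
        return "at or above hard limit"
    if entries >= t2:
        return "above soft limit"
    if entries >= t1:
        return "garbage collection may run"
    return "ok"


def read_status(proc_root="/proc"):
    """Read the live values; only meaningful on a Linux system."""
    root = Path(proc_root)
    neigh = root / "sys/net/ipv4/neigh/default"
    thresholds = {n: int((neigh / n).read_text()) for n in THRESH_NAMES}
    entries = count_arp_entries((root / "net/arp").read_text())
    return entries, thresholds, classify(entries, thresholds)
```

On the firewall itself, `print(read_status())` would show the entry count next to the three thresholds.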
To pjb1008:

I am not very familiar with tcpdump, but I ran it on one of the interfaces and, at least to me, it looks OK :) Maybe you could give me a hint about what I should look for. I can see that the packets are coming and going, but I don't know what I should watch.

I pinged from machines which are not in any of the above networks. The ping time to the router and to the firewall is quite OK, about 1 msec at most, but to any of the machines inside the protected networks it is at least 200 msecs !!!

The ping time from any machine in the protected network to the firewall is also very high, approx. 200-300 msecs. The same goes for pings from the firewall to any of them.

Ping time between any machine in one and the same segment is ok - less than 1 msec.

So, any ideas ?

Thanks in advance.


You are looking for anything anomalous that might indicate a cause of the problem. Things like arp requests more frequently than you would expect, or higher levels of traffic than seem plausible.

To pjb1008:

Here is a sample of tcpdump on the interface eth1, which is connecting the firewall to the router :


tcpdump: listening on eth1
13:10:05.014518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:05.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:05.884518 212.95.120.19.64536 > 212.227.126.167.pop3: S 4072127426:4072127426(0) win 32767 <mss 536,nop,wscale 0,nop,nop,sackOK> (DF)
13:10:05.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:05.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:05.964518 212.227.126.167.pop3 > 212.95.120.19.64536: S 597955472:597955472(0) ack 4072127427 win 5840 <mss 1460> (DF)
13:10:05.964518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:06.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:06.884518 212.95.120.19.64536 > 212.227.126.167.pop3: . ack 1 win 32767 (DF)
13:10:06.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:06.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:07.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:07.884518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:07.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:07.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:08.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:08.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:08.894518 212.95.120.19.62381 > 192.168.4.1.ntp:  v1 client strat 0 poll 0 prec 0
13:10:08.954518 212.95.98.62 > 212.95.120.19: icmp: net 192.168.4.1 unreachable
13:10:08.954518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:09.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:09.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:09.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:10.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:10.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:11.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:11.884518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:11.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:11.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:12.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:12.884518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:12.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:12.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:13.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:13.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:14.124518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:14.514518 62.47.2.220.3359 > 212.95.120.248.http: S 2039285923:2039285923(0) win 16384 <mss 1360,nop,nop,sackOK> (DF)
13:10:14.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:15.014518 0:8:a3:6c:e4:de 0:8:a3:6c:e4:de loopback 60:
                   0000 0100 0000 0000 0000 0000 0000 0000
                   0000 0000 0000 0000 0000 0000 0000 0000
                   0000 0000 0000 0000 0000 0000 0000
13:10:15.014518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:15.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:15.894518 212.95.120.19.62381 > 192.168.4.1.ntp:  v1 client strat 0 poll 0 prec 0
13:10:16.024518 212.227.126.167.pop3 > 212.95.120.19.64536: P 1:24(23) ack 1 win 5840 (DF)
13:10:16.024518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:16.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:16.884518 212.95.120.19.64536 > 212.227.126.167.pop3: . ack 24 win 32744 (DF)
13:10:16.884518 212.95.120.19.64536 > 212.227.126.167.pop3: P 1:22(21) ack 24 win 32744 (DF)
13:10:16.884518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:16.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:16.974518 212.227.126.167.pop3 > 212.95.120.19.64536: . ack 22 win 5840 (DF)
13:10:16.974518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:16.974518 212.227.126.167.pop3 > 212.95.120.19.64536: P 24:29(5) ack 22 win 5840 (DF)
13:10:17.474518 62.47.2.220.3359 > 212.95.120.248.http: S 2039285923:2039285923(0) win 16384 <mss 1360,nop,nop,sackOK> (DF)
13:10:17.474518 212.95.120.19.64536 > 212.227.126.167.pop3: P 22:36(14) ack 29 win 32739 (DF)
13:10:17.514518 212.95.120.5 > 62.47.2.220: icmp: host 212.95.120.248 unreachable [tos 0xc0]
13:10:17.514518 212.95.120.5 > 62.47.2.220: icmp: host 212.95.120.248 unreachable [tos 0xc0]
13:10:17.564518 212.227.126.167.pop3 > 212.95.120.19.64536: P 29:34(5) ack 36 win 5840 (DF)
13:10:17.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:17.884518 212.95.120.19.64536 > 212.227.126.167.pop3: P 36:42(6) ack 34 win 32734 (DF)
13:10:17.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:17.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:18.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:18.884518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:18.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:18.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:19.134518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:19.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:19.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:19.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:20.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:20.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:20.894518 212.95.120.19.64536 > 212.227.126.167.pop3: P 36:42(6) ack 34 win 32734 (DF)
13:10:20.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:20.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:20.984518 212.227.126.167.pop3 > 212.95.120.19.64536: . ack 42 win 5840 (DF)
13:10:20.984518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:21.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:21.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:21.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:22.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:22.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:22.894518 212.95.120.19.62381 > 192.168.4.1.ntp:  v1 client strat 0 poll 0 prec 0
13:10:22.954518 212.95.98.62 > 212.95.120.19: icmp: net 192.168.4.1 unreachable
13:10:22.954518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:23.494518 62.47.2.220.3359 > 212.95.120.248.http: S 2039285923:2039285923(0) win 16384 <mss 1360,nop,nop,sackOK> (DF)
13:10:23.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:23.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:23.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:24.144518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:24.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:24.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:24.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:25.014518 0:8:a3:6c:e4:de 0:8:a3:6c:e4:de loopback 60:
                   0000 0100 0000 0000 0000 0000 0000 0000
                   0000 0000 0000 0000 0000 0000 0000 0000
                   0000 0000 0000 0000 0000 0000 0000
13:10:25.014518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:25.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:25.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:25.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:26.494518 212.95.120.5 > 62.47.2.220: icmp: host 212.95.120.248 unreachable [tos 0xc0]
13:10:26.494518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:26.584518 212.227.126.167.pop3 > 212.95.120.19.64536: P 34:43(9) ack 42 win 5840 (DF)
13:10:26.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:26.884518 212.95.120.19.64536 > 212.227.126.167.pop3: P 42:48(6) ack 43 win 32725 (DF)
13:10:26.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:26.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:26.974518 212.227.126.167.pop3 > 212.95.120.19.64536: . ack 48 win 5840 (DF)
13:10:26.974518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:26.974518 212.227.126.167.pop3 > 212.95.120.19.64536: P 43:48(5) ack 48 win 5840 (DF)
13:10:27.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:27.894518 212.95.120.19.64536 > 212.227.126.167.pop3: F 48:48(0) ack 48 win 32720 (DF)
13:10:27.894518 212.95.120.19.64537 > 212.227.126.167.pop3: S 4077667732:4077667732(0) win 32767 <mss 536,nop,wscale 0,nop,nop,sackOK> (DF)
13:10:27.964518 212.227.126.167.pop3 > 212.95.120.19.64536: . ack 49 win 5840 (DF)
13:10:27.964518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:27.974518 212.227.126.167.pop3 > 212.95.120.19.64537: S 633384368:633384368(0) ack 4077667733 win 5840 <mss 1460> (DF)
13:10:28.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:28.884518 212.95.120.19.64537 > 212.227.126.167.pop3: . ack 1 win 32767 (DF)
13:10:28.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:28.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:29.014518 212.227.126.167.pop3 > 212.95.120.19.64537: P 1:24(23) ack 1 win 5840 (DF)
13:10:29.014518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:29.144518 212.95.120.19.64537 > 212.227.126.167.pop3: P 1:22(21) ack 24 win 32744 (DF)
13:10:29.234518 212.227.126.167.pop3 > 212.95.120.19.64537: . ack 22 win 5840 (DF)
13:10:29.234518 212.227.126.167.pop3 > 212.95.120.19.64537: P 24:29(5) ack 22 win 5840 (DF)
13:10:29.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:29.894518 212.95.120.19.64537 > 212.227.126.167.pop3: P 22:36(14) ack 29 win 32739 (DF)
13:10:29.894518 212.95.120.19.62381 > 192.168.4.1.ntp:  v1 client strat 0 poll 0 prec 0
13:10:29.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:29.954518 212.95.98.62 > 212.95.120.19: icmp: net 192.168.4.1 unreachable
13:10:30.024518 212.227.126.167.pop3 > 212.95.120.19.64537: . ack 36 win 5840 (DF)
13:10:30.114518 212.227.126.167.pop3 > 212.95.120.19.64537: P 29:34(5) ack 36 win 5840 (DF)
13:10:30.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:30.884518 212.95.120.19.64537 > 212.227.126.167.pop3: P 36:42(6) ack 34 win 32734 (DF)
13:10:30.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:30.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:30.984518 212.227.126.167.pop3 > 212.95.120.19.64537: . ack 42 win 5840 (DF)
13:10:30.984518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:31.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:31.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:31.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:32.824518 212.227.126.167.pop3 > 212.95.120.19.64537: P 34:43(9) ack 42 win 5840 (DF)
13:10:32.824518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:32.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:32.884518 212.95.120.19.64537 > 212.227.126.167.pop3: P 42:48(6) ack 43 win 32725 (DF)
13:10:32.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:32.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:32.974518 212.227.126.167.pop3 > 212.95.120.19.64537: P 43:48(5) ack 48 win 5840 (DF)
13:10:32.974518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:32.984518 212.227.126.167.pop3 > 212.95.120.19.64537: F 48:48(0) ack 48 win 5840 (DF)
13:10:32.984518 212.95.120.19.64537 > 212.227.126.167.pop3: F 48:48(0) ack 48 win 32720 (DF)
13:10:33.084518 212.227.126.167.pop3 > 212.95.120.19.64537: . ack 49 win 5840 (DF)
13:10:33.084518 212.95.120.19.64537 > 212.227.126.167.pop3: . ack 49 win 32720 (DF)
13:10:33.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:33.894518 212.95.120.19.64535 > 194.25.134.26.pop3: S 4079112795:4079112795(0) win 32767 <mss 536,nop,wscale 0,nop,nop,sackOK> (DF)
13:10:34.154518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:34.884518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:34.894518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:34.894518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:34.964518 212.95.120.5.32770 > 141.1.1.12.domain:  35623+ A? www.rhns.redhat.com. (37) (DF)
13:10:34.964518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:35.014518 0:8:a3:6c:e4:de 0:8:a3:6c:e4:de loopback 60:
                   0000 0100 0000 0000 0000 0000 0000 0000
                   0000 0000 0000 0000 0000 0000 0000 0000
                   0000 0000 0000 0000 0000 0000 0000
13:10:35.034518 141.1.1.12.domain > 212.95.120.5.32770:  35623 7/3/3 CNAME[|domain]
13:10:35.044518 212.95.120.5.33426 > 216.148.218.170.https: S 605477685:605477685(0) win 5840 <mss 1460,sackOK,timestamp 61787192 0,nop,wscale 0> (DF)
13:10:35.254518 216.148.218.170.https > 212.95.120.5.33426: S 626748777:626748777(0) ack 605477686 win 32120 <mss 1460,nop,wscale 0> (DF)
13:10:35.254518 212.95.120.5.33426 > 216.148.218.170.https: . ack 1 win 5840 (DF)
13:10:35.254518 212.95.120.5.33426 > 216.148.218.170.https: P 1:125(124) ack 1 win 5840 (DF)
13:10:35.564518 216.148.218.170.https > 212.95.120.5.33426: P 1:1461(1460) ack 125 win 32120 (DF)
13:10:35.564518 212.95.120.19.63164 > 216.136.233.138.http: P 3437986477:3437986517(40) ack 1152812932 win 16787 (DF)
13:10:35.564518 212.95.120.5.33426 > 216.148.218.170.https: . ack 1461 win 8760 (DF)
13:10:35.574518 216.148.218.170.https > 212.95.120.5.33426: P 1461:1616(155) ack 125 win 32120 (DF)
13:10:35.574518 212.95.120.5.33426 > 216.148.218.170.https: . ack 1616 win 8760 (DF)
13:10:35.854518 212.95.120.5.33426 > 216.148.218.170.https: P 125:315(190) ack 1616 win 8760 (DF)
13:10:35.884518 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:35.894518 217.80.207.130 > 212.95.120.18: icmp: echo request (DF)
13:10:35.894518 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:36.084518 216.148.218.170.https > 212.95.120.5.33426: . ack 315 win 31930 (DF)
13:10:36.084518 212.95.120.18 > 217.80.207.130: icmp: echo reply
13:10:36.114518 216.148.218.170.https > 212.95.120.5.33426: P 1616:1667(51) ack 315 win 32120 (DF)
13:10:36.114518 212.95.120.5.33426 > 216.148.218.170.https: . ack 1667 win 8760 (DF)
13:10:36.114518 212.95.120.5.33426 > 216.148.218.170.https: P 315:392(77) ack 1667 win 8760 (DF)
13:10:36.114518 212.95.120.5.33426 > 216.148.218.170.https: . 392:1852(1460) ack 1667 win 8760 (DF)

179 packets received by filter
0 packets dropped by kernel
The key to this is to look at the timestamps.
The timing resolution is obviously coarse: every timestamp ends in the same "4518", so the clock only advances in 10 ms steps. Ignore the trailing "518" in all the timestamps.

Notice what happens when you look at pings alone:

13:10:05.014 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:05.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:05.964 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:06.884 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:06.894 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:07.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:08.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:08.894 212.95.120.19 > 217.80.207.130: icmp: echo reply

Sometimes pings are just slow. Sometimes they take exactly 1 second. This is very significant: it shows that the timing of data coming in is highly dependent on data going out. In particular, the reply to the previous ping isn't noticed until the next ping is sent.
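
The same reading of the capture can be reproduced mechanically by pairing each echo request with the next echo reply from the same host. The sketch below uses the eight filtered lines quoted above; `ping_delays_ms` is a hypothetical helper, and it assumes replies come back in the order the requests were sent.

```python
import re
from collections import deque

CAPTURE = """\
13:10:05.014 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:05.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:05.964 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:06.884 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:06.894 212.95.120.19 > 217.80.207.130: icmp: echo reply
13:10:07.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:08.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:08.894 212.95.120.19 > 217.80.207.130: icmp: echo reply
"""

# Timestamp (milliseconds kept, further digits ignored), source, destination, type.
LINE = re.compile(
    r"^(\d+):(\d+):(\d+)\.(\d{3})\d*\s+(\S+) > (\S+): icmp: echo (request|reply)"
)


def ping_delays_ms(capture, target):
    """Pair each echo request to `target` with the next reply from it, oldest first."""
    pending = deque()  # send times of requests still waiting for a reply
    delays = []
    for line in capture.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        h, mi, s, ms = (int(x) for x in m.group(1, 2, 3, 4))
        t = ((h * 60 + mi) * 60 + s) * 1000 + ms
        src, dst, kind = m.group(5), m.group(6), m.group(7)
        if kind == "request" and dst == target:
            pending.append(t)
        elif kind == "reply" and src == target and pending:
            delays.append(t - pending.popleft())
    return delays


print(ping_delays_ms(CAPTURE, "212.95.120.19"))  # [70, 10, 1000]
```

The first reply in the excerpt has no matching request and is skipped. The 1000 ms entry is the request sent at 13:10:07.894, whose reply only shows up when the next request goes out a second later.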

Now look at the timing of the pings that were only slightly delayed:

13:10:05.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:05.964 212.227.126.167.pop3 > 212.95.120.19.64536: S 597955472:597955472(0) ack 4072127427 win 5840 <mss 1460> (DF)
13:10:05.964 212.95.120.19 > 217.80.207.130: icmp: echo reply

So the reply to this ping was received at _exactly_ the same time as some other unrelated packet was sent. Look again:

13:10:06.884 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:06.884 212.95.120.19.64536 > 212.227.126.167.pop3: . ack 1 win 32767 (DF)
13:10:06.894 212.95.120.19 > 217.80.207.130: icmp: echo reply

The reply was received at _exactly_ the same time as some other unrelated packet was sent. And again:

13:10:16.894 217.80.207.130 > 212.95.120.19: icmp: echo request (DF)
13:10:16.974 212.227.126.167.pop3 > 212.95.120.19.64536: . ack 22 win 5840 (DF)
13:10:16.974 212.95.120.19 > 217.80.207.130: icmp: echo reply

Your network has the property that it performs better under high load than when idle!

An ethernet card issues an interrupt so that it can be serviced when:
a) there is data to be sent and transmit buffer is no longer full, or
b) the receive buffer is no longer empty

For some reason (b) is not happening - the card isn't read until it is serviced for some other reason.

For ISA cards, the most likely cause would be a wrong interrupt number specified as an argument to the driver.

This kind of problem isn't supposed to happen with PCI. It could be due to dirt on the contacts of the PCI slot (very unlikely), a faulty ethernet card, or a faulty device driver.

You have 6 ethernet cards. If you are using PCI on a motherboard with an XT-PIC, this is likely to cause IRQs to be shared. Some device drivers don't like that. They should tolerate it since PCI _requires_ that IRQs are shareable, but not all drivers are well written. Try removing some of the cards to avoid sharing.

Before spending any money replacing cards, upgrade to the latest device driver (i.e. upgrade the kernel). If that doesn't fix it, replace the ethernet card with one of a different type.
Yes, you are right. When there is some traffic on the network, the routing is quite OK, but if there are no other requests, only a single one to one of the machines in the protected network, the horror begins ...

That's why I have set up a permanent ping from one machine outside the protected networks to 2 of the machines inside, and this almost solves the problem. The network is now much faster than before, but it is still not at its best. On the router I have a 100 Mb/sec line, and when I try to access it through another line (2 Mbit) connected to the same provider, the best speed I can get is around 20-30 KB/sec.

I will try to change some of the network cards and see if there is a difference.

Thanks
By the way, here is the output of ifconfig; here you can see that the IRQs are not shared:

eth0      Link encap:Ethernet  HWaddr 00:80:C8:F6:79:6C
          inet addr:212.95.120.33  Bcast:212.95.120.63  Mask:255.255.255.224
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:293308 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1632507 errors:4 dropped:0 overruns:0 carrier:4
          collisions:5694 txqueuelen:100
          RX bytes:51330919 (48.9 Mb)  TX bytes:2218778891 (2115.9 Mb)
          Interrupt:5 Base address:0x2000

eth1      Link encap:Ethernet  HWaddr 00:80:C8:94:23:FB
          inet addr:212.95.120.5  Bcast:212.95.120.15  Mask:255.255.255.240
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2527646 errors:2 dropped:0 overruns:0 frame:2
          TX packets:2284806 errors:703 dropped:0 overruns:0 carrier:703
          collisions:1490 txqueuelen:100
          RX bytes:1121860128 (1069.8 Mb)  TX bytes:333803906 (318.3 Mb)
          Interrupt:11 Base address:0x4000

eth1:0    Link encap:Ethernet  HWaddr 00:80:C8:94:23:FB
          inet addr:212.95.120.3  Bcast:212.95.120.15  Mask:255.255.255.240
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x4000

eth1:1    Link encap:Ethernet  HWaddr 00:80:C8:94:23:FB
          inet addr:212.95.120.4  Bcast:212.95.120.15  Mask:255.255.255.240
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x4000

eth2      Link encap:Ethernet  HWaddr 00:00:D1:1E:C9:15
          inet addr:212.95.120.193  Bcast:212.95.120.255  Mask:255.255.255.192
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:28766 errors:1 dropped:0 overruns:0 frame:0
          TX packets:54093 errors:5 dropped:0 overruns:0 carrier:3
          collisions:0 txqueuelen:100
          RX bytes:3007002 (2.8 Mb)  TX bytes:2305470 (2.1 Mb)
          Interrupt:9 Base address:0x6000

eth3      Link encap:Ethernet  HWaddr 00:00:D1:1E:C9:16
          inet addr:212.95.120.17  Bcast:212.95.120.31  Mask:255.255.255.240
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3549038 errors:1 dropped:115 overruns:0 frame:0
          TX packets:2275589 errors:10 dropped:0 overruns:3 carrier:5
          collisions:0 txqueuelen:100
          RX bytes:2505197205 (2389.1 Mb)  TX bytes:510898724 (487.2 Mb)
          Interrupt:11 Base address:0x8000
I am not using the other 2 NICs at the moment, and they are not active.
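
Whether any IRQs are shared can be checked by grouping the physical interfaces by the `Interrupt:` value that ifconfig prints. The sketch below uses a trimmed copy of the listing above (only the name and interrupt lines); `irq_groups` and `shared_irqs` are names made up for the example, and alias interfaces such as eth1:0 are skipped because they sit on the same card as their parent.

```python
import re
from collections import defaultdict

IFCONFIG = """\
eth0      Link encap:Ethernet  HWaddr 00:80:C8:F6:79:6C
          Interrupt:5 Base address:0x2000
eth1      Link encap:Ethernet  HWaddr 00:80:C8:94:23:FB
          Interrupt:11 Base address:0x4000
eth1:0    Link encap:Ethernet  HWaddr 00:80:C8:94:23:FB
          Interrupt:11 Base address:0x4000
eth1:1    Link encap:Ethernet  HWaddr 00:80:C8:94:23:FB
          Interrupt:11 Base address:0x4000
eth2      Link encap:Ethernet  HWaddr 00:00:D1:1E:C9:15
          Interrupt:9 Base address:0x6000
eth3      Link encap:Ethernet  HWaddr 00:00:D1:1E:C9:16
          Interrupt:11 Base address:0x8000
"""


def irq_groups(text):
    """Map IRQ number -> list of physical interface names using it."""
    groups = defaultdict(list)
    current = None
    for line in text.splitlines():
        name = re.match(r"^(\S+)\s+Link encap", line)
        if name:
            current = name.group(1)
            continue
        irq = re.search(r"Interrupt:(\d+)", line)
        if irq and current and ":" not in current:  # ignore aliases like eth1:0
            groups[int(irq.group(1))].append(current)
    return dict(groups)


def shared_irqs(text):
    """Keep only the IRQs that more than one physical interface reports."""
    return {irq: names for irq, names in irq_groups(text).items() if len(names) > 1}


print(shared_irqs(IFCONFIG))  # {11: ['eth1', 'eth3']}
```

In this listing eth1 and eth3 both report interrupt 11, so at least one IRQ does appear to be shared between two cards.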
ASKER CERTIFIED SOLUTION: pjb1008

Thanks pjb, it seems that the problem was with the IRQs after all. I've removed all 6 NICs and bought 4 new ones, all of different types/manufacturers, and now it works fine :)