NoxBestia


Server losing WAN connection every 4 hours, requiring "clear arp" on Cisco 2611 router.

Greetings;

I have one of the strangest networking problems I have seen in my 20+ years of working with computers.  Beginning last Thursday night, every 4 hours (give or take about 15 minutes) my primary server loses communication with the internet.  When this happens, people on the outside can't even get a ping response from the server.
      The server is running Server 2003.
      The router is an old Cisco 2611 that has been in service for us for about 6 months.
      The router has a single host connection pointing to the server's IP address (10.6.6.1)

Here are some of the behaviors that I have noted:
      The server is accessible by everything on the LAN and vice versa.
      The server can ping the router.
      The server can NOT ping anything beyond the router.
      The router can ping anything on the WAN and on the LAN, including the server.
      The router can be reached from the WAN via its own IP address.
      If I clear the arp cache on the router, then after about 30-60 seconds everything returns to normal for the next four hours.

Here are just a few of the many steps I have taken to try to solve the problem:
      Changed the arp timeout from the default 4 hours to 30 minutes
      Physically removed the server from the LAN and put another server in its place, using the original server's IP address.  (It still failed after 4 hours.)
      Set up a syslog server and set the router to "logging history warnings", but it is not giving me any errors.
      The server is connected to a switch before it gets to the router, but directly plugging the server into the router did not stop the problem.

Unless someone can help me figure this out, the only options I see that I am left with are replacing the router or setting up the server to act as the router with RRAS.

Thanks in advance for any help you can give me.
Saineolai

Can you run a debug ip packet on the router when this occurs and determine if the router is passing incoming packets to the server and vice-versa?
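
For anyone unsure how to do this, here is a minimal sketch of one way to run the debug without flooding the console. It assumes an extended access list (105 here; any unused number from 100 to 199 would do) that matches only traffic to or from the server's inside address 10.6.6.1 and its static NAT address 205.158.120.222. Keep in mind that debug ip packet only reports process-switched packets and can load a busy router heavily, so turn it off as soon as you have a sample.

```
! Global configuration: limit the debug to the server's traffic
configure terminal
 access-list 105 permit ip any host 10.6.6.1
 access-list 105 permit ip any host 205.158.120.222
 access-list 105 permit ip host 205.158.120.222 any
 access-list 105 permit ip host 10.6.6.1 any
end
! Privileged EXEC: send debug output to a telnet session, then start the debug
terminal monitor
debug ip packet 105
! ...wait for the failure, capture the output, then stop all debugging
undebug all
```

Lines ending in "forward" show packets being routed through, while "encapsulation failed" usually means the router had no usable ARP entry for the next hop.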
NoxBestia

ASKER

I am not sure how to do what you are asking.  Unfortunately, my network abilities are pretty weak and my networking education has been hit and miss.
For now, as a temporary workaround until I can find and fix the problem, I have written a small utility that clears the ARP cache on the Cisco 2611.  I have set the server to run it once an hour.
SOLUTION
trinak96

[Solution text is members-only and not included here.]
Jan Bacher
Can you sanitize the router config and post it?  And, prior to clearing the ARP entry, have you determined that the MAC address pointing to the IP address is the MAC address of the server?
Now I understand what is needed of me.  Until 6 months ago I did all my routing via Microsoft servers and RRAS, with the exception of some NAT programming on an old Cisco DSL modem.

Anyway, my program last cleared the arp at 8 AM my time so I expect the system to fail about noon (just over 2 hours from now).  I will do as both of you have suggested and post the results then.
Jesper, I missed your comments while I was composing my last reply.  I'll post the router config in about an hour when I get into the office.
Thanks and be sure to document the MAC<->IP information in the ARP table before clearing.
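
A minimal sketch of how that could be captured on the router, assuming the server is still at 10.6.6.1. The Hardware Addr column in the output is what should be compared against the MAC address that "ipconfig /all" reports on the server.

```
! Full ARP table, then just the server's entry
show ip arp
show ip arp 10.6.6.1
! Only after saving that output, clear the cache
clear arp-cache
```

Saving the full table also shows how many entries have accumulated on the outside interface, which is useful to know when the failure happens.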
Here is my config file with all passwords removed.  

Also, how long should I run the debug once the problem manifests?

---------------------------------
Using 1689 out of 29688 bytes
!
version 12.1
no service single-slot-reload-enable
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname {HIDDEN}
!
enable password {HIDDEN}
!
!
!
!
!
memory-size iomem 10
ip subnet-zero
ip name-server 10.6.6.1
!
!
!
!
interface Ethernet0/0
 description connected to EthernetLAN
 ip address 10.6.6.254 255.255.255.0
 ip helper-address 10.6.6.1
 ip nat inside
 full-duplex
 arp timeout 1800
!
interface Serial0/0
 no ip address
 shutdown
!
interface Ethernet0/1
 description connected to Internet
 ip address 205.158.120.220 255.255.255.224
 ip nat outside
 full-duplex
 arp timeout 1800
!
router rip
 version 2
 passive-interface Ethernet0/1
 network 10.0.0.0
 no auto-summary
!
ip nat pool GPB-ROUTER-natpool-0 205.158.120.220 205.158.120.220 netmask 255.255.255.224
ip nat inside source list 1 pool GPB-ROUTER-natpool-0 overload
ip nat inside source static 10.6.6.1 205.158.120.222
ip classless
ip route 0.0.0.0 0.0.0.0 Ethernet0/1
ip http server
!
logging history warnings
logging facility syslog
logging 10.6.6.5
access-list 1 permit 10.6.6.0 0.0.0.255
snmp-server community public RO
snmp-server enable traps snmp
snmp-server enable traps isdn call-information
snmp-server enable traps isdn layer2
snmp-server enable traps hsrp
snmp-server enable traps config
snmp-server enable traps entity
snmp-server enable traps envmon
snmp-server enable traps bgp
snmp-server enable traps rsvp
snmp-server enable traps frame-relay
snmp-server enable traps rtr
snmp-server enable traps syslog
snmp-server host 10.6.6.5 public
!
line con 0
 exec-timeout 0 0
 password {HIDDEN}
 login
line aux 0
line vty 0 4
 password {HIDDEN}
 login
!
end

Here is a bit of the debug ip packet I ran.  I have saved a lot more of it to a text file if more is needed.

Also, I did a "show ip arp eth0/0" and the MAC address listed for my server matched the MAC address of the NIC on the server.

-------------------------
GPB-ROUTER#debug ip packet
IP packet debugging is on
GPB-ROUTER#term mon
GPB-ROUTER#, g=205.188.158.121, len 48, forward
13:16:25: IP: s=10.6.6.1 (Ethernet0/0), d=10.6.6.254 (Ethernet0/0), len 41, rcvd 3
13:16:25: IP: s=10.6.6.254 (local), d=10.6.6.1 (Ethernet0/0), len 41, sending
13:16:25: IP: s=10.6.6.1 (Ethernet0/0), d=10.6.6.254 (Ethernet0/0), len 41, rcvd 3
13:16:25: IP: s=10.6.6.254 (local), d=10.6.6.1 (Ethernet0/0), len 41, sending
13:16:25: IP: s=10.6.6.1 (Ethernet0/0), d=10.6.6.254 (Ethernet0/0), len 41, rcvd 3
13:16:25: IP: s=10.6.6.254 (local), d=10.6.6.1 (Ethernet0/0), len 41, sending
13:16:25: IP: s=10.6.6.1 (Ethernet0/0), d=10.6.6.254 (Ethernet0/0), len 40, rcvd 3
13:16:26: IP: s=10.6.6.1 (Ethernet0/0), d=10.6.6.254 (Ethernet0/0), len 42, rcvd 3
13:16:26: IP: s=10.6.6.254 (local), d=10.6.6.1 (Ethernet0/0), len 42, sending
13:16:26: IP: s=0.0.0.0 (Ethernet0/0), d=255.255.255.255, len 363, rcvd 2
13:16:26: IP: s=0.0.0.0 (Ethernet0/1), d=255.255.255.255, len 363, rcvd 2
13:16:26: IP: s=10.6.6.254 (local), d=10.6.6.1 (Ethernet0/0), len 363, sending
13:16:26: IP: s=10.6.6.254 (local), d=10.6.6.72 (Ethernet0/0), len 342, sending
13:16:27: IP: s=213.51.0.92 (Ethernet0/1), d=205.158.120.220 (Ethernet0/1), len 91, rcvd 3
13:16:27: IP: s=205.158.120.220 (local), d=213.51.0.92 (Ethernet0/1), len 56, sending
13:16:27: IP: s=205.158.120.220 (local), d=213.51.0.92 (Ethernet0/1), len 56, encapsulation failed
13:16:27: IP: s=205.158.120.222 (Ethernet0/0), d=216.250.24.64 (Ethernet0/1), g=216.250.24.64, len 48, forward
13:16:28: IP: s=205.158.120.222 (Ethernet0/0), d=199.7.66.1 (Ethernet0/1), g=199.7.66.1, len 58, forward
13:16:28: IP: s=205.158.120.222 (Ethernet0/0), d=64.12.51.132 (Ethernet0/1), g=64.12.51.132, len 66, forward
13:16:28: IP: s=205.158.120.222 (Ethernet0/0), d=205.188.158.121 (Ethernet0/1), g=205.188.158.121, len 48, forward
13:16:28: IP: s=205.158.120.220 (Ethernet0/0), d=204.74.113.1 (Ethernet0/1), g=204.74.113.1, len 58, forward
13:16:28: IP: s=204.74.113.1 (Ethernet0/1), d=10.6.6.5 (Ethernet0/0), g=10.6.6.5, len 127, forward
13:16:28: IP: s=205.158.120.220 (Ethernet0/0), d=192.12.94.30 (Ethernet0/1), g=192.12.94.30, len 63, forward
13:16:28: IP: s=192.12.94.30 (Ethernet0/1), d=10.6.6.5 (Ethernet0/0), g=10.6.6.5, len 151, forward
13:16:28: IP: s=205.158.120.220 (Ethernet0/0), d=207.36.174.5 (Ethernet0/1), g=207.36.174.5, len 58, forward
13:16:28: IP: s=205.158.120.220 (Ethernet0/0), d=207.36.174.5 (Ethernet0/1), len 58, encapsulation failed
13:16:28: IP: s=204.9.177.18 (Ethernet0/1), d=10.6.6.30 (Ethernet0/0), g=10.6.6.30, len 40, forward
13:16:30: IP: s=84.190.236.161 (Ethernet0/1), d=205.158.120.220 (Ethernet0/1), len 91, rcvd 3
13:16:30: IP: s=205.158.120.220 (local), d=84.190.236.161 (Ethernet0/1), len 56, sending
13:16:30: IP: s=205.158.120.220 (local), d=84.190.236.161 (Ethernet0/1), len 56, encapsulation failed
13:16:30: IP: s=10.6.6.4 (Ethernet0/0), d=10.6.6.255 (Ethernet0/0), len 78, rcvd 3
13:16:30: IP: s=205.158.120.222 (Ethernet0/0), d=216.250.24.64 (Ethernet0/1), g=216.250.24.64, len 48, forward
ASKER CERTIFIED SOLUTION
[Solution text is members-only and not included here.]
As Saineolai says, route your traffic directly to a next-hop address. I had this happen only a couple of days ago; after routing to the next-hop address, the ARP cache went down to 20-ish lines instead of hundreds!

Just do a tracert to a website to find the next hop within the ISP's network.

Adrian...
Saineolai: I have added the access list information to my configuration.  As for routing, I am afraid that this configuration was created by a configuration program and then has been tweaked by me here and there as I learned things.  I do not know how to route traffic as you and Adrian suggest.

Adrian: My ISP provided gateway is 205.158.120.193.  Is that what you are suggesting I use?  Also, what is the syntax for setting up this routing correctly?  I don't want to make things worse by doing it wrong.
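
For reference, a sketch of the usual syntax, assuming 205.158.120.193 really is the ISP gateway on the Ethernet0/1 subnet (it does fall inside the same 255.255.255.224 block as the router's own 205.158.120.220). With the default route pointing at Ethernet0/1, the router treats every Internet destination as directly attached and ARPs for each one, which depends on the ISP router answering by proxy ARP and fills the ARP table. Pointing the route at the next-hop address means only the gateway needs an ARP entry.

```
configure terminal
 ! Add the next-hop default route first so there is always a default route
 ip route 0.0.0.0 0.0.0.0 205.158.120.193
 ! Then remove the interface-based default route
 no ip route 0.0.0.0 0.0.0.0 Ethernet0/1
end
! Flush the per-destination entries learned under the old route
clear arp-cache
! Confirm the gateway of last resort is now 205.158.120.193
show ip route
```

If the gateway does not answer the router's ARP request, this change will cut off Internet access, so it is safest to make it from the console or from the LAN side, where the old route can be restored.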
SOLUTION
[Solution text is members-only and not included here.]
Jesper: after entering the two lines you gave me I lost connection to the outside world.  I went back to my previous config and added "ip route 0.0.0.0 0.0.0.0 Ethernet0/1" to get back online.
Things are now also running painfully slow for all computers here, including the server.
Do you see 205.158.120.193 in the ARP table, 'show arp' ?

Can you verify that 205.158.120.193 is the gateway?

Can you ping 205.158.120.193?
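
A minimal sketch of the router commands that would answer these three questions, assuming they are run from privileged EXEC mode on GPB-ROUTER:

```
! Is there a resolved ARP entry for the gateway?
show ip arp 205.158.120.193
! Does the gateway answer from the router itself?
ping 205.158.120.193
! Which default route is actually installed?
show ip route 0.0.0.0
```

An ARP entry listed as Incomplete means the router sent an ARP request for that address and has not received a reply, so a default route through that address cannot forward anything until the entry resolves.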
Post your ARP table as Jesper suggested, along with your latest config and your traceroute. Did you do a "clear arp" after changing the routing?
Show arp includes this line:  Internet  205.158.120.193         0   Incomplete      ARPA  

205.158.120.193 is the gateway according to XO and I have used it as the gateway when using RRAS.

I do not get responses when attempting to ping 205.158.120.193

A traceroute to Google from the server shows:

tracert www.google.com

Tracing route to www.l.google.com [64.233.167.104]
over a maximum of 30 hops:

  1     1 ms     1 ms     1 ms  10.6.6.254
  2     2 ms     2 ms     2 ms  205.158.120.193
  3    10 ms    10 ms    10 ms  67.108.229.229
  4     9 ms     9 ms     9 ms  ge5-0-0.mar2.englewood-co.us.xo.net [207.88.83.21]
  5     9 ms     9 ms     9 ms  65.106.6.25.ptr.us.xo.net [65.106.6.25]
  6    33 ms    33 ms    33 ms  p1-0-0.rar1.chicago-il.us.xo.net [65.106.0.26]
  7    32 ms    33 ms    33 ms  p0-0.ir1.chicago2-il.us.xo.net [65.106.6.134]
  8    33 ms    33 ms    33 ms  206.111.2.18
  9    33 ms    33 ms    33 ms  205.171.139.145
 10    33 ms    33 ms    33 ms  chx-edge-01.inet.qwest.net [205.171.139.162]
 11    35 ms    33 ms    33 ms  63.144.64.134
 12    33 ms    33 ms    33 ms  216.239.48.154
 13    34 ms    34 ms    34 ms  66.249.94.135
 14    34 ms    34 ms    34 ms  72.14.232.74
 15    69 ms    39 ms    34 ms  py-in-f104.google.com [64.233.167.104]

Trace complete.

A traceroute from the router shows:

Translating "www.google.com"...domain server (10.6.6.1) [OK]

Type escape sequence to abort.
Tracing the route to www.l.google.com (64.233.167.104)

  1 205.158.120.193.ptr.us.xo.net (205.158.120.193) 0 msec *  0 msec
  2  *  *
    67.108.229.229.ptr.us.xo.net (67.108.229.229) 8 msec
  3  *
    ge5-0-0.mar2.englewood-co.us.xo.net (207.88.83.21) 8 msec *
  4 65.106.6.25 20 msec 224 msec *
  5 p1-0-0.rar1.chicago-il.us.xo.net (65.106.0.26) 52 msec *  32 msec
  6  *  *
    p0-0.ir1.chicago2-il.us.xo.net (65.106.6.134) 32 msec
  7  *  *
    206.111.2.18.ptr.us.xo.net (206.111.2.18) 33 msec
  8  *
    cer-core-02.inet.qwest.net (205.171.139.149) 297 msec *
  9  *  *
    chx-edge-01.inet.qwest.net (205.171.139.166) 40 msec
 10  *
    63.144.64.134 69 msec 32 msec
 11  *  *  *
 12  *
    72.14.232.53 248 msec *
 13  *  *  *
 14  *
    www.l.google.com (64.233.167.104) 68 msec *
Here is my current config.  NOTE: I had to add "ip route 0.0.0.0 0.0.0.0 Ethernet0/1" back in to get ANY connection with the outside world at all.

-----
Using 1920 out of 29688 bytes
!
version 12.1
no service single-slot-reload-enable
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname GPB-ROUTER
!
enable password {HIDDEN}
!
!
!
!
!
memory-size iomem 10
ip subnet-zero
ip name-server 10.6.6.1
!
!
!
!
interface Ethernet0/0
 description connected to EthernetLAN
 ip address 10.6.6.254 255.255.255.0
 ip helper-address 10.6.6.1
 ip nat inside
 full-duplex
 arp timeout 1800
!
interface Serial0/0
 no ip address
 shutdown
!
interface Ethernet0/1
 description connected to Internet
 ip address 205.158.120.220 255.255.255.224
 ip nat outside
 full-duplex
 arp timeout 1800
!
router rip
 version 2
 passive-interface Ethernet0/1
 network 10.0.0.0
 no auto-summary
!
ip nat pool GPB-ROUTER-natpool-0 205.158.120.220 205.158.120.220 netmask 255.255.255.224
ip nat inside source list 1 pool GPB-ROUTER-natpool-0 overload
ip nat inside source static 10.6.6.1 205.158.120.222
ip classless
ip route 0.0.0.0 0.0.0.0 205.158.120.193
ip route 0.0.0.0 0.0.0.0 Ethernet0/1
ip http server
!
logging history warnings
logging facility syslog
logging 10.6.6.5
access-list 1 permit 10.6.6.0 0.0.0.255
access-list 105 permit ip any host 10.6.6.1
access-list 105 permit ip any host 205.158.120.222
access-list 105 permit ip host 205.158.120.222 any
access-list 105 permit ip host 10.6.6.1 any
snmp-server community public RO
snmp-server enable traps snmp
snmp-server enable traps isdn call-information
snmp-server enable traps isdn layer2
snmp-server enable traps hsrp
snmp-server enable traps config
snmp-server enable traps entity
snmp-server enable traps envmon
snmp-server enable traps bgp
snmp-server enable traps rsvp
snmp-server enable traps frame-relay
snmp-server enable traps rtr
snmp-server enable traps syslog
snmp-server host 10.6.6.5 public
!
line con 0
 exec-timeout 0 0
 password {HIDDEN}
 login
line aux 0
line vty 0 4
 password {HIDDEN}
 login
!
end
I ended up cold booting my router, XO's router, and the switch between them.  Currently I seem to have internet access back at normal speeds, still running the above referenced configuration.
Hi,

This doesn't seem correct, to me anyway:

ip nat pool GPB-ROUTER-natpool-0 205.158.120.220 205.158.120.220 netmask 255.255.255.224 ------>TRYING TO NAT TO ITSELF ?
ip nat inside source list 1 pool GPB-ROUTER-natpool-0 overload
ip nat inside source static 10.6.6.1 205.158.120.222 ------> WHY .222 ?

If you're only accepting requests from 10.6.6.1 (a proxy server, I presume), then just NAT this address to the outside.

Remove your 3 lines above and replace with :
ip nat inside source static 10.6.6.1 interface fa0/1 overload

Also, on your interfaces, add:
no ip proxy-arp
no ip redirects
no ip mroute-cache

then "clear arp"
Jesper: I finally got your configuration changes to work last night.  My ARP table now only has entries for internal clients, the WAN gateway, the WAN shared address, and the WAN host address.

Adrian:  10.6.6.1 is an email and web server.  We do not filter our users through a proxy.  205.158.120.220 is the shared NAT address for everyone on the LAN and I use it to gain access to the router when I am working from home.  205.158.120.222 goes to the 10.6.6.1 server only.  Before I followed any of your ideas I wanted to let you know what our architecture was and also let you know the problem has been resolved.  If you still think that I should alter my config further, I'm definitely up for your suggestions.

EVERYONE:  thank you for all your help!  This is the first time I have asked a question here, so please bear with me as I try to close it and award points fairly.