  • Status: Solved
  • Priority: Medium
  • Security: Public

Cisco 3845 / Arp issue

Okay, so this is a tough one, and as such I'm going to award the highest amount of points for it. By the same token, I probably won't manage to include all of the information needed to debug this issue the first time around. It will take multiple posts, which is fine.

This is potentially an arp issue or a VLAN misconfiguration issue. I'm dealing with a 3845 router which sits at our datacenter and serves as a ds3 cross connect to our office. We have two VLANs:

VLAN 50 - web servers (10.50/16)
VLAN 51 - databases (10.51/16)

When trying to access VLAN 51 interfaces from our office subnet, it can take a considerable amount of time for the arp entry for 10.50.0.5 to be cached on the destination box. In the meantime it is unreachable, and arp -n shows the following for it:

[root@cc70-5 ~]# arp -n |grep 10.50.0.5
10.50.0.5                        (incomplete)                              eth1
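
A minimal sketch for spotting entries like this across many boxes: it scans the text of `arp -n` for rows marked `(incomplete)`. The function name is hypothetical, and the header row and the second entry in the sample are made up for illustration; only the 10.50.0.5 row comes from the output above.

```python
from typing import List, Tuple


def incomplete_arp_entries(arp_output: str) -> List[Tuple[str, str]]:
    """Return (ip, interface) pairs for entries the kernel has not resolved."""
    entries = []
    for line in arp_output.splitlines():
        fields = line.split()
        # Unresolved entries show "(incomplete)" where the MAC address would be;
        # the IP is the first column and the interface is the last.
        if len(fields) >= 3 and "(incomplete)" in fields:
            entries.append((fields[0], fields[-1]))
    return entries


# Sample text: the header and the 10.50.0.9 row are illustrative only.
sample = """\
Address                  HWtype  HWaddress           Flags Mask            Iface
10.50.0.5                        (incomplete)                              eth1
10.50.0.9                ether   00:00:5e:00:53:01   C                     eth1
"""

print(incomplete_arp_entries(sample))
```

Feeding it the real output (for example via `subprocess.run(["arp", "-n"], capture_output=True, text=True).stdout`) is left out so the sketch stays runnable anywhere.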

You can also see the initial ping latency below (traceroutes are broken too until the arp cache is populated after a ping):

tag1349:~ sfinkelstein$ ping 10.51.5.70
PING 10.51.5.70 (10.51.5.70): 56 data bytes
64 bytes from 10.51.5.70: icmp_seq=248 ttl=62 time=1002.861 ms <--- bad
64 bytes from 10.51.5.70: icmp_seq=249 ttl=62 time=2.827 ms
64 bytes from 10.51.5.70: icmp_seq=250 ttl=62 time=1.930 ms
64 bytes from 10.51.5.70: icmp_seq=251 ttl=62 time=1.979 ms
64 bytes from 10.51.5.70: icmp_seq=252 ttl=62 time=1.824 ms
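
To pick out replies like the one marked "bad" without reading every line, here is a rough sketch that parses ping reply lines and flags slow round trips. The 500 ms threshold and the function names are assumptions for illustration, not anything from the original setup.

```python
import re

# Matches the reply lines shown above, e.g. "icmp_seq=248 ttl=62 time=1002.861 ms".
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def slow_replies(ping_output, threshold_ms=500.0):
    """Return (icmp_seq, rtt_ms) for every reply slower than threshold_ms."""
    slow = []
    for match in REPLY.finditer(ping_output):
        seq, rtt = int(match.group(1)), float(match.group(2))
        if rtt > threshold_ms:
            slow.append((seq, rtt))
    return slow


def first_reply_seq(ping_output):
    """Return the sequence number of the first reply seen, or None."""
    match = REPLY.search(ping_output)
    return int(match.group(1)) if match else None


sample = """\
64 bytes from 10.51.5.70: icmp_seq=248 ttl=62 time=1002.861 ms
64 bytes from 10.51.5.70: icmp_seq=249 ttl=62 time=2.827 ms
64 bytes from 10.51.5.70: icmp_seq=250 ttl=62 time=1.930 ms
64 bytes from 10.51.5.70: icmp_seq=251 ttl=62 time=1.979 ms
64 bytes from 10.51.5.70: icmp_seq=252 ttl=62 time=1.824 ms
"""

print(slow_replies(sample))
print(first_reply_seq(sample))
```

If the output above was not trimmed, a first reply at icmp_seq=248 would also suggest that the earlier probes went unanswered, which is worth checking alongside the slow first round trip.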

If I set the arp entry manually, I never see this issue. I do something like the following as a temporary workaround:

arp -s 10.50.0.5 00:15:F9:0C:65:A1 dev eth0
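
With a couple of hundred boxes to touch, it may help to validate the address and MAC before pushing a static entry anywhere. The sketch below only builds the argument list for net-tools `arp` (using its documented `-i` and `-s` options) and does not run it. The helper name is hypothetical, and `eth1` is an assumption taken from the `arp -n` output earlier in the question.

```python
import ipaddress
import re

# Six colon-separated hex octets, e.g. 00:15:F9:0C:65:A1.
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def static_arp_command(ip, mac, iface):
    """Build (but do not run) an 'arp -s' command after validating its inputs."""
    ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    if not MAC_RE.match(mac):
        raise ValueError("malformed MAC address: %r" % mac)
    return ["arp", "-i", iface, "-s", ip, mac.lower()]


# Gateway IP and MAC from the workaround above; the interface is assumed.
print(static_arp_command("10.50.0.5", "00:15:F9:0C:65:A1", "eth1"))
```

The returned list can be handed to `subprocess.run` as root, or joined into a line for whatever tool distributes commands to the boxes.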

One more note: if I ping a 10.51 interface from a 10.50 interface, the arp entry is created right away, which clears up the problem of the office subnet not being able to ping it. Otherwise, the arp entry for an office ping/TCP socket request is created anywhere from 30 seconds to 5 minutes after the initial try.

Thanks again for any assistance and please let me know if there's any other information I can provide with my network topology, router versions/configs etc to help fix this problem.
Asked by: stevefNYC
 
Don Johnston (Instructor) Commented:
So you're saying that if you try to ping a device on VLAN 50 from a device on VLAN 51, you experience this problem, but if you ping the router itself you don't have the delay?

How is the 3845 connected to the VLANs? Are you trunking to a switch or are you using two separate interfaces on the 3845?

Do the workstations have a default gateway set or are you using Proxy ARP?

 
stevefNYC (Author) Commented:
I'm trying to ping a device on VLAN 51 from the office subnet. If I ping from VLAN 50, it properly creates the arp cache entry on the Linux box and allows me to ping. Also, for boxes which have both a VLAN 50 and a VLAN 51 interface, if I ping the VLAN 50 interface first, I can then make subsequent requests to VLAN 51 without a problem.

Here are some results directly from the router:

nap2gbxds3#ping 10.51.5.66

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.51.5.66, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 1/1/1 ms
nap2gbxds3#ping 10.51.5.70

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.51.5.70, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms
nap2gbxds3#ping 10.51.5.61

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.51.5.61, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 1/1/1 ms
nap2gbxds3#

Some are 80 percent, some are 100.
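
In IOS ping output, `!` marks a reply and `.` marks a probe that timed out, so the difference between 80 and 100 percent here comes down to which probes were lost. A small sketch (helper names are hypothetical) that checks whether only the first probe was dropped:

```python
def lost_probes(result):
    """Return the 0-based positions of probes that timed out ('.')."""
    return [i for i, ch in enumerate(result.strip()) if ch == "."]


def only_first_lost(result):
    """True when the first probe timed out and every later one was answered."""
    return lost_probes(result) == [0]


# Result strings copied from the three router pings above.
for target, result in [("10.51.5.66", ".!!!!"),
                       ("10.51.5.70", "!!!!!"),
                       ("10.51.5.61", ".!!!!")]:
    print(target, lost_probes(result), only_first_lost(result))
```

For the 80 percent results only the first probe was lost each time.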

The 3845 is connected to the VLANs with two separate interfaces: one RJ45 twisted-pair copper line into a VLAN 50 interface on one of our two edge 6509s, and the same for VLAN 51.

I hope this answers your questions, donjohnston. Thanks a bunch.
 
Don Johnston (Instructor) Commented:
It is typical for the first ping to fail while the ARP entries are populated at end and intermediate devices. If that's the only problem you're experiencing, I wouldn't worry about it.
 
stevefNYC (Author) Commented:
Only the first ping from the router is failing. The much bigger issue is that boxes on the office subnet cannot reach hosts on 10.51/16 (VLAN 51) for what can be five minutes, unless a prior TCP connection has been made or a static arp entry for the ds3 router has been placed on the destination box.

tag1349:~ sfinkelstein$ time ping 10.51.5.70
PING 10.51.5.70 (10.51.5.70): 56 data bytes
^C
--- 10.51.5.70 ping statistics ---
423 packets transmitted, 0 packets received, 100% packet loss

real    7m2.833s
user    0m0.004s
sys     0m0.022s

I had to ^C out of there after 7 minutes of waiting. It can sometimes take up to 30 minutes for the arp lookup to complete.
 
Don Johnston (Instructor) Commented:
>If I set the arp entry manually then I never see this issue. I do something like the following as a temporary work around:
>
>arp -s 10.50.0.5 00:15:F9:0C:65:A1 dev eth0

When you did this, what device (IP address) did you do this from?

Also, are you setting a default gateway on your end stations?

And just so we know what is what, can you list the IP addresses of the router and some of the end stations you're communicating between?
 
stevefNYC (Author) Commented:
I ran that arp command on any of my 200 boxes that have a VLAN 51 subnet aliased to one of their interfaces, e.g. 10.51.3.250. Yes, I am setting the default gateway on each end station through the following:

[root@cc60-5 ~]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.100.0   10.50.0.5       255.255.254.0   UG        0 0          0 eth1

You know, now that you mention it... I removed the static route from the routing table on one Linux box. (We have a static route set on our Netscalers for this network.) There is now no timeout at all. The box also never caches the arp entry for the ds3 router that the packets traverse, but I *think* I'm able to ping VLAN 51 interfaces without any issues now that I've removed the static route from the local Linux boxes.
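
To see what that static route was doing, here is a small sketch using Python's `ipaddress` module with the values from the `netstat -rn` output and the VLAN list in the question. It assumes, without the thread confirming it, that 192.168.100.0/255.255.254.0 is the office subnet; the sample host address is hypothetical.

```python
import ipaddress

# Values copied from the netstat -rn output and the VLAN list in the question.
ROUTE = ipaddress.ip_network("192.168.100.0/255.255.254.0")
GATEWAY = ipaddress.ip_address("10.50.0.5")
VLANS = {
    "VLAN 50": ipaddress.ip_network("10.50.0.0/16"),
    "VLAN 51": ipaddress.ip_network("10.51.0.0/16"),
}


def gateway_vlan(gateway, vlans):
    """Return the name of the VLAN whose prefix contains the gateway, or None."""
    for name, net in vlans.items():
        if gateway in net:
            return name
    return None


def uses_route(destination, route=ROUTE):
    """True when traffic to 'destination' would match the static route."""
    return ipaddress.ip_address(destination) in route


print(ROUTE)                         # the netmask written as a prefix length
print(gateway_vlan(GATEWAY, VLANS))  # which VLAN the next hop sits in
print(uses_route("192.168.101.20"))  # hypothetical office host
```

The next hop sits in VLAN 50 and the route points out eth1, so with that route in place, replies to the covered addresses would leave via the VLAN 50 side even when the request arrived on a VLAN 51 address. That is consistent with the behaviour changing once the route was removed, though it does not by itself prove the cause.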

That is totally weird! Any idea why? Let me confirm and I'll award you the points for your generous help, donjohnston.

Steve
 
stevefNYC (Author) Commented:
Feel free to put this in the clean up area, keith.

Thank you.

S.