travisryan asked:

Cisco 3750 3650 switch SSH issue

My network infrastructure is set up with several 3750-Xs as my switch stack. From there I have several satellite 3650 switches that connect back to the core via two fiber pairs set up as a port channel. To manage my switches I have a separate management VLAN.

Here's my issue: while working on something unrelated I noticed I could no longer SSH to one of my switches from either my Linux machine or my Win 7 machine (neither of these machines has an IP in the management VLAN). This switch trunks back to the core switch stack (like several others) and also has two switches that trunk into it to get back to the switch stack (they're "farther out" so to speak). I can SSH into those just fine, then SSH "back" into the switch I can no longer reach from my desktop machines.

At first I couldn't even SSH into the problem switch from other switches "closer" to the core switch stack including the core switch stack itself, then (through no change I made, and I'm the only one who should be working on these switches) suddenly I could.

Troubleshooting this further:
-I can ping all of the management IPs from my desktops besides the problem switch
-I can ping the problem switch's management IP from all of my switches, even when I couldn't connect into it from some of them
-SSH debug shows nothing helpful
-All switches have the same version of SSH
-Checking the allowed VLANs, the management VLAN is allowed on the trunk heading to the problem switch from the core stack
-I keep version history on all of my switches, and this switch's config hasn't been changed for at least 4 weeks. Even then, none of the changes made to the problem switch or the core switch within the last year have had anything to do with SSH communications, and I know I've SSH'd into the switch recently with no issue.
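The symptoms above mix ICMP results (ping) with TCP results (SSH), so it can help to test port 22 directly from a desktop and see whether the attempt times out or is actively refused. The following is only a minimal sketch, not something from the thread: the function name `probe_tcp` and its return strings are made up for illustration. A timeout usually points at dropped packets or a broken return path, while a refusal means the host answered but nothing is listening on that port.

```python
import socket


def probe_tcp(host: str, port: int = 22, timeout: float = 3.0) -> str:
    """Try a TCP connect and classify the outcome.

    Hypothetical helper for illustration; returns one of
    'open', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but the port is closed.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout: filtered or no return path.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "No route to host".
        return f"error: {exc}"


# Example call (address taken from the sanitized config in this thread):
# print(probe_tcp("20.1.200.7"))
```

Running this from a desktop and again from a host inside the management VLAN would show whether the failure depends on where the connection originates.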

This one is a head scratcher for me. Any help is appreciated.
lruiz52

Can you post a sanitized config of the problem switch and of one of the switches that you can ssh to from your desktop machine?
eeRoot

During the times that you cannot SSH into this switch, can you ping it from your workstation and/or the core switch? Can you verify that all of your switches have the same subnet for the switch management VLAN and the same default gateway set?
If you can't ping the problem switch from the workstation, that doesn't necessarily mean the problem is related to that switch. It sounds like a potential routing or cable issue.

Have you tried a different management IP on the problem switch or looked at the routing within your management VLAN?
travisryan (ASKER)

After doing some more troubleshooting on this it gets stranger, but more specific:
-The switch can ping the other switches via their management VLAN IPs
-It cannot ping any address on another VLAN, even though all of the VLANs' gateways are on the switch stack
-Traceroute yields no extra information
-No results for sh ip redirect
-From my Linux machine, ssh -v <ip address> just shows "connection timeout"
@eeRoot, all switches are set with an ip default-gateway

@Luke Smith, what do you mean by "routing within the management VLAN"?
After more troubleshooting it looks like the problem switch can't even ping its own default gateway. It can ping the gateway of the management VLAN, but not its own gateway. This is more confusing because there are several devices on this switch that can communicate with several subnets just fine, on top of the fact that two other switches sit behind/farther away from the core switch and they can communicate just fine.
Now it is starting to sound like the VLAN db might be corrupt for the problem switch. From a device in the network, can you do a "show ip route" of the IP of the problem switch and does it show routes?
@Luke

The sh ip route for the problem switch and all switches besides the core stack is blank. It shows the default gateway, a blank table, and ICMP redirect cache is empty.

Is there a way to flush the VLAN db?
@lruiz52 and Luke, below are sanitized configs for the problem switch and one of my working switches. VLAN 20 is for computers/servers, VLAN 200 is the management VLAN.
Problem-Switch-Clean.txt
Good-Switch-Clean.txt
You seem to have two different IP ranges in use on VLAN 200

Good switch =  ip address 10.1.100.6 255.255.255.0
Problem switch =  ip address 20.1.200.7 255.255.255.0

Is VLAN 200 using 10.1.100.x or 20.1.200.x?  I'd assume the only one that works is the one defined in the core switch config.
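The comparison above can be checked mechanically. As a rough sketch using Python's standard `ipaddress` module (the helper name `same_subnet` is made up for illustration), the two posted addresses and masks can be reduced to their networks and compared:

```python
import ipaddress


def same_subnet(a: str, b: str) -> bool:
    """Return True if two 'address/mask' strings fall in the same network.

    Hypothetical helper for illustration only.
    """
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network


# Values as posted in the sanitized configs in this thread.
good_switch = "10.1.100.6/255.255.255.0"
problem_switch = "20.1.200.7/255.255.255.0"

print(ipaddress.ip_interface(good_switch).network)     # 10.1.100.0/24
print(ipaddress.ip_interface(problem_switch).network)  # 20.1.200.0/24
print(same_subnet(good_switch, problem_switch))        # False
```

Two interfaces on the same VLAN would normally be expected to produce the same network here, which is why the mismatch stands out.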
@eeRoot, this was a sanitization issue. 20.1.200 is my management VLAN for this exercise. That should be 20.1.200.6
After troubleshooting this further I've changed the default gateway to 20.1.200.1, which is the gateway for my management VLAN. Apparently this is a proxy ARP issue. I still need help fixing it, as I don't want all of my traffic flowing over the management VLAN.
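One way to read the proxy ARP remark: a switch whose default gateway lies outside the subnet of its own management interface cannot ARP for that gateway directly, so it depends on some router answering on the gateway's behalf. The sketch below only checks that on-link condition and assumes nothing about how IOS behaves internally. The function name `gateway_is_on_link` is hypothetical, and `20.1.20.1` is an invented off-subnet gateway used purely for contrast, since the thread never states the previous gateway value.

```python
import ipaddress


def gateway_is_on_link(interface_cidr: str, gateway: str) -> bool:
    """Return True if the gateway sits inside the interface's own subnet.

    Hypothetical helper for illustration only.
    """
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(gateway) in iface.network


# Management address of the problem switch, as posted in the thread.
mgmt_interface = "20.1.200.7/255.255.255.0"

# Gateway the asker switched to: inside the management subnet.
print(gateway_is_on_link(mgmt_interface, "20.1.200.1"))  # True

# Invented off-subnet gateway (assumption, not from the thread).
print(gateway_is_on_link(mgmt_interface, "20.1.20.1"))   # False
```

A result of False would be consistent with a setup that only works while something on the segment replies to ARP requests for the off-subnet gateway.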
ASKER CERTIFIED SOLUTION
travisryan
This solution is only available to members.