Asked by BobintheNoc

How to get a good overview of a LAN and manage it with a graphical view

I've inherited a small LAN to administer, which has three Catalyst 2900 series switches and a 3524XL as the main switch. I strongly suspect the configuration is causing various internal outages and slow communications. I am trying to get a good mental picture of how this LAN is configured, but can't make out the full details.

Is there a halfway decent graphical tool that can give me an idea of how things work at the switching level? There is VLAN1, where the mgmt interfaces exist, along with the primary client/server network running at 192.168.1.0/24. There is also, apparently, a DMZ sort of VLAN that hosts a few servers that are exposed to the internet, with limited connectivity back into VLAN1. I'm not even sure yet which ports connect to which devices; I'm still doing discovery. There's also a Juniper firewall/gateway in the mix, bringing in two T-1 circuits.

Some of the servers within VLAN1 have multiple NICs plugged in, and some servers with single NICs have multiple IP addresses. DNS is a mess, as is WINS, with no real accuracy as to which hosts are which. I'm beginning the cleanup of WINS/DHCP/DNS to try to get that back on track.
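For the DNS/WINS cleanup described above, a small script can help spot names that resolve to several addresses and addresses claimed by several names. This is only a minimal sketch: the `audit_records` function, the host names, and the addresses are hypothetical illustrations, not data from this network, and it assumes the records have already been exported as (name, address) pairs.

```python
from collections import defaultdict

def audit_records(records):
    """Return (names with several addresses, addresses with several names)."""
    by_name = defaultdict(set)
    by_addr = defaultdict(set)
    for name, addr in records:
        by_name[name.lower()].add(addr)
        by_addr[addr].add(name.lower())
    multi_addr = {n: sorted(a) for n, a in by_name.items() if len(a) > 1}
    multi_name = {a: sorted(n) for a, n in by_addr.items() if len(n) > 1}
    return multi_addr, multi_name

# Hypothetical sample records, not taken from the real network.
sample = [
    ("srv-a", "192.168.1.10"),
    ("srv-a", "192.168.1.11"),
    ("srv-b", "192.168.1.20"),
    ("old-srv", "192.168.1.20"),
]

multi_addr, multi_name = audit_records(sample)
print("Names with several addresses:", multi_addr)
print("Addresses with several names:", multi_name)
```

A name with several addresses is not necessarily wrong (multi-homed servers exist here), but each hit is a candidate to verify by hand.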

I've also found, and corrected, many ports set to auto-negotiate speed/duplex that were seeing tons of collisions and other errored packets.
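One thing worth checking after changing speed/duplex settings is whether both ends of each link still agree. The sketch below models the usual behaviour on 10/100 links, where a port left on auto falls back to half duplex when its partner is hard-coded and does not negotiate. The function names are hypothetical, and the model assumes that two auto ports both support and settle on full duplex.

```python
def effective_duplex(local, peer):
    """Duplex that `local` ends up using, given both ends' configured settings.

    Settings are "auto", "full", or "half".
    """
    if local != "auto":
        return local   # a forced setting is used as-is
    if peer == "auto":
        return "full"  # assumption: both ends negotiate full duplex
    return "half"      # no negotiation partner: fall back to half duplex

def is_mismatch(a, b):
    """True when the two ends of a link would run different duplex modes."""
    return effective_duplex(a, b) != effective_duplex(b, a)

for pair in [("auto", "auto"), ("full", "full"), ("auto", "full"), ("auto", "half")]:
    print(pair, "mismatch" if is_mismatch(*pair) else "ok")
```

Under this model, forcing full duplex on the switch port while the attached NIC stays on auto produces a mismatch, so hard-coded settings need to be applied at both ends.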

When I do a sho cluster, only two of the switches show: the 3524 and one 29xx. The other two don't indicate cluster membership.

The switches are interconnected via fiber Gig, daisy chained from one to the next.

I'd like to make sure that there aren't any looping conditions going on that may stop or break or significantly delay communications.  Frequently, users are disconnected from their server drive mappings, and sometimes, outgoing emails take hours to reach their destinations, including emails destined for internal deliveries (like NDRs)
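Once the inter-switch links have been mapped, checking the physical topology for loops is a simple graph problem. The sketch below uses a union-find structure to report any link that closes a cycle. The switch names and the link list are hypothetical placeholders for whatever discovery turns up. A physical loop is not necessarily a fault, since Spanning Tree is meant to block one link in it, but it shows where to look.

```python
def find_redundant_links(links):
    """Return the links that close a cycle in an undirected topology."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    redundant = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            redundant.append((a, b))  # both ends already connected: a loop
        else:
            parent[root_a] = root_b
    return redundant

# Hypothetical names for a daisy chain of four switches.
chain = [
    ("sw3524", "sw2900-1"),
    ("sw2900-1", "sw2900-2"),
    ("sw2900-2", "sw2900-3"),
]
print(find_redundant_links(chain))
print(find_redundant_links(chain + [("sw2900-3", "sw3524")]))
```

A pure daisy chain yields no redundant links; an extra cable from the last switch back to the first is reported as the link that closes the loop.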

When doing SHO int on each switch, most ports indicate multiple interface resets throughout a day, and frequently, PCs/Servers indicate a physical disconnection then reconnect, with anywhere from a few seconds to a few minutes before reconnection.

I'm hoping to find a graphical mapping tool that will display the logical/physical layout, and ideally also flag problematic situations like a port going offline or a loop taking place. It'd be really nice to be able to manage the group of switches as a single, whole unit, so that I don't have to think about each switch as a separate object. I'm not that familiar with IOS; I have picked up a few commands over the years, but have no real understanding of portfast, spanning tree, etc. I do understand IP routing, have a solid understanding of TCP/IP and services/TCP/UDP ports, etc., and have significant experience in packet analysis. I have collected many captures that show definite issues where traffic enters a server on one NIC but then leaves via a second NIC to go back to the client.
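The multi-NIC symptom seen in the captures can be checked mechanically. The sketch below compares, for each IP address, the MAC addresses that frames to it were addressed to with the MAC addresses that frames from it were sent from. It assumes the capture was taken on the same layer-2 segment as the hosts (so no router rewrites the MACs) and that each frame has been reduced to a (src_mac, dst_mac, src_ip, dst_ip) tuple. The function name, MACs, and IPs are hypothetical.

```python
from collections import defaultdict

def find_asymmetric_hosts(frames):
    """Map IP -> (receive MACs, transmit MACs) where the two sets differ."""
    rx = defaultdict(set)  # ip -> MACs that frames to this IP were addressed to
    tx = defaultdict(set)  # ip -> MACs that frames from this IP were sent from
    for src_mac, dst_mac, src_ip, dst_ip in frames:
        tx[src_ip].add(src_mac)
        rx[dst_ip].add(dst_mac)
    result = {}
    for ip in rx.keys() & tx.keys():
        if rx[ip] != tx[ip]:
            result[ip] = (sorted(rx[ip]), sorted(tx[ip]))
    return result

# Hypothetical frames: the request arrives on the server's first NIC,
# the reply leaves from its second NIC.
frames = [
    ("00:00:00:00:00:cc", "00:00:00:00:00:aa", "192.168.1.50", "192.168.1.10"),
    ("00:00:00:00:00:bb", "00:00:00:00:00:cc", "192.168.1.10", "192.168.1.50"),
]
print(find_asymmetric_hosts(frames))
```

Hosts flagged this way are the ones whose NIC binding or addressing deserves a closer look.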

Also, any recommendations for online reference material that would be a good starting point for understanding this fabric?

Thanks!

Bob
ASKER CERTIFIED SOLUTION
Pugglewuggle
Hi,

You can also check out the applications below:

LAN Surveyor          www.solarwinds.com/products/lansurveyor/
Dude                  www.mikrotik.com/thedude.php
AdventNet OpManager   www.adventnet.com
BobintheNoc (ASKER)

Thank you for your responses. I have begun exploring with the Cisco Network Assistant. As suspected, it does show several ports going up and down, for devices that shouldn't be. The details of how/why, though, aren't very descriptive. I think I'll set up an SNMP trap destination and see if the switches report more accurate information via SNMP. Oddly, the majority of log events don't have a timestamp, making the historical log difficult to interpret. Of course, the price of FREE was the deciding factor; thanks for the additional info on the other suggested products and their prices.
Of the four primary switches, only three were clustered. I've not yet explored the 4th switch enough to know if it SHOULD have been clustered or not. The CNA recommends making this cluster a community. Does that actually modify the switch configuration? Or is it just a management view within CNA? I'll do some reading to discover the differences between Clusters and Communities, especially to see whether the VLAN/port configs get modified, and whether or not the end result is perceivable by the client systems.

Thanks again!
Thanks all for the rapid feedback. The CNA is running, and has confirmed some suspicions about various ports being deactivated and/or dropping connections. The detail from the alerts isn't very thorough, mostly just indicating port down and then up, so I'm not sure if it's Spanning Tree that's dropping them, or perhaps faulty cabling/power. The referenced ports all connect to server systems; I haven't yet determined if they're machines that could be creating loops, and hope to tomorrow. Is there a preferred or better-suited mode than clustering? The CNA is recommending a conversion of the cluster to a community. Will read up on that shortly. Will also likely configure an SNMP trap destination to collect SNMP, figuring that it might offer more 'real time' and detailed information. Thanks again, appreciate the help.
I'm not sure if making it a community affects the config or not. I don't believe so. I think it's just a logical setup in the CNA.
So, now that CNA is in place, a few oddities pop up. First, I see frequent event messages indicating a port goes down and then back up. It's happening on multiple ports, however detail from CNA is virtually nonexistent. I'm guessing that the 3524XL and the 2950s have very limited support compared to the newest switches.
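To see which ports flap most, the down/up events can be tallied per port once they have been collected (from CNA, syslog, or SNMP traps). The sketch below counts each down-then-up transition as one flap. It assumes the events have already been reduced to ordered (port, state) pairs; the port names and events shown are hypothetical, not taken from these switches.

```python
from collections import Counter

def count_flaps(events):
    """Count down->up transitions per port from ordered (port, state) events."""
    last_state = {}
    flaps = Counter()
    for port, state in events:
        if last_state.get(port) == "down" and state == "up":
            flaps[port] += 1
        last_state[port] = state
    return flaps

# Hypothetical event stream in chronological order.
events = [
    ("Fa0/1", "down"),
    ("Fa0/1", "up"),
    ("Fa0/2", "down"),
    ("Fa0/1", "down"),
    ("Fa0/1", "up"),
]

for port, n in count_flaps(events).most_common():
    print(port, n)
```

Ports near the top of the list are the first candidates for checking cabling, duplex settings, and the attached NIC.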

Two of the switches are reporting failed redundant power, however both switches only have a single power supply, and the uptime is maintained. It happens 3-5 times per 24 hrs. Probably just bad interpretation by CNA.

TIA,
Oh, btw, if those semi-frequent link drop detections are due to ports being blocked by Spanning Tree, what's the best way to determine the source of the shutdown of that port?