Solved

Switch Upgrade Gone Horribly Awry

Posted on 2011-02-15
Last Modified: 2012-06-27
So my project last night was to replace two aging and ailing 3Com 10/100 network switches with new NetGear Gigabit switches.  The old switches were a 48-port and a 24-port respectively.  The new switches we purchased are NetGear GS748T.  I THOUGHT the procedure would be as easy as physically swapping the equipment and giving the switches enough time to sort out their tables and start passing traffic.  I was mistaken, however.

After initial setup of the NetGear switches I decided to configure a LAG combining 4 ports on each switch, in order to establish a 4 gigabit 'trunk' between the two devices for increased inter-switch bandwidth.  I configured this through the web interface and then set about plugging all 96 devices into the new switches.  However, once everything was powered up, I had next to no network connectivity, and what was there was intermittent at best.  I even went so far as to reset the switches to factory defaults in the hope of establishing some sort of functionality.

The whole objective in upgrading to gigabit across the board was to support some of our heavy-bandwidth applications.  After struggling with this most of the night I was forced to swap back to the 3Com switches in order to have things somewhat functional for staff arriving in the morning.  Swapped everything back and network speeds were back to what they were.

Does anyone have any idea what I've done wrong, or am I trying to use these NetGear switches in an environment they do not support (though I don't really see how)?

If it adds any clarity, we have two file servers shared to the staff, that's it.  My objective is simply to share access to the two servers and our internet connection with the 60-ish staff computers and various printers and such.  I separated the devices so that the ones supporting a gig connection are on one switch, and the devices that are perhaps a little older and only support 100 base are on the other switch.  The two servers and the router were connected to the first switch with all the other 'power users'.  If any more clarity as to what I've tried is necessary, I'll be happy to provide it.

Any help is desperately appreciated as now I am under the gun to get these gig switches in place.

Thanks.
Question by:urbanstyles
22 Comments
 
LVL 33

Expert Comment

by:paulmacd
ID: 34897196
You might try establishing the LAG before you connect any other devices to the switches.  Another issue may be with port speed.  Are the switch ports set to negotiate a data rate?  How about the client NICs?  Is the wiring infrastructure suitable for gigabit speed?
 

Author Comment

by:urbanstyles
ID: 34897237
I did in fact establish the LAG prior to attaching the other devices.  I tested connectivity with a couple of laptops on each of the switches just to make sure everything was talking properly.  

All devices, switch ports and NIC's are set for autonegotiation.  Should I be disabling this?  I have no reason to believe the wiring won't support gigabit.  The building is recently renovated, and I have no option but to trust that the electricians did as they say they did.  Would this cause no connectivity, or just slightly less than gigabit?

Thanks.
 
LVL 4

Expert Comment

by:snowdog01
ID: 34897239
I agree with paulmacd.  The cabling infrastructure would be the most obvious issue.  The number 1 cardinal rule of deploying mission critical equipment like this is to configure it off-line and make sure it all works before you deploy it.  You will need cat 5e or 6 to run the Gig connections to your servers.  The best practice is to lock-down the port speed/duplex settings for servers, routers and other network gear (do not let them auto-negotiate).  This is a two-way street, so your servers will have to match, which could be the source of your issue.  Check out the port settings and make adjustments as needed for speed and duplex.
 
LVL 50

Expert Comment

by:Don Johnston
ID: 34897249
I would agree with Paul. Most likely an LAG issue. In addition to establishing the LAG before bringing up the other devices, make absolutely certain that all ports in the LAG are completely identical (speed, duplex, VLAN membership, tagging, etc.). If they aren't, all bets are off.
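
One way to act on the "completely identical" point is to lay the LAG member settings side by side and let a script flag anything that differs. The sketch below is only a hypothetical helper, not anything the GS748T provides: the port names and the fields (`speed`, `duplex`, `vlan`, `tagged`) are assumptions, typed in by hand from whatever the web interface shows.

```python
def lag_mismatches(ports):
    """Return {setting: {port: value}} for every setting that differs across LAG members."""
    mismatches = {}
    # Collect every setting name that appears on any member port.
    settings = sorted({key for cfg in ports.values() for key in cfg})
    for setting in settings:
        values = {name: cfg.get(setting) for name, cfg in ports.items()}
        # More than one distinct value means the members are not identical.
        if len(set(values.values())) > 1:
            mismatches[setting] = values
    return mismatches


# Hypothetical values copied by hand from the switch's web interface.
lag_ports = {
    "g45": {"speed": 1000, "duplex": "full", "vlan": 1, "tagged": False},
    "g46": {"speed": 1000, "duplex": "full", "vlan": 1, "tagged": False},
    "g47": {"speed": 1000, "duplex": "full", "vlan": 1, "tagged": False},
    "g48": {"speed": 100, "duplex": "half", "vlan": 1, "tagged": False},
}

for setting, values in lag_mismatches(lag_ports).items():
    print(f"{setting} differs across LAG members: {values}")
```

With the made-up values above, only `speed` and `duplex` are reported, because port `g48` is the odd one out. An empty result means the members match on every field that was entered.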
 

Author Comment

by:urbanstyles
ID: 34897277
The servers are in close proximity to the rack, so I know that cabling is good.  The runs to the workstations are a different story; I suppose there is a possibility there.  However, I'm experiencing the issue with a laptop plugged directly into the switch, two feet away from the servers.

Everything works fine, with only a few devices attached.  The problem comes once all ports are filled.  I'd love to be able to do this offline, but unfortunately, after hours is my only option.  I will try to force the speed and duplex and see how that goes.
 
LVL 33

Expert Comment

by:paulmacd
ID: 34897290
Cabling can be a factor, but it generally shouldn't prevent communication.  Those are all just troubleshooting steps I would take in a similar situation.  It sounds very much like you've done everything right.

If you stage the switches again and notice intermittent connectivity, try bypassing the wiring plant by running spare cables between a couple machines and the switches to see if connectivity improves (or becomes more reliable).  Make sure all your cables are well manufactured too.  By and large, I would expect faulty cabling to simply result in slow transfer speeds, but EMI/RFI can do funny things at gigabit speeds.

Of course, another option would be to set aside the LAG for now and see if that helps.
 

Author Comment

by:urbanstyles
ID: 34897305
The issue persists even after blowing away the LAG configuration altogether, but yes, the LAG ports are identical.  Basically fresh out of the box, set 4 ports to LAG membership and that was it.  I've done this on ProCurves and 3Coms before, which is why I'm so frustrated.  I fear I'm missing something obvious.  Forcing speed and duplex on 90-some devices is going to take a boatload of time, but I'll give it a shot.
 
LVL 33

Expert Comment

by:paulmacd
ID: 34897364
I wouldn't specify a data rate on the ports or the NICs.  I think that will just be a waste of time and something you'll have to undo later.  

I'd point out that it's not impossible the switches are bad.  Are they running the latest firmware?
 

Author Comment

by:urbanstyles
ID: 34897399
To be honest I haven't checked.  I guess unpacking new equipment and expecting it to work without upgrades isn't always possible.

I'll check the firmware.
 
LVL 4

Expert Comment

by:snowdog01
ID: 34897403
I would not worry about negotiation on the workstations, just your servers, other switches and routers.  The workstations should be fine.  I believe the key is to set this up off-line and get things working before you try to deploy it to production.  Cabling can be an issue if you are trying to use Cat 4 for GB speed.  This could be an issue if auto-negotiate is turned off on that port.  Check out the port speed settings on your servers and make sure they match the new switch.  Your laptop should be able to function through the new switch without issues, if all is working as it should.  If you have a port available on the old switches, plug the new ones in there, with only your laptop on them, so you can complete your setup and testing.
 
LVL 4

Expert Comment

by:snowdog01
ID: 34897432
Certainly take paulmacd's advice and make sure the new switches have the latest firmware/patches installed.  This should not prevent them from working, but you do want them as up to date as possible.
 

Author Comment

by:urbanstyles
ID: 34897477
Once connected, everything shows successfully linked at 1000M FD, and the wiring is most definitely cat5e throughout the building.  I suppose even though the connection SHOWS as properly auto-negotiated, I'll try hard-coding the server NICs to make double sure.

It would appear from the activity lights (which I know can be misleading) that a broadcast storm seems to be occurring, but that is just a thought.
 
LVL 4

Expert Comment

by:snowdog01
ID: 34897516
With only 60 users and a couple of servers, I would not recommend using VLANs unless the business requires them for security reasons.  You should be fine with a flat network schema until such time as the business expands.
 

Author Comment

by:urbanstyles
ID: 34897542
I'm not using VLANs.  I simply would like an aggregate link between the switches for increased bandwidth on an 'inter-switch' basis.  Having 40+ devices on one switch restricted to a single gig link to the other switch that contains the servers may be limiting at some point.  One of our applications in particular is very inefficient in its bandwidth use.
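
To put rough numbers on that concern, here is a back-of-the-envelope sketch of the oversubscription on the inter-switch link. It is only worst-case arithmetic, assuming every device on the second switch transmits at line rate at the same time; the figure of 40 devices at 100 Mb/s is an assumption based on the older devices described in the question.

```python
def oversubscription(edge_ports, edge_speed_mbps, uplinks, uplink_speed_mbps=1000):
    """Ratio of worst-case edge demand to total inter-switch capacity."""
    demand = edge_ports * edge_speed_mbps
    capacity = uplinks * uplink_speed_mbps
    return demand / capacity


# Assumed figures: 40 devices at 100 Mb/s on the second switch.
single_link = oversubscription(edge_ports=40, edge_speed_mbps=100, uplinks=1)
four_link_lag = oversubscription(edge_ports=40, edge_speed_mbps=100, uplinks=4)

print(f"single gigabit uplink: {single_link:.1f}:1")
print(f"4-port LAG:            {four_link_lag:.1f}:1")
```

With those assumed figures the single uplink is 4:1 oversubscribed and the 4-port LAG is 1:1. One caveat: a LAG typically places each conversation on a single member link, so one large transfer would still top out at roughly one gigabit even with four links aggregated.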
 
LVL 10

Expert Comment

by:TekServer
ID: 34898651
You know, a broadcast storm is not outside the realm of possibility; Paul mentioned that the switches themselves could have problems, and I agree - you could have a "chatty" port on one of the switches.

I would recommend that you configure a monitor port on one of the switches, connect a laptop to that switch, and install WireShark on that laptop.  Then monitor the network traffic with your servers and a few workstations connected, and see what's talking.
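
Once you have a capture, a quick tally of who is sending broadcasts can make "see what's talking" concrete. The sketch below is a minimal example that assumes you have already pulled source and destination MAC pairs out of the capture yourself (for instance from an exported packet list); the sample addresses are made up.

```python
from collections import Counter

BROADCAST = "ff:ff:ff:ff:ff:ff"


def broadcast_summary(frames, top=3):
    """frames: iterable of (src_mac, dst_mac) pairs.

    Returns (share of frames that are broadcasts, top broadcast senders).
    """
    total = 0
    talkers = Counter()
    for src, dst in frames:
        total += 1
        if dst.lower() == BROADCAST:
            talkers[src.lower()] += 1
    share = sum(talkers.values()) / total if total else 0.0
    return share, talkers.most_common(top)


# Made-up sample: three broadcasts from one host, one unicast reply.
sample = [
    ("00:11:22:33:44:55", "ff:ff:ff:ff:ff:ff"),
    ("00:11:22:33:44:55", "FF:FF:FF:FF:FF:FF"),
    ("00:11:22:33:44:55", "ff:ff:ff:ff:ff:ff"),
    ("66:77:88:99:aa:bb", "00:11:22:33:44:55"),
]

share, talkers = broadcast_summary(sample)
print(f"broadcast share: {share:.0%}")
print(talkers)
```

A single sender dominating the list would fit the "chatty" port theory. A high broadcast share spread across many senders may point elsewhere, so treat the output as a starting point rather than a diagnosis.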

Relevant links:
Switch Manual (for info on configuring a monitor port, if you need it)
WireShark (free to download, in case you're not familiar with it)

hth
:)
 

Author Comment

by:urbanstyles
ID: 34898808
I'm not sure if I'm working my way to a solution or not, but I reconfigured the LAG, then enabled broadcast control in the ports involved and have slowly moved a few devices over from the old switches to the new ones.. So far so good.  Everyone is talking.  I'll have to wait until this evening to move the servers.. But at least it is looking promising.
 
LVL 33

Expert Comment

by:paulmacd
ID: 34898930
Hey, great!   Good luck.
 
LVL 32

Expert Comment

by:aleghart
ID: 34900134
Good that you're making progress.

I would have done this in a different order:

Upgrade firmware.
Stack the switches using dedicated stacking cable.
If no stacking cable, then use single patch cable for uplink.
Swap out switches.
Reconnect all.
Test for problems.

If everything is working at this point, then it's easy to troubleshoot a LAG.  If you do it all at once, or in a random order, how do you know what's truly failing?

Important question:
Any reason you bought new switches without stacking cables?  If it's not too late, return them.

One 10Gb or 24Gb stacking cable is faster than sucking up ports to make 4x1Gb LAG.

The GS748TS can stack up to 6 tall, and uses a 20Gb (10Gb duplex) stacking cable.  Two ports on the back of each so you can cross-link or make a ring for redundancy.

http://www.netgear.com/service-provider/products/switches/stackable-smart-switches/GS748TS.aspx
 

Author Comment

by:urbanstyles
ID: 34902119
I've checked.. and the firmware is current.  These are the non-stacking switches.  I agree they would have been a better option, but the local vendor had these in stock and I purchased them before realizing that a stackable option was available.

What is it they say of hindsight?
 
LVL 32

Expert Comment

by:aleghart
ID: 34902186
Hindsight = swift returns with receipt in-hand.  If they'll take it, send it back.  No shame.

I've returned a lot of stuff and taken a hit on shipping and re-stock fees.  Cheaper in the long run to get the right equipment.

The stacking models are $200-300 more, but will be much easier to maintain.  If you're paying for Netgear tech support (highly recommended), then they'll spend less time troubleshooting and re-creating your LAG whenever you call.

CDW carries the GS748TS for $862.  Amazon for $865, and you can have them by Thursday with overnight shipping.  Check on the stacking cables...but you can always use 1-port uplink with a standard patch cable until stacking cables arrive.

There is a separate SKU for the extended warranty, but IIRC, you can add that in the first 30 days (double-check that info though).
 

Accepted Solution

by:
urbanstyles earned 0 total points
ID: 34962772
Apologies for taking some time in getting back to this.  Finding a time with no users on the system is difficult this time of year.  Turns out that in my fatigue I had one cable looped back into the switch, causing the broadcast storm.  Eliminated that cable, plugged the rest in and voila, full gig performance.
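
For anyone who lands here with the same symptoms, here is a rough sketch of why one looped cable is enough to swamp a switch. It is a toy model, not a simulation of the GS748T: it assumes a single switch that floods every broadcast out of all ports except the one it arrived on, no spanning tree or loop protection in play, and made-up port numbers.

```python
def flood(num_ports, loop=None, ticks=5, source_port=1):
    """Count copies of ONE broadcast frame delivered to host ports over `ticks` rounds.

    loop: optional (a, b) pair of switch ports joined by a single patch cable.
    """
    peer = {}
    if loop:
        a, b = loop
        peer = {a: b, b: a}
    arriving = [source_port]  # ingress ports of frames reaching the switch this round
    delivered = 0
    for _ in range(ticks):
        next_arriving = []
        for ingress in arriving:
            for egress in range(1, num_ports + 1):
                if egress == ingress:
                    continue  # a switch never floods back out of the ingress port
                if egress in peer:
                    # The frame leaves on one end of the looped cable and
                    # re-enters the switch on the other end next round.
                    next_arriving.append(peer[egress])
                else:
                    delivered += 1  # a copy reaches whatever is on this port
        arriving = next_arriving
    return delivered


print("no loop:  ", flood(48))
print("with loop:", flood(48, loop=(47, 48)))
```

Without the loop the frame is flooded once and is gone. With ports 47 and 48 patched together, two copies keep re-entering the switch and are flooded to every other port on every round, and since Ethernet frames carry no hop limit they never expire on their own. Every further broadcast adds two more circulating copies, which is consistent with things working with a few devices attached and falling over once everything was plugged in.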

I much appreciate the brainstorming and feedback, it definitely helped me arrive at the solution.

This site truly is great for that.
 

Author Closing Comment

by:urbanstyles
ID: 34995411
Was able to troubleshoot my own error in connections.
