denver218 (United States of America) asked:

Cisco Core Switching design

I am in the process of reviewing the switching at a datacenter I set up a few years ago. Below you will see a diagram showing what the network currently looks like. This is a small datacenter that is starting to grow rapidly, so I want to make sure I am on the right track for expansion. I have two 3550-12Gs; these are my core switches, and the second 12G is there for redundancy. I have three networks at this datacenter, all represented in the diagram. The 10.4.0.0/16 network is the one I am concerned about, as it is expanding rapidly. As you will see, all switches share a common VLAN (VLAN 20) and are in the same VTP domain. If you look at the core switches, the first route statement goes to the firewall and the other route statements go to the network switches. As I said, my concern is the 10.4.0.0/16 network, where I am about to add a third switch. The core switch has a route statement to 192.168.12.40, which is the VLAN 20 IP address of the first switch on the 10.4.0.0/16 network. If I expand this network to, say, 5 switches, all traffic will go through that first switch. I am concerned about this switch becoming over-utilized. Any thoughts?

[network diagram]

mat1458

I would change your design to a simpler layer 2 design. If you are concerned about the load on the core switches, you can distribute the load by setting different root bridges for each VLAN (40 on the left core switch; 50 and 60 on the right core switch). I would further distribute all VLANs to all switches to enhance flexibility, and I'd attach all switches directly to the core switches (triangle design). I'd let the ASA do the IP termination for all VLANs.

Not knowing your communication patterns, it is hard to say whether performance could be an issue. If it really is: have you considered buying more recent hardware that allows for EtherChannel (bundling multiple 1Gb interfaces) or 10Gb interfaces? All your switches are considered End-of-Life by Cisco (http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps646/prod_end-of-life_notice0900aecd8029f777.html)
denver218

ASKER
I am stuck with this hardware for now; I already tried to get new hardware, but it wasn't approved.  I am concerned about switch 10.4.0.1 being overloaded if I add a few more switches.  If you look at the core switch, I have the following route statement:

ip route 10.4.0.0 255.255.0.0 192.168.12.40 (192.168.12.40 is the VLAN 20 IP on this switch)
 
Having a setup like this means that all traffic has to pass through this switch (10.4.0.1) first, right?  If I had 4 or 5 switches on this subnet, that may be too much traffic for the 10.4.0.1 switch to handle.
mat1458

Do you have any statistics that show you that the switch is overloaded? Actually, a switch is a switch, and it normally switches at wirespeed as long as you don't try to add fancy features to the ASICs. If you don't have too many systems that simultaneously need the available uplink bandwidth, you should be OK. (But as I said before, without knowing the traffic patterns it is hard to say.)

If much of the communication is towards/from the internet you won't have a problem, since your internet link is probably less than 1Gbps. If you have lots of servers that users connect to, then you might have issues.

As long as all the users are in one VLAN, you can't avoid the one switch being used as transit. If your clients are all on DHCP you might think of introducing a second VLAN (say, VLAN 41) for the users, with its root bridge on the other switch. With that you already double the capacity of the client access bandwidth. And again, if you distribute the VLANs among all switches you might gain some more capacity on the uplinks.
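If you introduce that second VLAN, the split is just a per-VLAN spanning-tree priority on each core switch. A minimal sketch (using VLAN 41 from the suggestion above; the root macro picks the actual priority values):

! core switch 1: root for the existing VLAN 40, backup root for the new VLAN 41
spanning-tree vlan 40 root primary
spanning-tree vlan 41 root secondary

! core switch 2: the mirror image
spanning-tree vlan 41 root primary
spanning-tree vlan 40 root secondary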
denver218

ASKER
I do have switch 10.4.0.1 being monitored using SolarWinds Orion.  Memory and CPU are fine, and the uplink port that goes to the 3550-12G isn't being over-utilized, so maybe the way I am doing it is fine until I get new hardware. I will just keep monitoring and worry about it once I see heavy utilization.  I will be adding two more switches to the 10.4.0.0/16 network later this week, making a total of four switches.

The datacenter is used mostly for hosting applications.  For example, we host users' Exchange mailboxes and some users' business applications.  We have some users who connect using the Cisco VPN client to access their apps, some use Citrix to access their apps, etc.

Right now in production I have only the two switches you see on the 10.4.0.0/16 network; I will be adding two more this week.  The IP addresses of the two current switches on this network are 10.4.0.1 and 10.4.0.2.  Currently, all servers, regardless of which of the two switches they plug into, use 10.4.0.1 as the default gateway.  Is this good practice, or should I use, for example, 10.4.0.2 as the gateway for servers plugged into that switch?  I don't want to use the core switch IP for the gateway (192.168.12.200) because the core switches are there for redundancy: if the servers have 192.168.12.200 as the gateway and that core switch fails, they won't know to use 192.168.12.201 (the redundant core switch).  Thanks.
harbor235

3550s are robust switches and can handle a good amount of traffic. I would try to use what you call a core switch as a distribution switch. Load-balance your VLANs across primary and secondary distribution switches, balancing load and exit points. The real core switch is the 3550 that connects to the ASA. A dual-homed strategy will also help balance the traffic.

Question 1, why are you using a /16 for 10.4.0.0?
Question 2, does vlan 40 exist at the core switches?
Question 3, are you using virtualization?

I would move the layer 3 interfaces to the core/distro, add separate connections from each current VLAN 40 switch to each core/distro switch, and establish multiple spanning-tree domains: one core-to-distro and several distro-to-access. Remove the new core switch from VLAN 20, completing the isolation of the internet-facing spanning tree from the customer-services-facing spanning tree.

This scales better, allowing additional switches to be added easily, and it also minimizes your layer 2 hops. The choke point will be the new core switch, so you will need two, and you need dual firewalls and routers. This solution gives you more scale, virtualization excluded.
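As a rough sketch of what moving the layer 3 termination looks like (addresses taken from earlier in the thread; the SVI address here is only a placeholder):

! on a 3550-12G core/distro switch: terminate VLAN 40 locally
interface Vlan40
 ip address 10.4.0.254 255.255.0.0

! the static route that pointed at the access switch is then no longer needed
no ip route 10.4.0.0 255.255.0.0 192.168.12.40

With the SVI on the core/distro, the access switches stay pure layer 2, and new switches just trunk up to both core/distro switches.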


harbor235 ;}
denver218

ASKER
Question 1, why are you using a /16 for 10.4.0.0?
The /16 network was not set up by me; I know it is not realistic to have a /16 network segment.  I won't allow that network segment to get too big.  I could change it at some point when I have downtime.

Question 2, does vlan 40 exist at the core switches?

All switches share a common VLAN, which is VLAN 20 (192.168.12.0/24).  All switches have VTP configured.  If I do a show vlan on any of the switches I can see all VLANs (VLAN 20, VLAN 40, VLAN 50, and VLAN 60).

Question 3, are you using virtualization?
Yes we are using some virtualization (VMWare, Hyper-V)
harbor235

I would move in the direction I suggested; it breaks up your layer 2 domains, adding availability by reducing your fault domain. I am advocating a classic core/distribution/access model: layer 2 from access to distribution, layer 3 from distribution to core. This design was the de facto DC design until virtualization appeared; now a flat layer 2 architecture is required for scaling vMotion.

Layer 3 termination on the 3550-12Gs allows you to balance your VLANs and add additional switches easily for increased port density.

If you scale this design your choke points will be at the edge; additional redesign of security and load balancing (if needed) will be needed to continue to scale, or you will need a much larger, datacenter-class device. But the architecture will help you scale.

harbor235 ;}
denver218

ASKER
Thanks harbor235.  Are you saying I should assign the 3550-12G an IP address in the VLAN 40 range, then trunk the network switches to it?  At this point I would use the IP address of the 3550-12G as the gateway for all servers.  What would happen, though, if the 3550-12G failed?  It will be the gateway of all the servers, if I am understanding correctly?  Could you maybe provide a quick example.  Thanks.
mat1458

You have to look at the issues you want to solve and then decide on a design that helps to solve them.

1. too much load on core switch left
2. loadbalancing on the core switches
3. overbooking of uplinks to core
4. resiliency of core switches
5. best practices design

answers:
1. too much load on one switch can be addressed by splitting traffic among the two core switches. in your case there are multiple VLANs; splitting them can help bring the load down
2. see above
3. distribute the VLAN ports to more switches so you have more uplink capacity. if it makes sense geographically, use all VLANs on all switches
4. resiliency of core switches can be achieved through HSRP on the VLAN interfaces, which are then only on the core switches. Get rid of the L3 design toward the access. It was popular when the 3550 first showed up; it isn't anymore in environments like yours
5. L3 core/distribution, L2 access

HTH
harbor235

Yes, but I would also run HSRP between primary and secondary for all VLANs. Make 3550-12G-1 primary (layer 2 and layer 3) for VLANs 40 and 60, and secondary for 20 and 50. 3550-12G-2 would be primary for 20 and 50, secondary for 40 and 60. You accomplish this by setting the root bridge for the VLANs appropriately and by configuring the HSRP VIPs appropriately as well.

This way you distribute the load outbound; new VLANs are added by balancing them over the two switches, which are now active/active, not active/backup. If one fails you have a backup, because the other switch is secondary root bridge and backup for HSRP, you see?
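A sketch of that pairing for one VLAN (the group number and priority are illustrative; the key point is that STP root and HSRP active sit on the same switch):

! 3550-12G-1: STP root and HSRP active for VLAN 40
spanning-tree vlan 40 root primary
interface Vlan40
 standby 40 ip 10.4.0.1
 standby 40 priority 110
 standby 40 preempt

! 3550-12G-2: secondary root and HSRP standby for VLAN 40
spanning-tree vlan 40 root secondary
interface Vlan40
 standby 40 ip 10.4.0.1
 standby 40 preempt

Then swap the roles for the VLANs that should prefer the second switch.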

Your choke points will be at the edge where you will need additional devices or a data center class device.

harbor235 ;}
mat1458

Be aware that when using HSRP you have to put the root bridge on the same core switch as your primary HSRP peer. Otherwise you end up in flooding scenarios which impact your performance.
denver218

ASKER
Thanks guys, I really appreciate your help so far.  I did some research on HSRP; does the configuration below look good for the two 3550-12Gs?

Primary 3550-12G

interface vlan20
 ip address 192.168.12.201 255.255.255.0
 standby 1 ip 192.168.12.200
 standby 1 priority 110
 standby 1 preempt

interface Vlan40
 ip address 10.4.0.2 255.255.0.0
 standby 1 ip 10.4.0.1
 standby 1 priority 110
 standby 1 preempt

interface Vlan50
 ip address 10.5.0.2 255.255.254.0
 standby 1 ip 10.5.0.1
 standby 1 priority 110
 standby 1 preempt

interface Vlan60
 ip address 10.6.0.2 255.255.254.0
 standby 1 ip 10.6.0.1
 standby 1 priority 110
 standby 1 preempt



Secondary 3550-12G

interface vlan20
 ip address 192.168.12.203 255.255.255.0
 standby 1 ip 192.168.12.200

interface Vlan40
 ip address 10.4.0.3 255.255.0.0
 standby 1 ip 10.4.0.1

interface Vlan50
 ip address 10.5.0.3 255.255.254.0
 standby 1 ip 10.5.0.1

interface Vlan60
 ip address 10.6.0.3 255.255.254.0
 standby 1 ip 10.6.0.1
harbor235

The idea is to balance the traffic load. The default HSRP priority is 100; 110 means active, 100 is secondary. So you need to split the load, which means two of the four VLANs on switch 1 would use 110 and two on switch 1 would use 100, and the mirror image on switch 2, you see?

You also need HSRP group IDs and additional config as well; I recommend passwords (authentication) and failback (preempt) settings.
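For example (the group numbers and password string are just placeholders; a common convention is to use the VLAN number as the group ID):

! 3550-12G-1: active for VLAN 40, standby for VLAN 50
interface Vlan40
 ip address 10.4.0.2 255.255.0.0
 standby 40 ip 10.4.0.1
 standby 40 priority 110
 standby 40 preempt
 standby 40 authentication mysecret
interface Vlan50
 ip address 10.5.0.2 255.255.254.0
 standby 50 ip 10.5.0.1
 standby 50 priority 100
 standby 50 preempt
 standby 50 authentication mysecret

The second 3550-12G mirrors this with 110 on the VLANs it should be active for and 100 on the others, using the same group IDs and passwords.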

harbor235 ;}
denver218

ASKER
My mistake, I will change that so the load is balanced.  I meant to do that, but it slipped my mind.  Could you give me an example of configuring the hsrp group ids?
harbor235

Remember to also mirror root and backup root bridges with hsrp active and secondaries.
We want efficient traffic flow in and out.

Not sure about your inbound traffic architecture, typically you would nail down a layer3 pathway to ensure proper balancing as well.

harbor235 ;}
denver218

ASKER
harbor235, can you please provide an example of configuring the group ids and how to mirror the root and backup root bridges with hsrp active and secondaries.  I appreciate all your assistance.
ASKER CERTIFIED SOLUTION
mat1458

GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
SOLUTION
harbor235

denver218

ASKER
The uplink port on both 3550-12Gs is gi0/1.  This goes to the switch that sits behind the firewall.

So you are saying I should track that port?  So in each VLAN interface I should add the following:
standby track gigabitethernet 0/1

Now, the 3550-12Gs connect to each other via GigabitEthernet0/11 on both switches.  How do I track that connection?  Thanks.
mat1458

In my opinion, link failures are detected by STP/RSTP, so there is no need for interface tracking in HSRP. You only add complexity by having two mechanisms deal with the same failure. HSRP should only help with total outages of one of the core switches.

The protection of the root bridge, in my opinion, can be done with root guard. When I have a numerical value that I can set myself, I rarely push it to the limit, so I still have the possibility of taking a higher value if I need to. The settings for the root/secondary have to be multiples of 4096 in the PVST/RPVST that you probably have in place.
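The two pieces look like this (the priority value and interface name are just examples; note that 8192 is a multiple of 4096, leaving room below it):

! on the core switch: explicit bridge priority instead of the root macro
spanning-tree vlan 40 priority 8192

! on the core downlinks toward the access switches
interface GigabitEthernet0/2
 spanning-tree guard root

With root guard, a port that receives a superior BPDU goes root-inconsistent instead of letting a downstream switch take over as root.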

The 10-second delay for the preemption might indeed be a bit long. You might want to set this according to the service levels in your organisation. For the reload delay, you might measure the actual time for a reload of one of your core switches. As I don't have a 3550 available, the number was an assumption.
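For reference, the delays go on the standby line (the 10-second and reload values are just the numbers being discussed, not recommendations):

interface Vlan40
 standby 40 preempt delay minimum 10 reload 120

The minimum value delays preemption after an interface comes up; the reload value applies a longer delay after a reboot, so the switch can finish converging before it takes traffic back.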
harbor235

As far as tracking goes, I assumed the uplinks were layer 3 links; I looked at it again and they are not. I agree with mat1458.

Root bridge protection is personal preference; I would nail it down. We are not talking about an enterprise design but a datacenter here.


harbor235 ;}
mat1458

Cool with me as well, no offense meant. It's really a matter of taste.
denver218

ASKER
Thanks for your help and the configurations.  I am off to a good start.
harbor235

mat1458, no offense taken, good job

harbor235 ;}