Network Design needed - Resilient and Fast!
Posted on 2007-08-02
I'm upgrading my 10-year-old network architecture and could use some good advice! Here's what I've got now:
- 4 x HP ProCurve 4000M 10/100 managed switches connected in a fully meshed topology. Each switch has a 1Gbps mesh connection to each of the other three switches.
- The mesh function ties the switches together over multiple physical routes, giving (1) non-stop protection against link failure, and (2) higher switch-to-switch bandwidth.
- Two switches are located in the basement data center, and two are on the second floor serving desktops (I expect to add the first floor in the next few years).
- If a second floor switch fails, about half of the desktops are affected, and I can restore those by taking them off the failed switch, and plugging them into the survivor.
- If a data center switch fails, all critical servers are connected with redundant NICs to separate switches, so they maintain their connections on the surviving switch.
- Everything is on one IP subnet, with no defined VLANs. There are about 100 connected systems.
- We have a single Internet connection: two T-1s bonded in a Cisco 3640, feeding the LAN through a firewall box that's decent, though it doesn't route very well.
- All systems (web servers, etc.) reside "inside" on the single LAN.
- Because of the simple configuration, I can pretty much plug and unplug things anywhere on the network without trouble.
- It's *really* simple to manage, obviously.
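For anyone who wants to sanity-check the numbers on the current setup, the full-mesh arithmetic works out like this (a quick illustrative sketch, not anything from my actual configs):

```python
# Back-of-envelope numbers for the existing 4-switch full mesh.

def full_mesh_links(n_switches: int) -> int:
    """Number of point-to-point links in a full mesh of n switches."""
    return n_switches * (n_switches - 1) // 2

switches = 4
links = full_mesh_links(switches)      # 4 * 3 / 2 = 6 mesh links
uplinks_per_switch = switches - 1      # each switch burns 3 ports on the mesh

print(f"{links} x 1Gbps mesh links, {uplinks_per_switch} uplink ports per switch")
# If any single switch dies, the surviving 3 still form a full mesh,
# which is why a failure never isolates the remaining switches.
```

That "survivors still form a full mesh" property is exactly what I'm afraid of losing in a cheaper non-meshed design.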
What's driving the upgrade:
- VoIP deployment on the LAN in the next 90 days; so I want more LAN security and performance headroom.
- Rapid growth in the Internet-facing applications; so I want to get them out of the LAN, and make them more scalable.
- Get Gigabit connectivity, especially among the servers, but retain the same resilience (or better) that I have now (i.e. drop a switch, and keep running).
- Improve/enforce security by segmenting the network into LAN and a few DMZs.
- Create an infrastructure that will last for the next 10 years (assuming no huge growth spurts, but be able to roll with them if they occur).
- Keep it simple to manage.
My thoughts on how to solve this (and I have diagrams if anyone is interested): Some of this is conceptual, and some revolves around particular products used as examples. If I'm off-base on the concepts, then please correct me there. I don't want to get sidetracked discussing this product vs. that one if the big picture is wrong to start with.
A. Create a topology where a capable *redundant* routing firewall (a Juniper SSG140 is in my mind here) controls access to and from
(1) the Internet (eventually with redundant links);
(2) an "Internet" DMZ, where the web servers, DNS, e-mail proxy, etc. reside;
(3) an "Intranet" DMZ, where company apps, VoIP proxy, VPN access, etc. reside;
(4) the internal LAN, and possibly
(5) a couple of other subnets for server/device management, sandbox systems, etc. These could possibly be set up as VLANs riding on the internal LAN's gear instead.
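To make (1) through (5) concrete, here's one hypothetical way the zones could be carved up using Python's ipaddress module. The 10.0.0.0/16 range, the /24 sizes, and the zone names are placeholders I made up for illustration, not a recommendation:

```python
# Hypothetical subnet plan for the five zones in design A.
import ipaddress

# Placeholder addressing; substitute your own RFC 1918 ranges.
supernet = ipaddress.ip_network("10.0.0.0/16")

# One /24 per zone covers ~200 LAN hosts with headroom; the LAN could
# later be widened to a /23 without renumbering the DMZs.
zones = ["internet-dmz", "intranet-dmz", "internal-lan", "mgmt", "sandbox"]
subnets = dict(zip(zones, supernet.subnets(new_prefix=24)))

for name, net in subnets.items():
    print(f"{name:14} {net}")   # e.g. internal-lan   10.0.2.0/24
```

The point being: once the zones are distinct subnets behind the firewall, moving the mgmt/sandbox networks onto VLANs on the LAN gear later is just a re-homing of one /24, not a redesign.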
B. Upgrade the LAN switches to Gigabit. At least for the data center, and preferably for all of them (management asks: why can't we get gig to the desktop for all this money?). For longevity, I'd look for the highest switch bandwidth and lowest latency for the buck. Issues:
(1) I still see the internal LAN as essentially one subnet (it may grow to 200 servers/desktops). I'd love to set the switches up meshed as now, but in the HP line, at least, they've restricted meshing to the high-end switches (e.g. the 5400zl at $100+/port). I might swing this in the end, but I'd also be paying for a lot of functions I'm not sure I'll ever use. On the other hand, they have a lot of bandwidth...
(2) Interconnects are expensive. It's great to have fast switches, but how to connect them? 10GbE is out of the price ballpark, so best I can do is trunk 1Gb links. Again, meshing helps here (though with lots of ports consumed).
(3) If I go with a non-mesh setup (e.g. using 4200vl switches), how does my fault-tolerance fare? In other words, I could see two switches in the data center with trunked links between them for a high-speed core. Then, a set of trunks from one DC switch to one 2nd floor closet switch, and another set of trunks from the 2nd DC switch to the other 2nd floor switch. I lose something here, since a DC switch failure also kills a second floor switch. Do you combine all this with spanning tree for link redundancy? Yuck, seems complex and error-prone. And how do I connect my redundant server NICs to the DC switches in this case so that they behave as nicely as they do now? I'm really unclear on this.
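To put rough numbers on the port cost of the two interconnect options above, here's a back-of-envelope comparison. The two-link trunk width is just an assumption for the example, and the "non-mesh" layout is the three-trunk one I described in (3):

```python
# Port-cost comparison: full mesh of trunks vs. the three-trunk layout.
# Pure arithmetic on the topologies described above; trunk widths are guesses.

def mesh_trunk_ports(n_switches: int, links_per_trunk: int) -> int:
    """Total switch ports consumed if every switch pair gets a trunk."""
    pairs = n_switches * (n_switches - 1) // 2
    return pairs * 2 * links_per_trunk   # each link burns a port at both ends

def chain_trunk_ports(n_trunks: int, links_per_trunk: int) -> int:
    """Ports for the non-mesh layout: DC1-DC2, DC1-Floor2a, DC2-Floor2b."""
    return n_trunks * 2 * links_per_trunk

full_mesh = mesh_trunk_ports(4, links_per_trunk=2)   # 6 pairs * 4 ports = 24
non_mesh  = chain_trunk_ports(3, links_per_trunk=2)  # 3 trunks * 4 ports = 12
print(f"full mesh: {full_mesh} ports, non-mesh: {non_mesh} ports")
```

So the meshed option eats twice the gigabit ports here, which is the "lots of ports consumed" problem from (2) in hard numbers; the non-mesh option halves that but, as noted, turns a DC switch failure into a floor outage too.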
So, that's it in a nutshell. There are obviously many more questions to be answered, but I think this covers the big issues, and I wanted to get it out there. I have not been up on all the latest products the last few years, and things have exploded. There seems to be a lot of functional overlap among routers, switches, and firewalls, so how do you get what you need without wasting a bunch? I've been wrestling with this for a week, reading up on current products and design ideas, but then thought I should turn it over to the folks with real-world experience. I'd love to hear from you! Thanks!!