sk33v3

asked on

Clustering

When clustering Windows Server 2003 so that several servers can reply to TCP requests on the same IP, how is that handled? I.e., does the software on each server determine whether to pick up the TCP request, or is that done at the operating-system level, so that any TCP application can run?
ASKER CERTIFIED SOLUTION
LauraEHunterMVP
sk33v3

ASKER

OK, with NLB, if one of the servers does fail, what happens? Does the query fail, or does the next server in the chain pick up the request?
Yes, NLB in Windows Server 2003 will detect failed nodes as well as nodes that are added to or removed from the NLB.  (That might not have been the case with Windows 2000, but it's been a while so I can't remember with certainty.)
sk33v3

ASKER

Your post is a little vague. You mentioned that it does detect failed nodes; does it then forward the request to the next node? Also, on a TCP connection, will the connection stay established to the individual server it started with, or is it possible it will be routed to a second server within a single TCP connection?
Perhaps I misunderstood your question, then; I apologize.  Are you asking what happens if a host fails while a client is in the middle of a connection?  I.e., ClientA gets routed to ServerB, and ServerB fails while ClientA is browsing the site/using the application/whatever?  In this case NLB will not initiate any form of failover - ClientA will just receive a "You lost your connection" error of some sort.

The failure detection mechanism has more to do with the convergence process that takes place whenever nodes join or leave an NLB cluster - NLB will detect the added or removed node and update its table of available hosts that can accept new connections.
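
To make the idea concrete, here's a rough Python sketch of heartbeat-driven convergence. It's purely illustrative - the node names, intervals, and logic are my own assumptions, not NLB's actual implementation (see the TechNet link below for that):

# Rough conceptual sketch only -- not NLB's actual algorithm or API.
# Illustrates heartbeat-driven "convergence": when a node misses enough
# heartbeats it is dropped from the table of hosts that can accept
# new connections.
import time

HEARTBEAT_INTERVAL = 1.0   # illustrative seconds between heartbeats
MISSED_LIMIT = 5           # missed heartbeats before a node is considered dead

class MembershipTable:
    def __init__(self, nodes, now=None):
        now = time.monotonic() if now is None else now
        # last time a heartbeat was seen from each node
        self.last_seen = {node: now for node in nodes}

    def record_heartbeat(self, node, when=None):
        self.last_seen[node] = time.monotonic() if when is None else when

    def converge(self, now=None):
        """Drop nodes whose heartbeats have gone silent; return live nodes."""
        now = time.monotonic() if now is None else now
        deadline = HEARTBEAT_INTERVAL * MISSED_LIMIT
        return [n for n, seen in self.last_seen.items() if now - seen < deadline]

start = time.monotonic()
table = MembershipTable(["server1", "server2", "server3"], now=start)
# server1 and server2 heartbeat again 8 "seconds" in; server3 has gone silent.
table.record_heartbeat("server1", when=start + 8)
table.record_heartbeat("server2", when=start + 8)
print(table.converge(now=start + 10))   # ['server1', 'server2']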

Your best technical reference for NLB is going to be found here: http://technet2.microsoft.com/WindowsServer/en/library/c1db8c13-da31-4541-81d8-e2b3ebe742fb1033.mspx?mfr=true - this references the actual DLLs and algorithms that are in use.

HTH.
sk33v3

ASKER

Let me give you an example.

Let's say servers 1-10 are set up to respond on 10.10.10.5 using NLB.

When a client application tries to connect to 10.10.10.5, the next server in the list is server 4, but server 4 does not respond to the request. Does the NLB take the request and route it to server 5, or does the connection just fail?

Next example
If a client application has established a TCP connection with server 5, will all of the packets for that TCP connection always be routed to server 5? I would assume so, but I want to verify that.

In example 1, if the NLB still thinks that server 4 is "alive", it will continue to route client requests to it until it actually detects the failure. I don't know the exact metrics of the "detect-alive" algorithm (and it would be partially dependent on your hardware and bandwidth usage -anyway-), but there will definitely be some amount of convergence time before NLB realizes that server 4 has failed, during which client requests are still being routed to server 4 and will error out.
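
If it helps, here's a hypothetical client-side sketch of how an application might cope with that convergence window - treat a failed connect as retryable and open a new connection, which NLB then re-evaluates. The address, port, and helper are assumptions for illustration, not anything NLB itself provides:

# Hypothetical client-side workaround, not part of NLB: if a request lands on
# a node that has just died (before convergence removes it), the connect or
# read fails, so the client retries with a fresh connection.
import socket
import time

VIP = "10.10.10.5"   # the shared NLB address from the example
PORT = 8080          # illustrative port

def request_with_retry(payload: bytes, attempts: int = 3, timeout: float = 5.0) -> bytes:
    last_error = None
    for attempt in range(attempts):
        try:
            with socket.create_connection((VIP, PORT), timeout=timeout) as sock:
                sock.sendall(payload)
                return sock.recv(4096)
        except OSError as exc:          # connection refused, reset, or timed out
            last_error = exc
            time.sleep(1.0)             # back off, then open a *new* connection
    raise ConnectionError(f"all {attempts} attempts failed") from last_error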

The answer to your example 2 will depend on how stateful the application is, since each new connection request is re-evaluated by NLB and might be sent to a different server in the NLB cluster.  For this reason Microsoft specifically recommends that NLB only be used for predominantly stateless applications, as stateful applications will break if NLB decides to route part of the conversation to a different node than the one on which the transaction began.  So if I'm browsing a static web site hosted by an NLB cluster, I might be getting the HTML for the landing page from node 1, the target of a followed HREF from node 2, and perhaps even a jpeg file -within- that page from yet another.
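
Here's a loose sketch of the general idea behind that distribution (not NLB's real hashing algorithm - the node list, hash, and affinity labels are just assumptions for illustration): with no affinity, the target node is derived from the client's IP and port, so two connections from the same client can land on different nodes; with single affinity, only the IP is hashed, so a client sticks to one node.

# Sketch of hash-based distribution, not NLB's actual implementation.
import hashlib

NODES = ["node1", "node2", "node3", "node4"]

def pick_node(client_ip: str, client_port: int, affinity: str = "none") -> str:
    # "single" affinity keys only on the client IP; "none" keys on IP + port.
    key = client_ip if affinity == "single" else f"{client_ip}:{client_port}"
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# Same client, two different source ports: the connections may hit different nodes.
print(pick_node("192.168.1.50", 50001))
print(pick_node("192.168.1.50", 50002))
# With single affinity, both connections map to the same node.
print(pick_node("192.168.1.50", 50001, "single"))
print(pick_node("192.168.1.50", 50002, "single"))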
sk33v3

ASKER

OK, sorry, just one more example, I think.

I have an application; let's just say it's the Firebird database server running on 4 machines. The Firebird server requires you to log in, and the connection is a TCP connection. Would a Firebird database connection possibly have problems? From my understanding of your explanation of example 2 above, each packet on a TCP connection could be routed to a different server within the load-balancing network. Is this correct?
Databases as a rule assume a stateful connection between client and server in order to maintain data integrity - remember the ACID principles from your undergrad databases class, where each transaction needs to maintain Atomicity, Consistency, Isolation, & Durability.  (Can't believe I still remember that.)

If you're looking at creating a high-availability solution for a database, then you will probably want to move back to the original solution of clustering, in which you have multiple physical boxes using a shared storage medium (typically a SAN).  The major difference here is that, while an NLB will have multiple resources all online at once and all answering requests in a round-robin manner, clustering typically relies on an "Active/Passive" model where only one node is actually online and responding to requests, while the other node(s) are in "ready to spring into action" mode to take over responding to requests should the active node fail.  Because there's one consistent resource responding to requests (rather than an unpredictable round-robin list of hosts in an NLB), clustering is better-suited to state-heavy applications like databases, Exchange, etc.

A good example would be to think NLB for your front-end web servers and clustering for your back-end databases.  See the following for more: http://technet2.microsoft.com/WindowsServer/en/library/c35dd48b-4fbc-4eee-8e5c-2a9a35cf63b21033.mspx?mfr=true
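
To illustrate the Active/Passive contrast in rough pseudocode terms (a loose sketch, not the Windows cluster service or its API - the node names and health-tracking are assumptions): only the active node ever answers requests, and a standby takes over the same role when it fails.

# Loose illustration of the active/passive idea; not a real cluster service.
class Cluster:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.failed = set()
        self.active = self.nodes[0]          # exactly one active node at a time

    def mark_failed(self, node):
        self.failed.add(node)

    def failover(self):
        standby = [n for n in self.nodes if n not in self.failed]
        if not standby:
            raise RuntimeError("no healthy standby node available")
        self.active = standby[0]             # standby assumes the active role

    def handle_request(self, request):
        if self.active in self.failed:
            self.failover()
        return f"{self.active} handled {request!r}"

cluster = Cluster(["sql-node-a", "sql-node-b"])
print(cluster.handle_request("SELECT 1"))    # sql-node-a handles it
cluster.mark_failed("sql-node-a")
print(cluster.handle_request("SELECT 1"))    # sql-node-b takes over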
sk33v3

ASKER

OK, so for the last example, would Component Load Balancing (CLB) be what I need? From my reading of that, it sounds like it would handle the above situation perfectly.
I'm not specifically familiar with CLB, as it used to be an add-on component (it was part of Microsoft Application Center 2000, if I recall correctly) and I've never had an occasion to implement it.

As I understand it, CLB is used to balance clusters of servers that activate COM+ components; the CLB software is responsible for determining the order in which COM+ cluster members are used for activating those components.  I'm not a developer (and I don't even play one on TV), so I'm not sure if that will be applicable to your situation or not.
I'd say I answered it rather comprehensively, but that's just me.  :-)
sk33v3

ASKER

I would say you answered it perfectly. Thanks and sorry for the delay.