Setting up a Citrix Netscaler Load Balancer HA pair

Before I go too far, let's explain HA (High Availability) and why you should consider it.  High availability is a mechanism that provides redundancy for a service at a single site while appearing as a single service to its users.  This is as opposed to DR (Disaster Recovery), which provides redundancy by mirroring the same services at disparate sites.  HA is needed in case of component failure.  In the case of this article, we are looking at a netscaler failure, a failure of an uplink the netscaler depends on, a hard drive failure, or any other failure that may make a netscaler inoperable.  With an HA pair, a second netscaler that stands "idle" will detect that its peer has failed and take over the load balancing duties automatically.  This ensures higher availability to the users (hence the HA name) and minimizes impact.  In turn it gives you as the admin breathing room to repair the inoperable node without everyone hounding you to get services working again.  I hope this explains, although briefly, what HA is and why using HA with a netscaler would be beneficial to you.

When setting up a netscaler HA pair, there are a few things to consider:

1) You don't have to have the devices directly linked to one another like some devices require for HA to work
2) Hardware platforms must be identical, meaning you can't have a Netscaler 22500 and an 11500 be peers
3) It is not 100% necessary to have the same version on both netscalers, though for obvious reasons it would be a mistake to run that way for long beyond the duration of a "rolling upgrade"

There are a few steps to take when setting up the pair:
1) Determine the physical configuration of the netscalers
2) Determine the logical configuration of the netscalers
3) Determine the version to use
4) Assign IPs to the netscalers and decide which types of IPs to use
5) How to create the HA pair and validate it is healthy
6) EXTRA: how HA works
7) EXTRA: config examples

Step 1 - Determine physical configuration of the netscalers
This can be simple or complex; it wholly depends on your architecture and your needs, since there are typically many interfaces and several types of interfaces.

Mgmt interfaces - These are typically numbered with a leading zero; e.g. 0/1, 0/2.  We will use 0/1 in our examples.  The purpose of mgmt interfaces is for anything mgmt data related (obviously right?)  This means when you SSH or use the Web GUI to change the configuration, you use the mgmt interfaces as that is where the IPs should be bound.  This also goes for traffic that originates from the netscaler that is of mgmt concern; e.g. syslog, ntp, dns, snmp trap, etc.

Data interfaces - These are typically numbered with a non-zero leading number; e.g. 1/1, 1/2, 10/2.  The 1 denotes a 1Gb interface whereas the 10 denotes a 10Gb interface.  Oftentimes these interfaces will be put into an LACP configuration that provides a "single" pipe for the data, as well as allowing the netscaler to take advantage of its bandwidth capabilities more easily.  The purpose of these interfaces is to handle any user data.  This encompasses any traffic coming from the client to the load balancer and any traffic originating from the load balancer going to the servers that host the applications being load balanced.
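As an aside, an LACP channel can be sketched roughly like this on the netscaler (the interface numbers and lacpKey are illustrative assumptions, not from any particular environment):

```
set interface 1/1 -lacpMode ACTIVE -lacpKey 1
set interface 1/2 -lacpMode ACTIVE -lacpKey 1
*Both members share lacpKey 1, which forms channel LA/1 automatically
*Data vlans are then bound to the channel instead of the individual interfaces, e.g.:
bind vlan 54 -ifnum LA/1 -tagged
```

Once the channel is up, the pair of 1Gb links behaves as that "single" pipe described above.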

Console interfaces - There is generally only a single serial console port and this is used for OOB access.  Just like when you console into a cisco router/switch, you have full access to the system.  This is a great way to maintain access to the system if the network has failed in a way that you can't manage the device via the mgmt interfaces.

LOM interfaces - There is also generally only a single LOM (lights out management) port, and it is also used for OOB access.  However, this interface's purpose is much more focused.  Typically it's just used to reboot the box or perform other low-level operations when the device has an error and even console access doesn't help due to the failure that needs to be addressed.

Step 2 - Determine the logical configuration of the netscalers
This goes to the architecture of your load balancing environment as well but from a data flow perspective.

There are effectively 4 main architectures that I'm aware of at this time: server vlan attached, vip/snip, gateway and dsr. Each has its benefits and drawbacks; which you choose depends on your needs as well as the complexity level you are willing to accept.

The server vlan is the only one where it is acceptable (at least the way I look at it) to have the mgmt vlan on the same interface as the data vlan.  In this case you wouldn't use the 0/1 interface that is typically designated as the mgmt interface at all.  All traffic, mgmt and data, would go over the data interfaces.  Though you could put it all over the 0/1 interface, as there is no real restriction, depending on your needs the bandwidth capability is just not there.  Also, everything would be processed via the mgmt CPU, which could limit performance as well.  With that said, it is recommended to always have the mgmt vlan on the mgmt interface and the data vlans on the data interfaces.  But if you have a small network and adding a mgmt vlan is not worth the effort, know that you can run it over the data interface and be just fine.

Architecture does not impact the decision of how to use the console/lom ports.  You either need them or you don't.

Server vlan - This means that the netscaler will "live" in the same vlan/subnet as the servers.  The VIPs that clients access will be in that same vlan as well.  It is named this because the vlan in which the servers actually live will be bound to the netscaler.  For example, say you have a small network with a single vlan for data.  You would give the netscaler a SNIP (to be explained below) in that vlan, as well as assign a VIP address in that vlan.  The user would connect to the VIP address.  As a reverse proxy, the netscaler would effectively terminate the connection, receive the request for any processing it needs to do, and once it determines which server to relay the request to, open a new connection using the SNIP as the source IP to connect to the server.

The benefit of this method is that you don't have to worry about layer 3 devices between the load balancer and the server, simplifying troubleshooting dramatically; no ACLs, no routing, etc.  The drawback is that if the load balancer is compromised, it has full access to all servers.  Also, if there are a lot of servers, the vlan being used could fill up and you'd have to start adding more vlans.  Finally, if someone turns up a server without verifying that its IP is free first, they could assign the same IP as a VIP, the SNIP, etc. and possibly cause a major outage for you.  This is why this approach is only recommended if you have a small network or need a very simple load balancing architecture.
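To make the server-vlan data flow concrete, here is a minimal sketch (all names and the 192.0.2.0/24 addresses are made-up examples):

```
add ns ip 192.0.2.5 255.255.255.0 -type SNIP
*SNIP in the shared server vlan; the LB sources server-bound connections from this
add server web1 192.0.2.11
add service web1_svc web1 HTTP 80
add lb vserver web_vs HTTP 192.0.2.80 80
bind lb vserver web_vs web1_svc
*Clients hit the VIP 192.0.2.80; the LB opens a new connection from 192.0.2.5 to 192.0.2.11
```

Note that the SNIP, the VIP and the server all sit in the same /24, which is exactly what makes this method simple and also what exposes it to the IP-conflict risk described above.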

VIP-SNIP vlan - This is a little oddly named, but it is due to the way the data flows and the purpose of the vlans that are used.  On the data interface you would always bind at least 2 vlans: VIP and SNIP.  As you will find out below, SNIP refers to a type of address, which is why this name can be confusing.  Just know that the VIP vlan's purpose is to house only the VIP addresses that clients connect to; no servers, etc.  The SNIP vlan's purpose is to house only the SNIP addresses that the load balancer uses to talk to the servers whose applications are being load balanced.  For example, you have a /24 for the VIP vlan and a /29 for the SNIP vlan.  First off, why is the SNIP vlan so small?  It's because you never need many SNIPs, often only one, to talk to a large number of servers on the backend.  So there is no need to take an entire /24 when you only need one or two addresses in most any environment.  The client would connect to the VIP address in the VIP vlan and the load balancer would use the SNIP address in the SNIP vlan to talk to the servers.  This is the same data flow as the server vlan method except that vlans are dedicated to a purpose and the load balancer doesn't share a vlan with any servers.

The benefits of this method mainly come with scale and security.  You save address space by only needing a /29 for the SNIP side of your netscaler, and you can add more VIP vlans whenever necessary.  By not sharing a vlan with your servers, you will have to contend with layer 3 devices and possibly even ACLs; however, you don't have to worry about people assigning IPs that are VIPs or SNIPs to other hosts and causing outages.  Finally, if the load balancer is compromised, the servers should be safe, as generally there should be ACLs allowing only what is needed; not to mention that if a server gets compromised, it can't access the load balancer.  This method is recommended for larger environments that need to scale more easily, require more protection of the load balancing environment, and require an easier ability to audit data flow if SOX, PCI, etc. must be complied with.

Gateway - This is a very rare case that should only be used when needed: when DSR (explained next) can't be used but the servers need to know the client IP address.  The servers will actually use the netscaler as their default gateway, and the netscaler will act like a router instead of simply a load balancer.  The VIPs will be on the "front" side of the load balancer and the SNIP will be on the "back" side, which doubles as the default gateway for the servers.

The benefit of this method is that your server applications can get the real client IP even when the server application can't be modified in any way to retrieve it via another method.  The con of this method is that the netscaler is now a router, and more precaution must be taken when making any changes, because the load balancer doesn't just balance traffic from clients to the VIPs but is the gateway for the servers to everything else; even stuff not on the load balancer.  The other major con is if you have to set up static routes on the load balancer.  Static routes on the load balancer are used primarily when traffic is initiated from the load balancer.  For example, let's say a server talks to a VIP on the LB, that VIP lives in a vlan bound to 1/1, and we have a route for that server's subnet saying to send traffic out 0/1.  Due to MBF (mac based forwarding), when the LB responds, the return traffic will always go back out the 1/1 interface because that is where the request came in.  Now let's say the LB wants to initiate communication to that same server.  It will go out the 0/1 interface because that's what the routing table says to do.  The same goes when, in gateway method, the servers on the "back" side of the load balancer initiate communication to a host past the "front" side of the LB: that traffic will also go through the 0/1 interface.  The reason this can be a bad thing is that, depending on how the network is architected, it can be confusing as to where ACLs need to be applied to ensure the backend servers can talk to the hosts they need to.  So in this case routes should be as specific as possible, and if there is any overlap (e.g. ntp, snmp trap, syslog, etc.) in hosts that need to be talked to by both the backend servers and the LB, take note that ACLs may need to be applied in areas that may otherwise seem non-obvious.
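To illustrate "routes should be as specific as possible", a sketch (the subnets and gateway here are made-up examples):

```
add route 0.0.0.0 0.0.0.0 203.0.113.1
*Default route; LB-initiated traffic (and backend-initiated traffic in gateway
*method) follows this unless something more specific matches
add route 10.50.0.0 255.255.255.0 203.0.113.1
*A specific route for, e.g., the syslog/ntp subnet makes the intended path
*explicit and much easier to match ACLs against
```

Even when both routes point at the same next hop, the specific entry documents exactly which destinations the backend servers will reach through the LB, which is what you need when auditing where ACLs belong.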

DSR (Direct Server Return) - This is also a rare case, but not as rare as the gateway method.  Again, this is needed if the backend server needs the client IP and must get it from the IP header because the application protocol doesn't support receiving it in an HTTP header.  This method can be a little confusing for many people.  Just think about the name though: Direct Server Return.  This means the server (backend server) directly returns the response to the client, not to the load balancer which would relay it to the client.  So the data flow goes as such: the client talks to the VIP, which is on the LB.  The LB keeps the IP headers and source mac header intact.  It load balances by changing the destination mac address, which it finds by using ARP for each backend server.  It then relays the packet to the server, and when the server responds it uses the source IP/source mac it saw before, which correlate to the real client/default gateway.  So the load balancer will see all traffic coming from the client but none coming from the server.  The other caveat of this method is that it requires configuration on the server itself as well.  A non-arping IP address must be configured on the server that matches the VIP address the client uses to connect to the application.  Note, it MUST be non-arping.  What does this mean?  Well, think about what happens when you have 2 hosts in the same subnet claiming the same IP address.  In order to pass a packet within a subnet, you have to know the mac address of where to send the frame, and that mac address is determined by ARP when you know the IP address.  So when you do an ARP request for a given IP and 2 hosts respond, what do you do?  You could pick the wrong one.  So in this case, the LB must be the only one to respond to those ARPs to ensure that traffic is load balanced as needed.  But why does the server even need the IP, you ask?
It's because of how TCP/IP works when receiving and processing PDUs.  When the backend server receives the frame, it sees that the destination mac address is its own and thus will decapsulate the packet and further process the data.  At that time, it will see the destination IP of the VIP.  Since ordinarily that IP wouldn't be assigned to the server, it would drop the packet.  But we need it to process it, so the server must be configured with that address to ensure it processes those packets.  In addition, because that IP was used to process the packet, it will be used when sending the reply, which in turn guarantees that the client will be able to match up the reply within its TCP/IP session table and process it correctly as well.
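On a Linux backend server, the non-arping copy of the VIP is commonly put on the loopback along with ARP sysctls so the server never answers ARP for it.  Treat this as a sketch; the VIP 192.0.2.80 is a made-up example and details vary by distro and DSR setup:

```shell
# Only answer ARP for addresses configured on the interface the request arrived on
sysctl -w net.ipv4.conf.all.arp_ignore=1
# Avoid advertising the loopback-held VIP as a source address in ARP messages
sysctl -w net.ipv4.conf.all.arp_announce=2
# The server now accepts packets destined to the VIP but never ARPs for it
ip addr add 192.0.2.80/32 dev lo label lo:vip
```

With this in place, only the load balancer answers ARP for the VIP, while the server can still process frames whose destination IP is the VIP and source its replies from it, exactly as described above.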

The major benefit of this method is that it keeps the load balancer out of the middle between the server and everything else the server must communicate with.  This is the one benefit it truly has over the gateway method: you don't have to worry about route and ACL complexity, but you still get the client IP to the application that needs it.

Step 3 - Determine the version to use

This step is semi-optional, as you can do upgrades or even downgrades after HA is set up.  However, you must remember that version, license, hardware, etc. should be the same.  Strictly speaking, you don't have to have the version and license the same, but it will clearly cause issues if they are not.  Hardware must be the same.

What version you want is purely up to you.  I just recommend getting both nodes to the needed version prior to doing any configuration of the netscalers.  Think of it like this: would you install Windows 2000, configure it and then upgrade to Windows 2008, or would you go to Win2k8 first and then configure?

Step 4 - Assigning IPs to netscalers and which types of IPs to use
Assigning IPs to netscalers depends heavily on your architecture you use as it determines how many IPs would need to be configured.  However first you need to understand the types of addresses that can be configured: NSIP, SNIP, MIP, VIP, and RIP.

NSIP (Netscaler IP) - This is the directly assigned IP of each netscaler.  Each netscaler will have one of these, no matter if it is primary or secondary.  This is the default IP used for all mgmt activities, whether you are accessing the device via SSH or HTTPS or the device is sending mgmt data via ntp, dns, syslog, traps, etc.  NSIP configuration does not sync to the other node.

SNIP (Subnet IP) - This is generally used to communicate with the backend servers that the load balancer is balancing traffic to.  There should be one SNIP per vlan that is assigned to the pair.  SNIPs are "floating" IPs, so they will only ever exist/work on the current primary node.  They are used for data traffic purposes and their configuration will sync from primary to secondary, so you only need to configure them once (on the primary).

MIP (Mapped IP) - This is effectively the "legacy SNIP".  There is really no need to use them anymore.  However, if you feel the need: a MIP will be used instead of a SNIP if USNIP is turned off, if the MIP lives in a vlan that is directly attached to the server vlan and a SNIP doesn't, or if the vlan the MIP lives in is also used for the default route.  I would recommend not using these, as there is no real benefit and they do seem to be legacy.

VIP (Virtual IP) - This is obviously the IP address that clients will connect to.  VIPs exist insofar as they are proxy-ARPed addresses; traffic will be accepted by the LB when it sees a VIP as the destination, but the LB will never initiate traffic from these IPs.  The VIP configs also sync from primary to secondary, so there is no need to configure them on both nodes.

RIP (Real IP) - These addresses are configured as servers; e.g. "add server temp".  They are the real IPs of the application servers that traffic will eventually terminate on for processing.  Server configuration also syncs from primary to secondary so, again, no need to configure on both nodes.

When configuring addresses and selecting which and how many you need, the rule of thumb is below:
NSIP - 1 address is to be configured on each netscaler
SNIP - 1 address per vlan is to be configured on the primary only (config will sync)
MIP - none (read above, use SNIP instead)
VIP - as many as you need as long as there are free addresses to be configured on primary only (config will sync)
RIP - as many as you need to be configured on primary only (config will sync)

So if you are using the server vlan approach with a single vlan for mgmt and data, you'll have 2 NSIPs, 1 SNIP and as many VIP/RIP addresses as required.  If you have your mgmt and data vlans split out, then you'd have 2 NSIPs, 2 SNIPs and as many VIP/RIP addresses as required.
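Putting the rule of thumb into commands, a sketch for the split mgmt/data case might look like this (all addresses are made-up examples):

```
*On each node, its own NSIP (does not sync):
set ns config -IPAddress 192.0.2.10 -netmask 255.255.255.0
*On the primary only (these sync to the secondary):
add ns ip 192.0.2.12 255.255.255.0 -type SNIP
*SNIP in the mgmt vlan
add ns ip 198.51.100.2 255.255.255.248 -type SNIP
*SNIP in the /29 SNIP vlan
add lb vserver web_vs HTTP 203.0.113.80 80
*VIP in the VIP vlan
add server web1 10.10.10.11
*RIP of a backend server, reached via routing from the SNIP vlan
```

This lines up with the counts above: one NSIP per node, one SNIP per vlan, and as many VIPs/RIPs as the environment needs.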

Step 5 - How to create the HA pair and validate it is healthy
Creating an HA pair is actually very simple and very few steps:
1) Pre-req: Ensure devices are physically connected, powered on, and all network configuration is correct as demanded by chosen architecture
2) A newly powered on netscaler will need to be connected to by the console port as no IP addresses are configured by default
3) First you will configure the device that will start as your primary.  When you power it up, it will try to step you through a configuration wizard.  The wizard is just the basics: name, IP address, etc.  All we're concerned with at this point is configuring the NSIP
4) After the NSIP is configured, do the HA configuration and make sure to set the peer IP address, which is the NSIP of the other node that will be part of the HA pair.  At this point, you are done with the primary.
5) Perform steps 3 and 4 on the other node and you should now see the HA health as good (at least that the nodes know the status of each other).  Use 'sh ha node' to see the HA status

At this point you'll have each netscaler configured with an NSIP and with the peer IP of the other device, which will cause heartbeats to be sent between them, communicating status.
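As a sketch, with made-up NSIPs of 192.0.2.10 and 192.0.2.11, the HA part boils down to one command per node plus a check (node id 1 is arbitrary):

```
*On the first node:
add HA node 1 192.0.2.11
*On the second node:
add HA node 1 192.0.2.10
*Then on either node:
sh ha node
*Both entries should show a node state of UP once heartbeats flow,
*with one node as Primary and the other as Secondary
```

The id simply labels the peer entry; what matters is that each node's HA config points at the other node's NSIP.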

EXTRA: How HA works
HA is kind of a tricky thing to understand, to be perfectly honest, due to some of the terminology that gets used.  From a high level though, each node sends heartbeat messages out all interfaces to the other node.  When the other node receives the messages, it keeps track of which interface they were received on, so it knows if there is any failure with a particular path.

From a technical perspective though, it can be a pain.  First off, you need to understand how the heartbeats (HBs) are sent.  Since HBs are sent over all interfaces, each node needs to know which mac to send the message to and thus must do an ARP request.  Oddly enough, the ARP request is always for the NSIP address.  Why is that odd?  Because it doesn't matter if the request is going over a data interface or the mgmt interface (where the NSIP would typically actually live); it will use the NSIP address in the ARP request.  And this does actually work.  The reason is that a vlan is really just a container; a logical divider in the network.  When the frame has a certain vlan tag, the ARP message knows nothing about it.  So the peer node will see the request, see it is in an acceptable vlan, see it is addressed to the broadcast mac, and thus process it.  It will then see the request for its NSIP address.  It will respond with the mac address assigned to the interface that it received the request on, allowing the peer to know where to send the HB.  Now, why did the peer respond with the mac of the interface it received the request on instead of the mgmt interface every time?  It has to do with the fact that while you can bind SNIPs to vlans and vlans to interfaces, and you can bind the vlan the NSIP lives in to the 0/1 interface, you can't bind the NSIP itself to that vlan.  By default, a netscaler doesn't bind an IP to an interface like we're used to with servers or network gear.  If you don't bind vlans to interfaces at all, the netscaler will actually use any IP out any interface (confusing, right?).  This is why you can't bind the NSIP to the vlan you bound to the 0/1 mgmt interface, or to any vlan that is bound to an interface.  If you want to accomplish something like that, you need to use nsvlan.
However, as mentioned earlier, note that you'll then only get HBs over a single interface, which may prevent you from seeing a possible symptom of an issue that could cause a complete outage when a failover occurs.  Confused yet?  :)  Ya, I was too when I first looked into HBs, but wait, there's more.

So far, we know that HBs are sent out all interfaces, and that ARP for the NSIP is done out all interfaces in order to know which mac to send the HB to.  However, remember we have vlans to contend with as well.  We have to make sure the HB is sent into the correct vlan container.  0/1 is mgmt and thus is generally an access mode port, so there are no vlans to contend with, but what about the data side where you can bind multiple vlans?  There are 2 major items to look for: tagall and the native vlan.  HBs are only ever sent over the native vlan.  In 802.1q language, this is the vlan id that is assumed when no tag exists in the frame.  So whatever vlan you set as the "untagged" vlan on the netscaler, that is the one HBs are sent out on.  The problem is that since there is no tag, your switch must agree on the native vlan as well.  What if you follow best practices and the native vlan is not used; everything must be tagged?  That is where you must configure tagall.  This tells the netscaler that, while you have set a native vlan (so it knows which vlan to send HBs over), from an 802.1q perspective it should tag every frame it sends.
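A sketch of the tagged-everything case (the vlan number is an example):

```
set interface 1/1 -tagall ON
*Every frame out 1/1 now carries an 802.1q tag, including HBs
add vlan 54
bind vlan 54 -ifnum 1/1
*Binding without -tagged makes vlan 54 the untagged/native vlan on the
*netscaler side, so HBs go out in vlan 54; with -tagall ON they are still
*sent tagged, matching a switchport that requires everything to be tagged
```

The switchport's allowed-vlan list must include vlan 54 or the HBs never reach the peer, which is exactly the kind of single-path failure the per-interface HB tracking is meant to surface.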

EXTRA: example configs
For this configuration, I will assume the following:
Method - VIP/SNIP
Interfaces used - Mgmt (0/1) and Data (1/1)
Network - requires everything tagged
VLANs - (vlan 100 - mgmt), (vlan 54 - vip), (vlan 190 - snip) 

Configure primary node
set ns hostname NS1
*Setting the hostname of the netscaler
set int 0/1 -hamonitor off
*disabling monitoring of this interface so that if the 0/1 fails it won't cause a failover
set int 1/1 -tagall on
*ensure all frames are tagged regardless of being native vlan or not
set ns config -IPAddress -netmask
*configure NSIP
*Save the configuration and reboot so the IP takes effect
add HA node 2
*Configure HA. Yes, that's all there is
add vrid 6
bind vrid 6 -ifnum 1/1
*If you know VRRP, this is a similar concept.  It's a virtual mac to help with HA failover speed.  The vrid can be any value
add ns ip -vserver disabled -mgmtaccess enabled
add vlan 100 -aliasname LB_MGMT
bind vlan 100 -ifnum 0/1
*Add mgmt vlan SNIP, which gives an address that will always ensure direct access to the current primary
add ns ip 
add vlan 54 -aliasname VIP_VLAN
bind vlan 54 -ifnum 1/1
bind vlan 54 -IPAddress
add route
*Add VIP vlan SNIP address.  Using GW in VIP vlan as the default route gateway
add ns ip
add vlan 190 -aliasname SNIP_VLAN
bind vlan 190 -ifnum 1/1 -tagged
bind vlan 190 -IPAddress
*Add SNIP vlan SNIP address.  Use -tagged because an interface can only have a single untagged vlan.  If you don't use it, this bind will overwrite and effectively unbind vlan 54 from ifnum 1/1
*save configuration

Configure secondary node
set ns hostname NS2
*Setting the hostname of the netscaler
set int 0/1 -hamonitor off
*disabling monitoring of this interface so that if the 0/1 fails it won't cause a failover
set int 1/1 -tagall on
*ensure all frames are tagged regardless of being native vlan or not
set ns config -IPAddress -netmask
*configure NSIP
*Save the configuration and reboot so the IP takes effect
add HA node 2
*Configure HA. Yes, that's all there is. 
sh ha node
*Should show that the HA status is now UP for both nodes.  If that is true, it should already have synced the configuration, so if you do a 'sh run' you'll see the SNIP and vlan configuration on the secondary.

NOTE: this is a very base configuration. It doesn't involve setting up any particulars with the interfaces like access security, any VIPs, etc.  That is outside the scope of this article.
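A few commands that are handy for validating the pair afterward (a sketch; output details vary by version):

```
sh ha node
*Check that the node state is UP and that one node shows Primary, the other Secondary
force ha sync
*Run on the secondary if you suspect the configuration hasn't synced
sh ns runningConfig
*On the secondary, confirm the synced SNIP/vlan/vserver configuration is present
```

If the states look right and the secondary shows the synced config, the pair is healthy and a failover should be safe to test.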
