Relayd sends to disabled hosts

Hi,

I'm currently using Relayd as a load balancer to a group of web servers.  I have it set up to use redirects, as I need to use the sticky-address feature.   However, when I disable one of the hosts, any existing connections STILL GO TO THAT HOST even though it's disabled.

I've tried flushing the pf state table, but no luck. My guess is that removing the sticky-address directive, or switching from redirects to relays (where sticky-address does not apply), would avoid the problem.

So, is there a way to prevent existing connections from going to a disabled host using relayd? Or is this a bug in relayd/FreeBSD?

Further, if I need to go to relays vs. redirects, is there a performance hit?   With relays, will connections continue to go to the same host automatically unless it's disabled?
Veex Asked:
 
Duncan Roe (Software Developer) Commented:
I thought you'd done that (getting retries to open a new connection). If that's what Relayd does, you need to file a bug report.
 
Duncan Roe (Software Developer) Commented:
TCP segments on an existing connection have no choice but to keep trying to go to the same host.
Or do you mean something else?
 
Veex (Author) Commented:
I'm using a FreeBSD box as a load balancer, or more simply a reverse proxy, by using Relayd. I can set a "sticky-address" option so that further requests from the same client go to the same host. Without that setting, requests are distributed round-robin across only the active hosts. With sticky-address set, successive requests all go to the same host.

The problem I'm having is that even when Relayd marks a host as disabled, clients with any connections that were already established prior to the host becoming disabled will still go to that same host for successive requests.

 
Duncan Roe (Software Developer) Commented:
If new requests are piggy-backed onto existing connections then you will get that. Otherwise Relayd has a problem.
You can verify what is happening by using tcpdump or wireshark on a client system.
Hypothesis: if a host physically goes down (e.g. power loss or cable breaks) then connections with that host will not close. This may confuse Relayd.
Are you sure that Relayd has marked the host as down?
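Both checks can be sketched roughly as follows. This is only an illustration: the interface name (em0) and port (80) are placeholders that do not come from this thread, and conn_filter is a made-up helper name. The filter shows only connection setup and teardown packets, so a request that produces no SYN is riding on an existing connection.

```shell
#!/bin/sh
# Sketch only: em0 and port 80 are placeholders, not values from this thread.

# Build a tcpdump filter that matches only SYN, FIN and RST packets
# on the given port, i.e. connections being opened or closed.
conn_filter() {
    port="$1"
    printf 'tcp port %s and (tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0)' "$port"
}

# On a client system (needs root):
#   tcpdump -n -i em0 "$(conn_filter 80)"
#
# On the load balancer, ask relayd how it currently sees each host:
#   relayctl show hosts
```

If the client keeps sending requests without any new SYN appearing, the requests are being reused on an already established connection.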
 
Veex (Author) Commented:
Hi Duncan,

Thanks for your replies.  I was able to overcome this, and I'll explain what happened.


Relayd creates redirect rules for PF, which have the effect of load balancing connections to servers behind the load balancer as designed.  This works well.  

When a host is disabled in relayd, new connections go to the other enabled hosts, but existing connections making additional requests continue to go to the original host they connected to. I believe this is what you were alluding to.

The problem I was having is that PF (the FreeBSD packet filter I'm using) checks the state table before the ruleset, so packets belonging to existing connections never re-traverse the rules. This is done for efficiency. I found a way to kill all states for the disabled host, which I've added to my script that handles the enabling/disabling.
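To see whether states for a disabled host are still present, something like the following could be used. It is a rough sketch: count_states_for is a made-up helper name, 10.0.0.12 is an invented address, and the helper simply counts lines of state-table output that mention the host.

```shell
#!/bin/sh
# Count pf state entries that mention a given backend address.
# Reads `pfctl -s states` output on stdin, so it also works on saved output.
count_states_for() {
    host_ip="$1"
    # -F: match the address literally; -w: whole words only, so that
    # 10.0.0.1 does not also match 10.0.0.12. grep -c exits non-zero
    # when the count is 0, hence the "|| true".
    grep -F -w -c -- "$host_ip" || true
}

# On the load balancer (needs root); 10.0.0.12 is a made-up address:
#   pfctl -s states | count_states_for 10.0.0.12
```

A non-zero count after the host has been disabled means packets for those connections are still being matched by their existing states rather than by the updated rules.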

That command is:

pfctl -k 0.0.0.0/0 -k <disabled_host_ip>

Once the states are killed, TCP retries create a new connection which goes to one of the enabled servers.
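A disable script along these lines might look roughly like the sketch below. The function names and the DRY_RUN switch are made up for illustration, and 10.0.0.12 is an invented address; only the pfctl -k command comes from this thread. It assumes the host is known to relayd by its IP address.

```shell
#!/bin/sh
# Sketch of a disable wrapper; run(), disable_host() and DRY_RUN are hypothetical.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Disable a backend in relayd, then kill every pf state whose
# destination is that backend so that clients get re-balanced.
disable_host() {
    host_ip="$1"
    [ -n "$host_ip" ] || { echo "usage: disable_host <host_ip>" >&2; return 2; }
    run relayctl host disable "$host_ip" || return 1
    run pfctl -k 0.0.0.0/0 -k "$host_ip"
}

# Example (dry run, made-up address):
#   DRY_RUN=1 disable_host 10.0.0.12
```

With sticky-address, pf also keeps source-tracking entries; pfctl has a -K option for killing those, which may be worth testing as well, although it was not tried in this thread.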
 
Duncan Roe (Software Developer) Commented:
Well done! Neat getting TCP retries to open a new connection - but how does the client cope?
 
Veex (Author) Commented:
Client does not cope well!  If a request comes in and then the server is disabled and the state cleared, the states for the request are lost and the client browser will just sit there and wait.  Any subsequent requests from the browser will be successful, which is better than the original problem I was having.

I'm going to keep this question open while I look for a more elegant solution.
 
Duncan Roe (Software Developer) Commented:
Your open connection is the problem. Once Relayd becomes aware that a host is down, it needs to close its connection with the host and the matching connection with the client (they are separate connections). The close to the host will leave that socket in the FIN_WAIT_1 state, which will time out eventually. The close with the client will work straight away, with a better result than now.

I.e. instead of placing a new call on receiving [some number of] TCP retries, send a Reset. Simultaneously, close the connection to the offending host.
 
Veex (Author) Commented:
Thanks Duncan,

I would have expected that's what Relayd should be doing on its own, but I think it's actually just creating temporary rules, adding them to the packet filter, and then passing the traffic off, so that the relayd daemon isn't actually handling the TCP handshake.

I don't know the inner workings of Relayd and I'm making this assumption based on a few mentions here and there as I've been looking around.  I was hoping to hear something definitive from the community, but I'm not having luck there either.
 
Veex (Author) Commented:
Still looking for a better solution, but using this as a temporary workaround.
 
Veex (Author) Commented:
Looks like the piece I was missing was the interval timeout. This limits how long expired states stay around. My guess is that expired states lingering in the table were somehow causing clients to keep going to the same hosts (which were now disabled). Setting the interval to 0 fixed that:


set timeout interval 0
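
To confirm that the setting took effect after reloading the ruleset, a check like the following could be used. This is a sketch under assumptions: purge_interval is a made-up helper, /etc/pf.conf is the usual default path rather than one given in this thread, and the helper assumes the timeout listing contains a line of the form "interval 10s".

```shell
#!/bin/sh
# Extract the purge interval (in seconds) from `pfctl -s timeouts`
# output on stdin. Assumes a line of the form "interval   10s".
purge_interval() {
    awk '$1 == "interval" { sub(/s$/, "", $2); print $2 }'
}

# After editing pf.conf (needs root; the path is the usual default):
#   pfctl -nf /etc/pf.conf    # parse only, to catch syntax errors
#   pfctl -f /etc/pf.conf     # load the ruleset and options
#   pfctl -s timeouts | purge_interval
```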
Question has a verified solution.
