Necessity of using both 'balance source' and 'stick on src' in HAProxy configuration

Dear HAProxy community,

I’m currently configuring HAProxy for a SOCKS5 proxy setup, particularly to handle connections to websites like https://claude.ai. I have the following configuration:

listen socks5
    bind    0.0.0.0:16667
    
    balance source
    hash-type consistent
    stick-table type ip size 1m expire 60m
    stick on src

My questions are:

  1. Is it necessary or beneficial to use both balance source and stick on src in this configuration?

  2. What are the implications of using these directives together? Do they complement each other, or is one redundant?

  3. Given that I’m using hash-type consistent, how does this interact with the balance source and stick on src directives?

  4. For a SOCKS5 proxy setup, especially when dealing with websites that might require session persistence, what would be the recommended configuration?

  5. Are there any potential drawbacks or performance implications of using this combination of directives?

I would greatly appreciate any insights or best practices you could share regarding this configuration.

See GitHub issue #2703 in haproxy/haproxy, "Necessity of using both 'balance source' and 'stick on src' in HAProxy configuration", for the related discussion.

Thank you for your time and expertise.

Best regards,
Zhao

Hi Zhao !

1. Is it necessary or beneficial to use both balance source and stick on src in this configuration?

Combining balance source and stick on src is beneficial. It gives you a high level of session persistence together with proper load balancing, including when some backends go down.

The source IP entry in your stick table expires after 60m. So if the given source IP has not reconnected within that time, its entry is removed. If that source IP then reconnects later, say after two hours or more, with balance source there is a very high chance (99.9%) that it will be served by the same server that served it the first time. I say "very high chance" because the server could have gone down in the meantime, or its weight could have changed. In that particular case the source IP is balanced to another backend, a new entry is created in the stick-table, and its session is persistent again for at least 60m.

So,

    stick-table type ip size 1m expire 60m
    stick on src

guarantee session persistence for source IPs that connect and reconnect frequently (N times within 60m).

    balance source

guarantees session persistence for IPs that connect only once or twice every few hours, or once a day, provided that the backend servers do not change state (up/down) frequently.

I hope this also answers your second question:
2. What are the implications of using these directives together? Do they complement each other, or is one redundant?

So, to sum up, I would say the two directives are complementary rather than redundant.
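To make the interplay concrete, here is a minimal sketch of a listen section that uses both mechanisms. The server names and addresses (s1, s2, 192.0.2.x:1080) and the check keyword are placeholders I am assuming for illustration; they are not part of your setup.

```haproxy
listen socks5
    mode tcp
    bind 0.0.0.0:16667

    # Used when the client has no live stick-table entry:
    # the server is picked by hashing the source IP.
    balance source
    hash-type consistent

    # Remembers "client IP -> server"; an entry expires 60 minutes
    # after it was last created, refreshed or matched.
    stick-table type ip size 1m expire 60m
    stick on src

    # Hypothetical backends, replace with your own.
    server s1 192.0.2.11:1080 check
    server s2 192.0.2.12:1080 check
```

With this layout the stick table is consulted first, and the hash only decides for clients that have no entry or whose recorded server is unavailable. One consequence worth testing is that a client moved to another server during an outage should stay on that server while its entry is alive, instead of jumping back as soon as the original server recovers.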

3. Given that I’m using hash-type consistent, how does this interact with the balance source and stick on src directives?

Here I can’t add much beyond the description of hash-type consistent in the documentation.
hash-type consistent is dynamic and better suited to cases where backends can flip up/down; it is ideal for caches. You could try it, and if the load distribution is not smooth, we can look further at how to adjust it with server IDs, weights and different hashing algorithms. Some trials with real traffic are needed to tune this properly.
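As a starting point for such trials, the knobs mentioned above could look roughly like this. The server names, addresses, IDs and weights are made-up values, and the sdbm/avalanche pair and the factor of 150 are only one possible choice, not a recommendation.

```haproxy
listen socks5
    mode tcp
    bind 0.0.0.0:16667
    balance source

    # Consistent hashing with an explicit hash function and modifier.
    hash-type consistent sdbm avalanche

    # Optional: limit any server to 150% of the average concurrent load
    # (only meaningful with consistent hashing; must be > 100, or 0 to disable).
    hash-balance-factor 150

    # Explicit IDs and weights (hypothetical values); with consistent
    # hashing they influence where each server lands on the hash ring.
    server s1 192.0.2.11:1080 id 1 weight 100 check
    server s2 192.0.2.12:1080 id 2 weight 100 check
```

If several HAProxy instances sit in front of the same servers, keeping the server IDs identical on all of them should give the same distribution on each instance.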

4. For a SOCKS5 proxy setup, especially when dealing with websites that might require session persistence, what would be the recommended configuration?

We now support the socks4 option for connecting to backend servers:

http://docs.haproxy.org/dev/configuration.html#socks4

Usage:

backend b7
    server s1 <addr>:<port> socks4 localhost:18000
    server s2 <addr>:<port> socks4 192.168.0.1:18001

There is no SOCKS support on the frontend side.

As for SOCKS5 towards backends, there is no support at the moment.

You could try simple TCP load balancing with session persistence. As an example, you can start from this basic configuration:

defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen socks5
    mode tcp
    bind 0.0.0.0:16667
    balance source
    # … other settings which you listed

I’ve also found this blog post, but I haven’t tried its configuration:
https://dft.wiki/?p=3336

Then, on top of this simple config, you can add the settings needed for IP-based persistence, as you have already done and as discussed above.

Here we also have a short doc about it:

The stick table lookup will not noticeably impact overall performance. We ran some benchmarks internally, and they showed that TCP connection establishment is about 10 times slower than this source IP lookup.

As most of our clients use HTTPS, we have not yet run any benchmarks for SOCKS. TCP load balancing performance depends on many factors: system architecture, CPU, OS tuning and TCP option tuning. So it would be best to run some trials yourself and come back to us if you hit any problems on this point.
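During those trials it can help to look at what the stick table actually holds. A minimal sketch, assuming you can enable the stats socket; the socket path and permissions below are my assumptions, so adjust them to your system.

```haproxy
global
    # Assumed path and mode; any location HAProxy can write to works.
    stats socket /var/run/haproxy.sock mode 600 level admin

# With the socket enabled, the runtime command "show table socks5"
# dumps the entries of the table declared in the "listen socks5" section,
# including the server ID each source IP is currently stuck to.
```

One common way to send the command is through socat connected to that socket. Comparing the dump before and after taking a server down should show whether clients are redistributed the way you expect.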