KeepAlive issue

Hi, I’m using HAProxy as a reverse proxy to balance requests across a pool of backend nodes, but for some reason, if the client sends requests with HTTP keep-alive, HAProxy in TCP mode doesn’t rotate between the backend servers, while HAProxy in HTTP mode does. Both modes balance round-robin on each request if HTTP keep-alive is turned off at the client.

Here’s the basic config:

listen nodes_proxy
        mode tcp
        bind :9090
        balance roundrobin
        timeout client 40s
        timeout server 40s
        retries 1
        retry-on conn-failure
        option redispatch 1
        server node1 x.x.x.x:8080
        server node2 x.x.x.x:8080
        server node3 x.x.x.x:8080

All backend nodes are active and working, but HAProxy in TCP mode doesn’t do round-robin balancing on each request from the same client with keep-alive, while HTTP mode does. For our simple use case, we prefer the TCP mode, but it doesn’t balance as expected.

How can I ensure that each request from the client (with keep-alive) is balanced round-robin, and that both the client-to-proxy and proxy-to-server connections are kept alive for reuse until the timeout?

This is expected behavior in tcp mode, because tcp mode means that two sockets are connected to each other and everything is passed through.

Only in HTTP mode does haproxy understand where one response ends and the next request begins.
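For example, here is a minimal HTTP-mode sketch of the config above, assuming HAProxy 2.x and the same placeholder addresses; option http-keep-alive and http-reuse safe are already the defaults in 2.x and are spelled out only for clarity:

listen nodes_proxy
        mode http                   # parse requests so each one can be balanced
        bind :9090
        balance roundrobin
        option http-keep-alive      # keep-alive on both sides (default in 2.x)
        http-reuse safe             # reuse idle server connections (default since 2.0)
        timeout client 40s
        timeout server 40s
        timeout http-keep-alive 40s # how long idle keep-alive connections are kept
        retries 1
        retry-on conn-failure
        option redispatch 1
        server node1 x.x.x.x:8080
        server node2 x.x.x.x:8080
        server node3 x.x.x.x:8080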

Thanks for clarifying Lukas. The HAProxy (v2.2) docs don’t clearly mention that, and tons of example configs out there use TCP mode balancing. It’s puzzling, because round-robin balancing in TCP mode works fine if keep-alive is turned off on the client. Would you know why that works, though?

Like I said, tcp mode means that haproxy connects two TCP sockets with each other.

When you are using keep-alive on the client and the server, then one TCP connection handles multiple HTTP transactions (a transaction is a request from the client and a response from the server). When you are not using keep-alive, a single TCP connection only handles a single HTTP transaction.

That’s why when using keep-alive on the client and server and tcp mode in between on haproxy, transactions within a single TCP connection are not load-balanced.
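To illustrate, HTTP mode lets haproxy choose the connection behavior on each side of a transaction; a sketch of the three connection modes available in 2.2:

defaults
        mode http
        option http-keep-alive       # keep-alive on both sides, each request balanced (default)
        # option http-server-close   # keep-alive to the client, fresh server connection per request
        # option httpclose           # close both sides after every transaction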


Thanks Lukas. It makes sense.

In my case, the client is a single node (app) in location A, and the backend nodes are all web proxies in multiple locations (A, B, C). I’m trying to load balance them using HAProxy hosted in location A, so that each request (with keep-alive) from the client goes to a different backend proxy (rotating server).

TCP mode is more performant (less CPU time) on the proxy server, and its balancing will work with client keep-alive off, but then there is an overhead (~100 ms) for each new connection. HTTP mode may be less performant (~15% more CPU time as per the Starter Guide), but with client keep-alive on it’s going to be faster in a high-throughput scenario. Am I correct in my understanding?

I’d also appreciate it if you or other experts could share any tips or optimizations I should be aware of, based on the above use case.

Yes, however it is very important to understand that this is irrelevant in 99.99% of deployments, assuming you are not Google.

Turning off keep-alive means additional round trips of packets per transaction, that is for certain, which is why I’d discourage everyone from disabling keep-alives.

You should definitely use keep-alive on your servers and clients.
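If it helps, how long idle keep-alive connections are kept around is tunable; a sketch with hypothetical values:

defaults
        mode http
        timeout http-keep-alive 10s   # max idle time between requests on a kept-alive connection
        timeout http-request 10s      # max time to receive a complete request (guards slow clients)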

Slower does not mean slow. It does more, therefore it needs more CPU cycles. You are probably not running a triple-datacenter production workload behind a load balancer consisting of a single Raspberry Pi, so please stop worrying about CPU cycles for now.

You want the benefits of HTTP mode? Then use it.

FWIW: HTTP mode is NOT 15% slower than TCP mode. Instead, the relation between userspace and kernel CPU load shifts by about 15% between the two, in a fully loaded scenario.

Last thing: I’m not sure it’s a bad thing for the application that one frontend connection creates one backend connection: this way caches stay hot, and that often means more performance, not less. But this will certainly depend on what the backend servers are doing exactly.
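The mapping between frontend and backend connections in HTTP mode is controlled by http-reuse; a sketch of the available settings (the backend name here is made up):

backend web_proxies
        mode http
        balance roundrobin
        http-reuse safe          # reuse idle server connections when safe (default since 2.0)
        # http-reuse never       # strict one-frontend-connection-to-one-backend-connection mapping
        # http-reuse aggressive  # also reuse connections proven to support reuse
        # http-reuse always      # reuse any idle connection, even unverified ones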
