TCP proxy going straight to the backend

Hello,
I configured HAProxy to forward TCP requests to a set of 3 backend servers and created a health check that works fine. Only one of the backend servers is available at a time, as the others are slaves and are not allowed to handle writes.
Each of the 3 servers runs an nginx proxy in front of a Python app that connects to PostgreSQL. As PostgreSQL does not support multi-master, I configured it with a master, slaves, and automatic failover. If a node goes down, another node is promoted to master.
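
For reference, the relevant part of the configuration looks roughly like this (a minimal sketch; the proxy name, addresses and check timings below are placeholders, not my real values):

    listen odoo_cluster
        bind *:443
        mode tcp
        balance roundrobin
        # plain TCP health checks shown here; my real check is more involved
        server node1 10.0.0.11:443 check inter 2s fall 3 rise 2
        server node2 10.0.0.12:443 check inter 2s fall 3 rise 2
        server node3 10.0.0.13:443 check inter 2s fall 3 rise 2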

The health check seems to be working: in the logs I always see 2 backend nodes down and the other one up.
If I open the app in a web browser, it connects to HAProxy, which forwards the request to the healthy backend node.
Next I simulate a node failure and automatic failover promotes another node to master. The health check detects this fine: it changes the live backend node in HAProxy and marks the other 2 as down.
However, if I reload the web page in the browser, it still connects to the old backend node. I can see in its logs that the request is going to it instead of the healthy node.
If I open a new incognito window in the browser it works fine, the request is processed by the healthy node, but a reload in the previous browser window still goes to the down node.

Why is this happening?

I figured this out.
I added an option to default-server:

on-marked-down shutdown-sessions

Now it works as expected. TCP connections go to the live server instead of being sent to the down server.
This behavior should be the default. No one needs old sessions to down backends.
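
In context, the change is just that one extra parameter on the default-server line (a sketch with placeholder names and addresses):

    listen odoo_cluster
        bind *:443
        mode tcp
        # kill established sessions as soon as a server is marked down
        default-server on-marked-down shutdown-sessions
        server node1 10.0.0.11:443 check
        server node2 10.0.0.12:443 check
        server node3 10.0.0.13:443 check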


Just out of curiosity, since I can’t confirm this behaviour on the versions I’m using… Which version of HAProxy? Which balance method are you using? Any stickiness configured, maybe?

Balancing method and stickiness are irrelevant.

HAProxy will never break existing, working TCP sessions just because health checks mark the server as down; that is, unless on-marked-down shutdown-sessions is configured, which was implemented for this exact use case.

What is the benefit of keeping existing TCP sessions to unhealthy backends?

My 2p…

Rate/Connection limiting on the backend server.

Also, if the connection is still established it may still be working, so if your app cares about state it’s better not to move working connections.

I do not think limiting is the reason.
You misunderstand. An unhealthy server must not keep serving connections, or the frontend app will crash. It is a PostgreSQL master-slave cluster, and if the TCP connection is not destroyed, the app stays attached to an unhealthy server that can no longer perform write operations in the database. So the most logical action is to close TCP sessions when HAProxy sees a backend as unhealthy.

It works perfectly with on-marked-down, but the default behavior may confuse many users and leave them debugging why connections are still directed to an unhealthy backend.

The problem is you are only thinking about your use case.

For you, I entirely agree, which is why the option is there for you to use.

Are you suggesting that HAProxy change its default behavior and that they add an “on-marked-down keep-sessions” option? If so I’d suggest you take that to the mailing list.


I am only trying to understand who would need a TCP connection to a backend that reports as unhealthy.

It is simply not true that a failed health check automatically means that established sessions are broken. Actively destroying established sessions is harmful in those cases.

This is why it’s configurable.

I guess, as others pointed out, it depends on the use case. Think of master-slave backends. When the master moves to another server, you don’t want to break the connections to the old one just because the health check no longer shows it as the master. It is still up and can finish serving the existing connections instead of resetting them under the client’s feet by default.

Master-slave is exactly my case. Do you think a slave can serve a normal request? Think again. It’s PostgreSQL and the slaves are read-only. My health check verifies that the server can process a normal request (read/write). If a server is unhealthy there is no reason to keep sending it requests.
In your case, if the master moves to another server and the slaves can serve just as the master does, then it is not master-slave but multi-master. Now imagine the server crashes and the health check reports it as down. With your logic the connection would still be open and serving requests, and the result would be a very ugly error message.

A multi-master setup then? Would that make sense?
By the way, what exactly is the HAProxy backend here, nginx or PostgreSQL? I was under the impression it was nginx.

It’s nginx that forwards the request to Odoo, and Odoo connects to PostgreSQL running on localhost.
PostgreSQL is in master-slave mode with automatic failover, so if a node goes down a live node is promoted.
All of this runs on a virtual machine, times 3. The VMs are on different bare metal servers.
HAProxy performs the health check on all three virtual machines, where healthy status means a fully functional Odoo app: nginx working on port 443, the Odoo instance working on port 8069, and PostgreSQL in read-write mode.
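
A composite check like this can be wired up with an external check script, for example (just a sketch, not my exact configuration; the script path, names and addresses here are made up, and external checks have to be enabled globally before HAProxy will run an external agent):

    global
        external-check

    backend odoo_cluster
        mode tcp
        option external-check
        # hypothetical script: tests nginx on 443, Odoo on 8069 and that
        # PostgreSQL accepts writes, exiting 0 only when everything passes
        external-check command /usr/local/bin/check_odoo_rw.sh
        server vm1 10.0.1.11:443 check inter 3s
        server vm2 10.0.1.12:443 check inter 3s
        server vm3 10.0.1.13:443 check inter 3s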

As the bare metal servers are updated every week, including kernel updates that require a reboot, the above infrastructure guarantees no downtime.

By the way, in a multi-master setup (useful for MySQL), if a server reports unhealthy (e.g. after a MySQL crash), then serving from the failed node is wrong, as an error will be shown to the end user.