FreeBSD, CARP, pf-sync, HAProxy

Hi,

I have configured HAProxy with stick tables and syncing between proxies, on 2 hosts using CARP with a shared IP. However, when I cause a failover between the hosts by making the CARP IP move, my existing connection via HAProxy gets terminated.

My expectation is that these are being terminated because the TCP connections don’t exist on the CARP slave which takes over. However, from my experience using pfSense with CARP and pf-sync, I know it’s possible for 2 machines to share both an IP and all the existing states, in order to fail over and maintain existing connections.

I am wondering if it is possible to use both CARP with pf-sync and HAProxy with stick tables/peering, so that in the case of a CARP failover, the connection between the client and backend server can be maintained.

At the moment I am trying to get pf configured correctly (and wondering if I have to do something funny like routing/NATing the connection back to itself from the CARP IP to HAProxy on localhost)… Is this scenario something that is likely to work, or am I wasting my time, and the connection is going to be broken when a failover takes place regardless of which failover method is implemented?

No, you can’t do it.

PF is a firewall and forwards Layer 4 packets in this case, while rewriting some of the L3 and L4 headers. Haproxy is an application which terminates a TCP connection and forwards its payload on layer 7.

It’s theoretically possible to migrate TCP connections under Linux (TCP connection repair), but the application would also have to sync all of its layer 7 state, which is very difficult (and haproxy does not do that).

I don’t know if there is a similar API in FreeBSD, but haproxy cannot do this, whatever the OS may support.


Thanks for the reply.

Hmm, well, that’s a bummer. I was thinking that PF would take care of layer 3/4, the replicated stick tables would deal with layer 5, and then the two endpoints wouldn’t need layers 6/7 to be replicated anywhere, because the transport would appear not to change, given that the MAC and IP address of the HAProxy machine would stay the same.

In case it helps, this is what I’m (trying) to build:

Our internal application opens the connection (via HAProxy) to RabbitMQ, holds it open, and keeps reusing it (for as long as it stays open, i.e. days/weeks/longer…). The majority of the time this connection is idle, so there’s no payload/traffic to deal with syncing/recovering, just the connection state (although Murphy dictates that the proxy would fail while the connection was not idle).

I guess it would make more sense to be opening and closing the connection between transfers, rather than holding it open, and then whichever proxy/broker was up and running at that instant would deal with that transaction, negating the requirement to attempt to relocate an existing connection.

Somewhat related, I have been looking at the (newish) functionality of a HAProxy master and workers, and was considering implementing this (as we will eventually be putting more microservices up), but I guess that with this persistent connection, it would just result in a worker hanging around forever to service it?

No, MAC and IP addresses are unrelated to the problem. A layer 7 problem can’t be solved at layer 2.

You probably need something protocol aware.

If you are using haproxy for this, you are using it in TCP mode, which is not aware of the transactions.

I don’t know what any of this has to do with the master/worker architecture, though.

Are you talking about seamless reload?

Yes (whatever the configuration), persistent connections will block the old process on a reload, but the time can be limited with the hard-stop-after directive. A restart will just kill the old connections.
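For reference, hard-stop-after goes in the global section; a minimal sketch (the duration here is only an example) looks like this:

```
global
    # cap how long an old process may linger servicing old connections after a soft reload
    hard-stop-after 30s
```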

The correct solution for AMQP is to enable AMQP heartbeats and use clients that support them. Restarting and cleaning up long-running connections needs to be handled at the application layer, even if lower layers have their own heartbeat functionality.
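For example, on the RabbitMQ side the server-suggested heartbeat interval can be set in rabbitmq.conf (the 30 seconds here is only illustrative, and clients can still negotiate their own value):

```
# rabbitmq.conf - suggest a 30 second AMQP heartbeat to connecting clients
heartbeat = 30
```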

OK, so I took a step back and implemented this using PF rules alone. Rather than accepting/allowing the traffic through the CARP IP to HAProxy, I set up rules to NAT traffic coming into the CARP IP so that it is forwarded back out on the CARP IP (as HAProxy was doing), and to forward the port that HAProxy was bound to directly to RabbitMQ.

As a result of this, I am able to force a failover of the CARP IP to the secondary HAProxy node (without HAProxy running, but with the same pf rules loaded) by taking down the interface of the primary. Because pf-sync passes the states across to the backup node, my connection to RMQ is maintained with only a ~1 second pause while the CARP IP moves; otherwise neither end notices, i.e. the connection is not closed/dropped etc. I am also able to move it back again by bringing the primary node’s interface back up and then taking the secondary’s interface down (as I do not have CARP configured to preempt automatically).
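For anyone curious, the approach looks roughly like this (the interface name, addresses and ports are placeholders, and this is a sketch of the idea rather than my exact ruleset; pfsync itself is configured separately on a dedicated sync interface):

```
# pf.conf sketch - send traffic hitting the CARP IP straight to RabbitMQ
ext_if  = "em0"           # placeholder interface
carp_ip = "192.0.2.10"    # shared CARP address (placeholder)
rmq_ip  = "192.0.2.21"    # RabbitMQ backend (placeholder)
lb_port = "5672"          # port HAProxy used to listen on

# redirect connections arriving on the CARP IP/port to the RabbitMQ backend
rdr on $ext_if proto tcp from any to $carp_ip port $lb_port -> $rmq_ip port 5672

# source-NAT the redirected traffic so it leaves via the CARP IP (as HAProxy did)
nat on $ext_if proto tcp from any to $rmq_ip port 5672 -> $carp_ip

# pass the redirected traffic; the resulting states are what pfsync replicates
pass in  quick on $ext_if proto tcp from any to $rmq_ip port 5672 keep state
pass out quick on $ext_if proto tcp from any to $rmq_ip port 5672 keep state
```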

I realise I have gone totally off-topic and out of scope for HAProxy here, but I am just wondering what the advantage of using HAProxy in TCP mode is, and/or what the point of synchronised stick tables is, if HAProxy is unable to maintain an existing connection during/after a failover.

Is HAProxy just the wrong tool for the job for what I am trying to do? I ask this since PF supports tables of IP addresses, and I am able to add round-robin rules to NAT/proxy traffic to those IPs, along with source/destination stickiness etc., and I am able to dynamically add/remove IPs from the group, which seems to offer identical functionality to HAProxy, making it redundant.

I don’t mean for this post to come across as mean or belligerent (reading it back, it does sound that way); I’m just genuinely trying to understand the differences between the functionality that CARP/pf/pfsync offers and what HAProxy offers.

Is it just that HAProxy offers additional/advanced functionality which PF does not, such as health checking of servers (which I am thinking I can implement as a shell script using nc -z and dynamically altering the PF rules; see the sketch below), or other functionality which I am not using, such as SSL encryption/decryption to reduce the load on backend web servers, the stats interface, etc.?
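Something along these lines is what I have in mind for the check script (the table name, backend addresses and port are placeholders; the pf rdr rules would then point at the <rmq_servers> table, and the script would run from cron or a loop):

```
#!/bin/sh
# Hypothetical health-check sketch: keep a pf table of reachable RabbitMQ
# backends up to date, so round-robin rdr rules only use live servers.
TABLE="rmq_servers"
PORT=5672

for ip in 192.0.2.21 192.0.2.22; do            # backend IPs (placeholders)
    if nc -z -w 2 "$ip" "$PORT" >/dev/null 2>&1; then
        pfctl -t "$TABLE" -T add "$ip"         # reachable: make sure it is in the table
    else
        pfctl -t "$TABLE" -T delete "$ip"      # unreachable: pull it out
    fi
done
```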

The reason I ask is that I was given 2 standalone HAProxy machines (using CARP, without pf etc.) and tasked with finding a way to make them properly redundant and able to maintain existing connections in the case of a failover (which is why I was excited to find and implement the stick table replication functionality). Now it seems that I have been able to make the 2 machines proxy and fail over seamlessly without using HAProxy at all, and I’ll probably have to explain how/why.

Thanks, and sorry for rambling a bit.

It’s difficult for a userspace application that is relaying TCP traffic to migrate an in-flight TCP session. I don’t know of any - in the context of load balancers or reverse proxies - that can currently do this.

On the other hand, it’s not that difficult for a kernel that forwards this in L3/4. I’m sure this can be done with ip/nftables also.

I’d assume you get advantages regarding health checking, management, monitoring and logging, as well as the ability to apply features such as SSL termination or content switching based on (some of the) layer 4 payload.

Stick tables can be used for a number of things, none of which has to do with what you achieved with pf.

The number 1 use-case for stick-tables is client-persistence. Whether your client connects to the primary or the secondary haproxy instance, it will always be load-balanced to the same backend server. This is often an application requirement, when backend servers don’t share the same underlying data (storage/databases/other session storage).

So when the primary instance goes down and the client reconnects to the secondary haproxy node (after the former connection is dropped), it will connect to the same backend server as before.
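As a rough illustration (names, addresses and the table parameters are placeholders), the persistence part of such a setup typically looks something like this, with the same peers and stick-table configuration on both nodes:

```
peers mypeers
    # peer names must match each node's hostname (or the -L command line argument)
    peer haproxy1 192.0.2.11:10000
    peer haproxy2 192.0.2.12:10000

backend bk_rabbitmq
    mode tcp
    stick-table type ip size 100k expire 30m peers mypeers
    stick on src
    server rmq1 192.0.2.21:5672 check
    server rmq2 192.0.2.22:5672 check
```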

Other use-cases include anti-DDoS counters and abuse mitigation. I’m sure there are more.

Long running TCP sessions that can’t be interrupted are always a problem in these setups. But usually, when the actual use-case is revisited, they are not really required, and the applications can run some kind of keepalive and handle a shutdown gracefully.

If there’s a hard requirement not to drop the TCP sessions on a hardware/node failure, then yes, haproxy and many other products won’t be able to cover it, while straightforward L3/4 forwarding can do this easily.

Don’t worry, I get it. Your questions make perfect sense.

Simon, I’m interested to see how you managed the pf part of this. Maybe you can share that?