Sticky connections stopped after haproxy config change

Hi,

Source-IP persistence stopped working for us after a simple change to the haproxy config.
The change was to add a new backend server, marked as disabled in the config. As soon as this went live, our stats page stopped refreshing the backend server status and persistence stopped.
After each change we run the following command to reload the config gracefully:
haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)

This was a big problem for us and the issue remained after the servers were rebooted.

And the config is below:
#
log 127.0.0.1 local2

chroot      /var/lib/haproxy
pidfile     /var/run/haproxy.pid
maxconn     4000
user        haproxy
group       haproxy
daemon

# turn on stats unix socket
# stats socket /var/lib/haproxy/stats
listen stats x.x.x.x:1936
timeout connect 15m
mode http
stats enable
stats scope frontend_sites_http
stats scope ssl.ferries_http
stats scope ssl.ferries_https
stats scope ws.ferries_http
stats scope ws.ferries_https
stats scope https
stats uri /
stats hide-version
stats auth admin:

defaults
mode http
log global
option tcplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 2h
timeout queue 2h
timeout connect 10s
timeout client 2h
timeout server 2h
timeout http-keep-alive 30s
timeout check 30s
maxconn 3000

frontend frontend_sites_http
mode http
bind x.x.x.x:80
reqadd X-Forwarded-Proto:\ http
default_backend frontend_sites_http

frontend ssl.directferries_http
mode tcp
option tcplog
bind x.x.x.x:80
default_backend ssl.directferries_http

frontend ssl.directferries_https
bind x.x.x.x:443 ssl crt /etc/ssl/certs/ no-sslv3 ciphers
option forwardfor
option http-server-close
reqadd x-Forwarded-proto:\ https
acl blockedagent src -f /etc/haproxy/abusers.lst
http-request deny if blockedagent
default_backend ssl.ferries_https

frontend ws.directferries_http
mode tcp
option tcplog
bind x.x.x.x:80
default_backend ws.directferries_http

frontend ws.directferries_https
bind x.x.x.x:443 ssl crt /etc/ssl/certs/ no-sslv3 ciphers
option forwardfor
option http-server-close
reqadd x-Forwarded-proto:\ https
acl blockedagent src -f /etc/haproxy/abusers.lst
http-request deny if blockedagent
default_backend ws.ferries_https

backend frontend_sites_http
balance leastconn
option forwardfor
default-server inter 5s
stick-table type ip size 200k expire 2hr
stick on src
option tcp-check
server lin1 x.x.x.x:80:80 check port 80
server lin2 x.x.x.x:80:80 check port 80

backend ssl.ferries_http
balance leastconn
option forwardfor
default-server inter 5s
stick-table type ip size 200k expire 2hr
stick on src
option tcp-check
server web1 x.x.x.x:80 check port 80
server web2 x.x.x.x:80 check port 80
server web3 x.x.x.x:80 check port 80 disabled

backend ssl.ferries_https
balance leastconn
option tcplog
option forwardfor
stick-table type ip size 200k expire 2hr
stick on src
option tcp-check
default-server inter 3s
server web1 x.x.x.x:443 check id 1 ssl verify none
server web2 x.x.x.x:443 check id 2 ssl verify none
server web3 x.x.x.x:443 check id 3 ssl verify none disabled

backend ws.ferries_http
balance leastconn
option forwardfor
stick-table type ip size 200k expire 2hr
stick on src
default-server inter 5s
option tcp-check
server web1 x.x.x.x:80 check port 80
server web2 x.x.x.x:80 check port 80
server web3 x.x.x.x:80 check port 80 disabled

backend ws.ferries_https
balance leastconn
option tcplog
option forwardfor
stick-table type ip size 200k expire 2hr
stick on src
option tcp-check
default-server inter 3s
server web1 x.x.x.x:443 check id 1 ssl verify none
server web2 x.x.x.x:443 check id 2 ssl verify none
server web3 x.x.x.x:443 check id 3 ssl verify none disabled

There is nothing in the configuration that would maintain your stick-table across a reload or restart.

Read:
https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#4-stick-table

and:
https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#3.5
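
As a rough illustration of what the peers section linked above is for: a single haproxy instance can list itself as its only peer, so that on a graceful reload (`-sf`) the old process can hand its stick-table entries over to the new one. The sketch below is not taken from the posted setup. The section name `mypeers`, the peer name `lb1`, port 1024, the backend and server names and the 192.0.2.x addresses are all placeholders, and the peer name has to match the machine's hostname (or the name passed with `-L`).

```haproxy
# Hypothetical names and addresses throughout; adjust to the real setup.
peers mypeers
    # Local peer: the name must equal the hostname (or the -L argument).
    peer lb1 127.0.0.1:1024

backend app_http
    balance leastconn
    # "peers mypeers" ties this table to the peers section above,
    # so its entries can be handed to the new process on a soft reload.
    stick-table type ip size 200k expire 2h peers mypeers
    stick on src
    server web1 192.0.2.11:80 check
    server web2 192.0.2.12:80 check
```

This only covers a soft reload. After a full stop/start or a reboot the table starts empty again unless a second haproxy host is listed as a peer.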

Hi,

I didn't mean maintaining the stick-table after a restart. I meant that sticky sessions stopped working altogether.
Even after I changed the config back, connections would jump from one server to the other.
We had to restore the server to resolve the issue, but I need to establish how to fix this on our remaining broken server.

What does “restore the server” mean?

Hi,

I read through the documentation; it mentions peers as other haproxy hosts. We just have customers connecting to a single haproxy that connects to the backend servers. The stick-table must be getting source IP addresses from the frontend correctly, as this has worked for months in a live environment.
Also, this documentation is for version 1.7.5; I should have said we use 1.5.18.

By “restore the server” I mean we restored the haproxy virtual machine from a Veeam backup taken on a previous day, after which sticky connections worked again.

What do you suggest we try? Could you provide an example of how you would configure persistence based on source IP address between the frontend and backend servers?

Thanks

Joe

What you are saying then is:

  • after a configuration change, stickiness stopped working
  • a configuration rollback did not fix the issue
  • a VM reboot did not fix the issue
  • restoring a snapshot of the VM from a previous backup fixed the issue

This doesn’t make sense: haproxy’s behavior depends only on its configuration file, not on anything else that restoring a snapshot would bring back.

Probably the configuration change was more intrusive than you think, and the config roll-back did not really roll back all changes (which is why restoring a snapshot was the only way to recover from this).

There is nothing to analyze on the haproxy end. I suggest you investigate exactly what changed on this VM (in the haproxy configuration or otherwise) and/or try to reproduce it.
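
For reference, the source-IP persistence asked about earlier in the thread can be written as a small self-contained configuration, which may also help when trying to reproduce the problem in isolation. This is only a minimal sketch under assumptions: the names `app_front` and `app_http` and the 192.0.2.x addresses are placeholders, the socket path is the one commented out in the posted config, and the expiry uses the documented `h` time suffix rather than the `2hr` spelling in the posted config.

```haproxy
# Hypothetical names and addresses; not the poster's real setup.
global
    # Admin socket so the stick-table can be inspected at runtime.
    stats socket /var/lib/haproxy/stats level admin

defaults
    mode http
    timeout connect 10s
    timeout client  2h
    timeout server  2h

frontend app_front
    bind *:80
    default_backend app_http

backend app_http
    balance leastconn
    # One entry per client IP, expiring after 2 hours of inactivity.
    stick-table type ip size 200k expire 2h
    # Send a returning source address to the server recorded for it.
    stick on src
    server web1 192.0.2.11:80 check
    server web2 192.0.2.12:80 check
    # Present in the config but taken out of rotation.
    server web3 192.0.2.13:80 check disabled
```

With the socket enabled, sending `show table app_http` to it (for example through `socat`) lists the stored source addresses and the server id each one maps to, which is one way to check whether entries are being created at all.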