Hi all,
I have a setup with multiple HAProxy servers (running 1.6.3 from haproxy.org) balancing across a number of backend servers, using a stick-table to replicate backend target choices between the peers. The incoming sessions are long-lived, and all sessions in the same logical grouping (identified by the X-Foobar header value) need to hit the same backend.
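For reference, the relevant part of the setup looks roughly like the sketch below. This is a minimal illustration rather than my exact config (the real haproxy.cfg is in the gist linked further down); the peer names, backend and server names, addresses, and table sizing are all placeholders.

```
# Hypothetical peers section: one entry per HAProxy node.
# Names and addresses are placeholders, not taken from the real config.
peers lb_peers
    peer lb1 10.0.0.1:1024
    peer lb2 10.0.0.2:1024

backend app
    mode http
    balance roundrobin
    # Table keyed on the header value and synchronized with the peers above.
    stick-table type string len 64 size 100k expire 30m peers lb_peers
    # Stick each X-Foobar value to whichever server was first chosen for it.
    stick on req.hdr(X-Foobar)
    server app1 10.0.1.1:8080 check
    server app2 10.0.1.2:8080 check
```

The idea is that the first request carrying a given X-Foobar value picks a server via the balance algorithm, and every later request with that value, on any peer, should be sent to the same server.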
I also have a script that runs periodically to discover backends and regenerates the haproxy.cfg file whenever any have been added or removed (for blue/green deploys). After rewriting the config file, it issues a soft reload (haproxy -sf).
I’ve noticed, though, that sometimes right after the reload a new session comes in and is sent to a different backend than the one chosen prior to the reload. See the logs in this paste (haproxy.cfg is also in there):
https://gist.github.com/codeslinger/7c631fd18b30c41b57a23e949cf12d58
Note in the haproxy.log section therein that the connection that came in at 21:27:11 was placed on a different backend than the one that came in at 20:59:10, even though they had the same X-Foobar header value. The reload occurred at 21:27:03. Other sessions with this same X-Foobar value came in before the reload on all the peer HAProxy instances, and those were also directed to the correct backend. I imagine that means the record was in the stick-table and replicated properly prior to the reload, no?
My guess is that there is a race condition whereby the new process attaches to the listening ports and starts servicing new sessions before it has received some or all of the stick-table data from the old process. I’ve confirmed in the source code that the listening ports are bound in the new process before the SIGUSR1 is issued to the old process to tell it to stop service, but I haven’t yet found where the stick-table data is sent to the new process (i.e. no smoking gun for a bug report).
Does anyone have any ideas on how I can fix or work around this issue? Given the nature of our sessions, if they don’t all hit the same backend, it’s a really bad experience for our clients. I would sure appreciate any help anyone can give. Thanks!