From reading https://cbonte.github.io/haproxy-dconv/1.8/management.html#4, I understand that when haproxy reloads, a new process is created and binds to the same port as the old one. Then, when all connections to the old one have ended, the old process dies.
My question is: during this period when both processes are alive, can new connections go to the old process? There is nothing on the page linked above that explicitly says new connections can’t be handled by the old process, but it would seem odd if this was possible. However I’m seeing issues where this appears to be occurring. This causes errors for these new connections because in my situation, the old process no longer has valid backends.
I’m using haproxy inside a Docker container, if that is relevant.
As @lukastribus states, the old processes will only handle existing connections. If your old backends ‘go away’ during a wider process that includes the HAProxy reload and before these connections have naturally drained off, it would be expected you’d see errors.
@sjiveson
Ok thanks - I’ll have to do some more digging. I think http keepalives are causing problems by keeping TCP connections open to the old (and now incorrect) haproxy instance. Out of the box it’s hard to keep the old backends alive - I’m using Docker with its built-in rolling updates, which redeploy all containers in a service in sequence.
I’m a bit unclear about the docs though. option httpclose says:
Note that this option is deprecated since what it does is very cheap but not
reliable. Using "option http-server-close" or "option forceclose" is strongly
recommended instead.
Instead, I think I probably want to set timeout http-keep-alive to quite a low value? Currently I have timeout client 50000 (50 seconds, as the value is in milliseconds), which is the default value from the upstream project. The docs say that when timeout http-keep-alive is not set, timeout http-request applies, and if neither is set, timeout client applies. I think this is my problem: refreshing a browser page within 50 seconds will presumably have kept the old TCP connection open.
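To make the idea concrete, here is a minimal sketch of the kind of configuration being discussed, not a tested recommendation. Only `timeout client 50000` comes from the setup described above; the other values (2s, 10s, 30s) and the use of `hard-stop-after` are illustrative assumptions that would need checking against the 1.8 configuration manual and real traffic.

```
global
    # Assumption: cap how long an old process may linger after a reload.
    # Connections still open when this expires are closed forcibly.
    hard-stop-after 30s

defaults
    mode http
    # Close the server-side connection after each response, while still
    # allowing keep-alive on the client side.
    option http-server-close

    # Value from the current setup (milliseconds, so 50 seconds).
    timeout client 50000

    # Assumption: a short idle window between requests on a kept-alive
    # client connection, so idle connections to an old process expire quickly.
    timeout http-keep-alive 2s

    # Assumption: maximum time to wait for a complete HTTP request.
    timeout http-request 10s
```

With an explicit `timeout http-keep-alive`, an idle browser connection should no longer stay open for the full `timeout client` period, which is the behaviour suspected above of keeping clients attached to the old process.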
There seem to be a lot of different timeout settings. Is 50 seconds an appropriate value for timeout client? The docs just advise setting it slightly above a multiple of 3 seconds.
I’m not using systemd so I’m not sure what those bugs are. If haproxy closes the connections anyway, what is the purpose of the http-keep-alive setting?
Not being a network engineer, I’m pretty lost in all the settings and configuration options. I think I’m going to have to set up some semi-automated tests firing ajax requests from a browser at an endpoint, whilst trying various combinations of settings and seeing what happens.