HAProxy reload behaviour

Hello everyone,

Hopefully you guys can provide some feedback on this behaviour.

HAProxy running version 2.3 (I know :smile: a bit old-ish).
200+ Frontends (it's a chonky boy)
500+ Backends

Regular frontends & backends load balancing on HTTP
ALL servers run a Layer 7 health check: option httpchk GET /health HTTP/1.1\r\n

max-spread-checks 1
spread-checks 50
default-server inter 5s slowstart 60s rise 1 fall 3 on-marked-down shutdown-sessions

The reload settings:

hard-stop-after 180s

Behavior (what we are seeing)
During Reload operations, the new process is initiated successfully and sends the termination signal to the old one as expected.

However, since we run health checks on ALL servers, there are a few seconds in which the new process is already accepting connections but its health checks have not yet run or succeeded. During that window the new HAProxy process replies with NOSRV/503.

Is this expected behavior? Are we missing some setting? Can we somehow consider servers UP (by default)?

We even tweaked rise and max-spread-checks to try to get the health checks to run as quickly as possible.

Looking forward to your feedback.

P.S.: We only started noticing this behaviour recently. Could it be related to the growing number of servers to check?


For whoever reads this.

We were not propagating the server state between reloads, which caused this behavior. Using this feature (HAProxy version 2.5.5 - Configuration Manual), we are now able to carry the health check status over from the old process to the new one.
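
For anyone landing here with the same problem, this is a minimal sketch of what such a setup can look like, assuming the feature in question is the server state file (server-state-file plus load-server-state-from-file). The socket and file paths are placeholders I made up for illustration, not our real ones, and the dump step has to be wired into whatever script or unit performs your reload.

```
global
    # Admin socket used to dump the running state (hypothetical path).
    stats socket /var/run/haproxy.sock mode 600 level admin
    # File the new process reads server states from (hypothetical path).
    server-state-file /var/lib/haproxy/server-state
    hard-stop-after 180s

defaults
    # Load the state from the file named by server-state-file in the
    # global section, so servers keep their last known health status
    # instead of starting from scratch after a reload.
    load-server-state-from-file global

# Before each reload, dump the current state from the old process,
# for example (assumes socat is installed):
#   echo "show servers state" | socat stdio /var/run/haproxy.sock > /var/lib/haproxy/server-state
```

The state file is only as fresh as the last dump, so it should be written immediately before the reload is triggered, not on a schedule.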