I have a conceptional question to the high availability setup with HAProxy 2.2.
Our server setup is simple, we have production server S1 in one data center and a second one S2 in a different data center within a different country. On failure within one data center, the HAProxy switches to the backup server in the other data center.
For that, our backend config looked like this:
global
daemon
maxconn 30000
tune.bufsize 1048576
...
defaults
option http-buffer-request
option http-keep-alive
option forwardfor
timeout http-request 10s
timeout connect 5s
timeout client 30s
timeout queue 15s
timeout server 100s
timeout http-keep-alive 10s
retries 3
unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
unique-id-header X-HA-Request-Id
option log-health-checks
no option logasap
timeout client-fin 30s
timeout tarpit 15s
...
backend A
balance first
option httpchk GET /healthcheck HTTP/1.0
server S1_5557 1.2.3.4:5557 check inter 5000
server S2_5558 5.6.7.8:5558 check inter 5000 backup
retries 3
option redispatch 1
retry-on 502 503 504 empty-response conn-failure 0rtt-rejected
http-request disable-l7-retry if METH_POST
We’ve noticed that in this setup, as soon as we shut down the S1 server, it takes 5sec until the S2 server takes over. Requests coming in between two healthchecks get a HTTP 503 and are not forwarded to the backup automatically.
The assumption was, that the retry-on
option will trigger a replay of current request against the backup server as soon as a http 503 code is detected.
It works, if we remove the backup
option from the S2 server. As the backup server is in a different country, we do not want to have it in constant use (by removing the backup
option from it).
How would someone set up such a use case as we have it? The only requirement would be that no GET
request gets lost and is always served by one of the environments.