Can’t switch to backup server
While requests were flowing to the primary server, I made the health check fail (for example, by stopping the primary server’s service on port 8080).
I expected the requests to be routed to the servers that have the backup option set, but every request returned a 503 error.
The log says “Running on backup”, but traffic is not switched to the backup servers.
2024-12-16 17:34:00.023 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 1045ms, status: 9/10 UP.
2024-12-16 17:34:02.381 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 9/10 UP.
2024-12-16 17:34:04.054 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 1029ms, status: 8/10 UP.
2024-12-16 17:34:05.383 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 8/10 UP.
2024-12-16 17:34:07.055 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 7/10 UP.
2024-12-16 17:34:09.431 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 1047ms, status: 7/10 UP.
2024-12-16 17:34:10.057 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 6/10 UP.
2024-12-16 17:34:13.463 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 1031ms, status: 6/10 UP.
2024-12-16 17:34:14.102 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 1045ms, status: 5/10 UP.
2024-12-16 17:34:17.102 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 4/10 UP.
2024-12-16 17:34:17.495 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 1031ms, status: 5/10 UP.
2024-12-16 17:34:20.104 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 3/10 UP.
2024-12-16 17:34:20.495 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 4/10 UP.
2024-12-16 17:34:23.105 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 2/10 UP.
2024-12-16 17:34:23.496 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 3/10 UP.
2024-12-16 17:34:26.105 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 1/10 UP.
2024-12-16 17:34:26.498 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 2/10 UP.
2024-12-16 17:34:29.107 Health check for server be_default/server2 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 0/5 DOWN.
2024-12-16 17:34:29.107 Server be_default/server2 is DOWN. 1 active and 2 backup servers left. 36 sessions active, 0 requeued, 0 remaining in queue.
2024-12-16 17:34:29.498 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 1/10 UP.
2024-12-16 17:34:33.559 Health check for server be_default/server1 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 1060ms, status: 0/5 DOWN.
2024-12-16 17:34:33.559 Server be_default/server1 is DOWN. 0 active and 2 backup servers left. Running on backup. 37 sessions active, 0 requeued, 0 remaining in queue.
Even after the log above is written, every request still results in a 503 error.
2024-12-16 17:34:33.856 172.16.0.1:40000 [16/Dec/2024:17:34:27.223] fe_all be_default/sever2 0/1599/-1/-1/6633 503 217 - - SC-- 1429/1429/1428/10/3 0/44 "POST /app/app HTTP/1.1"
2024-12-16 17:34:33.985 172.16.0.1:47000 [16/Dec/2024:17:34:32.385] fe_all be_default/<NOSRV> 0/1599/-1/-1/1599 503 217 - - sQ-- 1419/1419/1417/0/0 0/20 "POST /app/app HTTP/1.1"
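As a reading aid, the two request log lines above can be split into named fields. This is only a sketch: the parse_httplog helper and its regular expression are hypothetical and assume the field order of HAProxy's "option httplog" format (TR/Tw/Tc/Tr/Ta timers, then the termination state, then the actconn/feconn/beconn/srv_conn/retries counters and the srv_queue/backend_queue pair). The lines are copied verbatim from this post, including the timestamp prefix.

```python
import re

# Field order assumed from HAProxy's "option httplog" format.
HTTPLOG = re.compile(
    r'(?P<client>\S+) \[(?P<accept_date>[^\]]+)\] (?P<frontend>\S+) '
    r'(?P<backend>[^/\s]+)/(?P<server>\S+) '
    r'(?P<TR>-?\d+)/(?P<Tw>-?\d+)/(?P<Tc>-?\d+)/(?P<Tr>-?\d+)/(?P<Ta>-?\d+) '
    r'(?P<status>\d{3}) (?P<bytes>\d+) \S+ \S+ (?P<term>\S{4}) '
    r'(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/(?P<srv_conn>\d+)/(?P<retries>\+?\d+) '
    r'(?P<srv_queue>\d+)/(?P<backend_queue>\d+) "(?P<request>[^"]*)"'
)

INT_FIELDS = ("TR", "Tw", "Tc", "Tr", "Ta", "status", "bytes", "actconn",
              "feconn", "beconn", "srv_conn", "retries", "srv_queue", "backend_queue")

# Meaning of the first two termination-state characters seen in this post.
TERMINATION = {
    "SC": "server refused or aborted the TCP connection while HAProxy was connecting",
    "sQ": "request waited in the queue until the queue timeout expired",
}

def parse_httplog(line):
    """Return the fields of one HTTP log line as a dict, or None if it does not match."""
    m = HTTPLOG.search(line)  # search() skips the leading syslog timestamp
    if m is None:
        return None
    fields = m.groupdict()
    for name in INT_FIELDS:
        fields[name] = int(fields[name])
    fields["term_meaning"] = TERMINATION.get(fields["term"][:2], "not covered by this sketch")
    return fields

LINE_SC = ('2024-12-16 17:34:33.856 172.16.0.1:40000 [16/Dec/2024:17:34:27.223] fe_all '
           'be_default/sever2 0/1599/-1/-1/6633 503 217 - - SC-- 1429/1429/1428/10/3 0/44 '
           '"POST /app/app HTTP/1.1"')
LINE_SQ = ('2024-12-16 17:34:33.985 172.16.0.1:47000 [16/Dec/2024:17:34:32.385] fe_all '
           'be_default/<NOSRV> 0/1599/-1/-1/1599 503 217 - - sQ-- 1419/1419/1417/0/0 0/20 '
           '"POST /app/app HTTP/1.1"')

if __name__ == "__main__":
    for line in (LINE_SC, LINE_SQ):
        f = parse_httplog(line)
        print(f["server"], f["term"], "Tw=%d" % f["Tw"], "retries=%d" % f["retries"],
              "backend_queue=%d" % f["backend_queue"], "->", f["term_meaning"])
```

Read this way, the first request was refused on the server side after 3 retries (SC), and the second one never got a server (<NOSRV>) and left the queue after 1599 ms (sQ), which is close to the 1600ms queue timeout in the config. Both lines also show a non-empty backend queue (44 and 20).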
Is my configuration incorrect?
Supplementary information
- If I stop the haproxy service while the primary server service is stopped, and then start it after a certain time, requests will flow to the backup server.
- If I stop the primary server service while no requests are being sent, and then send requests after that, they will flow to the backup server.
HAProxy version: 2.6.13
Config: as below
###
### /etc/haproxy/haproxy.cfg
###
# Basic config mapping a listening IP:port to another host's IP:port with
# support for HTTP/1 and 2.
global
    chroot /var/lib/haproxy
    log 127.0.0.1 daemon
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    quiet
    nbthread 1
    stats socket /var/lib/haproxy/stats.socket user app group appgrp level admin
    stats timeout 2m

defaults
    mode http
    log global
    option log-health-checks
    option httplog
    option httpclose
    retries 3
    timeout http-request 10s
    timeout connect 3s
    timeout client 1m
    timeout http-keep-alive 10s
    timeout check 3s

frontend fe_all
    bind *:8080
    default_backend be_default
    maxconn 1000000

backend be_default
    balance roundrobin
    option httpchk GET /app/healthCheck
    option allbackups
    timeout queue 1600ms
    timeout server 1600ms
    server server1 server1.net:8080 maxconn 100 check inter 3s fall 10 rise 5
    server server2 server2.net:8080 maxconn 100 check inter 3s fall 10 rise 5
    server server3 server3.net:8080 maxconn 100 check inter 3s fall 10 rise 5 backup
    server server4 server4.net:8080 maxconn 100 check inter 3s fall 10 rise 5 backup
#--EOF--
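
For reference, the "inter 3s fall 10" settings on the server lines can be compared with the health-check log above. The sketch below is only an illustration: the function names are hypothetical, and it assumes failed checks are spaced about one inter apart, so the time from the first failed check to DOWN is at least (fall - 1) * inter, plus whatever each check itself takes. The timestamps are the first failed check and the "is DOWN" line for each server, copied from the log in this post.

```python
from datetime import datetime

INTER_S = 3   # "inter 3s" on the server lines
FALL = 10     # "fall 10" on the server lines

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def min_detection_window(inter_s, fall):
    """Lower bound in seconds from the first failed check to DOWN.

    Assumes 'fall' consecutive failures spaced 'inter_s' apart; check durations add to this.
    """
    return (fall - 1) * inter_s

def observed_window(first_fail, marked_down):
    """Seconds between two log timestamps given in LOG_TIME_FORMAT."""
    start = datetime.strptime(first_fail, LOG_TIME_FORMAT)
    end = datetime.strptime(marked_down, LOG_TIME_FORMAT)
    return (end - start).total_seconds()

# (first failed check, "is DOWN" line) taken from the log above
SERVER2 = ("2024-12-16 17:34:00.023", "2024-12-16 17:34:29.107")
SERVER1 = ("2024-12-16 17:34:02.381", "2024-12-16 17:34:33.559")

if __name__ == "__main__":
    print("lower bound:", min_detection_window(INTER_S, FALL), "s")
    print("server2 observed:", observed_window(*SERVER2), "s")
    print("server1 observed:", observed_window(*SERVER1), "s")
```

With these values the lower bound is 27 s, and the log shows about 29 s for server2 and about 31 s for server1. Throughout that window the health-check lines still report the servers as UP (9/10 UP down to 1/10 UP), and the first 503 in the request log was accepted at 17:34:27, before either server was marked DOWN.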