We have been using HAProxy for a while now and recently started using an option that is new to us: http-check disable-on-404.
For a few days this seemed to work properly, until the servers started ‘switching’.
As the log output below shows, server002 was OK at first: it returned status 404 and the check was reported as ‘conditionally succeeded’.
Then server002 temporarily switched to UP, returning status 200. After that it failed, first with “Connection refused”, which of course should result in a failed state. But when the server recovered and returned status 404 again, it stayed in the failed state. The only way to get it back to ‘conditionally succeeded’ was to reload HAProxy. I believe this is a bug.
The configuration is below:
backend backend
mode http
balance roundrobin
option allbackups
http-reuse always
# health checks
option httpchk GET /isActive
http-check disable-on-404
http-check expect status 200
default-server slowstart 30s check inter 10s fall 3 rise 3
server server001 10.10.0.1:8080 weight 100
server server002 10.10.0.2:8080 weight 100
The log output is below:
Jan 29 12:57:22 loadbalancer haproxy[18143]: Health check for server backend/server001 succeeded, reason: Layer7 check passed, code: 200, info: "HTTP status check returned code <3C>200<3E>", check duration: 1ms, status: 3/3 UP.
Jan 29 12:57:22 loadbalancer haproxy[18143]: Health check for server backend/server002 conditionally succeeded, reason: Layer7 check conditionally passed, code: 404, info: "Not Found", check duration: 2ms, status: 3/3 UP.
Jan 29 12:57:22 loadbalancer haproxy[18143]: Server backend/server002 is stopping. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Jan 29 13:21:02 loadbalancer haproxy[18143]: Health check for server backend/server001 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 2/3 UP.
Jan 29 13:21:02 loadbalancer haproxy[18143]: Health check for server backend/server002 succeeded, reason: Layer7 check passed, code: 200, info: "HTTP status check returned code <3C>200<3E>", check duration: 1ms, status: 3/3 UP.
Jan 29 13:21:02 loadbalancer haproxy[18143]: Server backend/server002 is UP. 2 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Jan 29 13:21:12 loadbalancer haproxy[18143]: Health check for server backend/server001 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 1/3 UP.
Jan 29 13:21:22 loadbalancer haproxy[18143]: Health check for server backend/server001 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 0/3 DOWN.
Jan 29 13:21:22 loadbalancer haproxy[18143]: Server backend/server001 is DOWN. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jan 29 13:21:32 loadbalancer haproxy[18143]: Server backend/server002 is UP. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Jan 29 13:22:59 loadbalancer haproxy[18143]: Health check for server backend/server001 failed, reason: Layer7 wrong status, code: 404, info: "HTTP status check returned code <3C>404<3E>", check duration: 4ms, status: 0/3 DOWN.
Jan 29 13:23:29 loadbalancer haproxy[18143]: Health check for server backend/server001 succeeded, reason: Layer7 check passed, code: 200, info: "HTTP status check returned code <3C>200<3E>", check duration: 2ms, status: 1/3 DOWN.
Jan 29 13:23:32 loadbalancer haproxy[18143]: Health check for server backend/server002 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 2/3 UP.
Jan 29 13:23:39 loadbalancer haproxy[18143]: Health check for server backend/server001 succeeded, reason: Layer7 check passed, code: 200, info: "HTTP status check returned code <3C>200<3E>", check duration: 2ms, status: 2/3 DOWN.
Jan 29 13:23:42 loadbalancer haproxy[18143]: Health check for server backend/server002 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 1/3 UP.
Jan 29 13:23:49 loadbalancer haproxy[18143]: Health check for server backend/server001 succeeded, reason: Layer7 check passed, code: 200, info: "HTTP status check returned code <3C>200<3E>", check duration: 3ms, status: 3/3 UP.
Jan 29 13:23:49 loadbalancer haproxy[18143]: Server backend/server001 is UP. 2 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Jan 29 13:23:52 loadbalancer haproxy[18143]: Health check for server backend/server002 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 0/3 DOWN.
Jan 29 13:23:52 loadbalancer haproxy[18143]: Server backend/server002 is DOWN. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jan 29 13:24:19 loadbalancer haproxy[18143]: Server backend/server001 is UP. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Jan 29 13:25:12 loadbalancer haproxy[18143]: Health check for server backend/server002 failed, reason: Layer4 timeout, check duration: 10000ms, status: 0/3 DOWN.
Jan 29 13:25:22 loadbalancer haproxy[18143]: Health check for server backend/server002 failed, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms, status: 0/3 DOWN.
Jan 29 13:25:32 loadbalancer haproxy[18143]: Health check for server backend/server002 failed, reason: Layer7 wrong status, code: 404, info: "HTTP status check returned code <3C>404<3E>", check duration: 4ms, status: 0/3 DOWN.
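In case it helps anyone follow the sequence per server, here is a minimal Python sketch that pulls the timestamp, server, check result, HTTP code and resulting state out of health-check lines like the ones above. The regular expression and the names CHECK_RE and parse_check_line are my own assumptions, written against the lines in this post only; they are not an official HAProxy log grammar and may need adjusting for other log formats.

```python
import re

# Pattern written against the health-check lines quoted in this post.
# This is an assumption, not an official HAProxy log format definition.
CHECK_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ haproxy\[\d+\]: "
    r"Health check for server (?P<server>\S+) "
    r"(?P<result>conditionally succeeded|succeeded|failed), "
    r"reason: (?P<reason>[^,]+)"
    r"(?:, code: (?P<code>\d+))?"           # Layer4 failures carry no HTTP code
    r".*status: (?P<status>\d+/\d+ \w+)\.\s*$"
)


def parse_check_line(line):
    """Return a dict for a health-check line, or None for any other line."""
    match = CHECK_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["code"] is not None:
        fields["code"] = int(fields["code"])
    return fields


def checks_for(lines, server):
    """Collect (timestamp, result, code, status) tuples for one server."""
    rows = []
    for line in lines:
        entry = parse_check_line(line)
        if entry is not None and entry["server"] == server:
            rows.append((entry["ts"], entry["result"], entry["code"], entry["status"]))
    return rows
```

Feeding the log lines above to checks_for(lines, "backend/server002") gives one tuple per health check for that server, which makes it easier to see that the last 404 is logged as ‘failed’ while the first one was ‘conditionally succeeded’.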