HAProxy - Gracefully close connections on servers

We have run into a snag in our deployment process: when we remove a server from rotation, its persistent connections are dropped. We currently use cookie-based persistence, and we’d like those connections to move gracefully to another server instead of being dropped.

Here is what we are doing to remove a server from rotation:

  1. Change the state of the server to DRAIN (via a socat command). This stops new connections from being assigned to the server; however, persistent connections still reach it.
  2. Change the contents of “health.html” to “DOWN”. This marks the server as “DOWN”, but all connections are dropped and users are bounced to another server.
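
For reference, step 1 looks roughly like the sketch below. The socket path and the be-https/s1 names come from the config further down; the helper names `haproxy_cmd` and `drain_server` and the `DRY_RUN` switch are made up for illustration and are not part of HAProxy.

```shell
#!/bin/sh
# Hypothetical helpers. The socket path matches the "stats socket" line
# in the posted config; adjust it if yours differs.
HAPROXY_SOCK="${HAPROXY_SOCK:-/run/haproxy/admin.sock}"

# Send one command to the HAProxy runtime API.
# With DRY_RUN=1 the command is only printed, not sent.
haproxy_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$1" | socat stdio "unix-connect:${HAPROXY_SOCK}"
    fi
}

# Put <backend>/<server> into the DRAIN state.
drain_server() {
    haproxy_cmd "set server $1/$2 state drain"
}
```

Calling `drain_server be-https s1` sends `set server be-https/s1 state drain` to the socket. As far as we understand the documentation, DRAIN only stops load-balanced traffic, so requests carrying the persistence cookie still reach the server, which matches what we see.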

We cannot work out which step we are missing between #1 and #2. We have tried the following:

  • Incorporating the “MAINT” status
  • Setting the maxconn value on a server to -1
  • Renaming the “health.html” file instead of changing its contents. This causes the server to be marked as “NOLB”.
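
To compare these attempts, it helps to watch how many sessions are still attached to the server after each change. A minimal sketch, assuming the usual `show stat` CSV layout in which `scur` (current sessions) is the fifth field; the `server_scur` helper name is hypothetical:

```shell
#!/bin/sh
# Read "show stat" CSV on stdin and print the current session count (scur,
# 5th column) for the row matching the given backend and server names.
server_scur() {
    awk -F, -v px="$1" -v sv="$2" '$1 == px && $2 == sv { print $5 }'
}

# Example against a live socket (path and names taken from the posted config):
#   echo "show stat" | socat stdio unix-connect:/run/haproxy/admin.sock \
#       | server_scur be-https s1
```

If the number keeps climbing or never falls after the state change, clients with the persistence cookie are presumably still being routed to that server.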

Does anyone have any suggestions?

Below is the HAProxy config:

    global
        maxconn 30000
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon
        nbthread 48

        tune.bufsize 32768
        tune.ssl.cachesize 30000
        tune.ssl.lifetime  600

        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
        ssl-default-bind-options no-sslv3

        stats socket ipv4@127.0.0.1:9999 level admin
        stats socket /var/run/haproxy.sock mode 666 level admin

    defaults
            log     global
            mode    http
            option  httplog
            option  dontlognull
            timeout connect 121000
            timeout client  121000
            timeout server  121000
            errorfile 400 /etc/haproxy/errors/400.http
            errorfile 403 /etc/haproxy/errors/403.http
            errorfile 408 /etc/haproxy/errors/408.http
            errorfile 500 /etc/haproxy/errors/500.http
            errorfile 502 /etc/haproxy/errors/502.http
            errorfile 503 /etc/haproxy/errors/503.http
            errorfile 504 /etc/haproxy/errors/504.http

    frontend fe_main
            bind :80
            bind :443 ssl crt /etc/cc-ssl/[redacted].pem crt /etc/cc-ssl/[redacted].pem
            reqadd X-Forwarded-Proto:\ https

            http-request redirect scheme https unless { ssl_fc }

            default_backend be-https

    frontend stats
            bind *:8404
            stats enable
            stats uri /stats

    backend be-https
            balance roundrobin
            cookie NUMID insert indirect nocache
            option httpchk GET /health.html HTTP/1.1\r\nHost:\ www
            http-check disable-on-404
            http-check expect string UP
            default-server inter 3s fall 2 rise 2 slowstart 5m
            server s1 10.10.10.1:443 ssl verify none check cookie 1
            server s2 10.10.10.2:443 ssl verify none check cookie 2
            server s3 10.10.10.3:443 ssl verify none check cookie 3
            server s4 10.10.10.4:443 ssl verify none check cookie 4
haproxy -vv output:
--------------------------------------
HA-Proxy version 2.0.10-1ppa1~bionic 2019/11/26 - https://haproxy.org/
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -O2 -fdebug-prefix-map=/build/haproxy-M3LRQ8/haproxy-2.0.10=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-format-truncation -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered -Wno-missing-field-initializers -Wno-implicit-fallthrough -Wno-stringop-overflow -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
  OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_ZLIB=1 USE_SYSTEMD=1

Feature list : +EPOLL -KQUEUE -MY_EPOLL -MY_SPLICE +NETFILTER -PCRE -PCRE_JIT +PCRE2 +PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +REGPARM -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H -VSYSCALL +GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 -MY_ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=2).
Built with OpenSSL version : OpenSSL 1.1.1  11 Sep 2018
Running on OpenSSL version : OpenSSL 1.1.1  11 Sep 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with network namespace support.
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with PCRE2 version : 10.31 2018-02-12
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with the Prometheus exporter as a service

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTX        side=FE|BE     mux=H2
              h2 : mode=HTTP       side=FE        mux=H2
       <default> : mode=HTX        side=FE|BE     mux=H1
       <default> : mode=TCP|HTTP   side=FE|BE     mux=PASS

Available services :
        prometheus-exporter

Available filters :
        [SPOE] spoe
        [COMP] compression
        [CACHE] cache
        [TRACE] trace

Try adding ‘option redispatch’ to the backend.

I’m his teammate. We tried adding ‘option redispatch’ to the backend, but it didn’t seem to make any difference.

Do you think we need to add ‘option http-server-close’?