HAProxy community

External health check not updated with changed server IP address

We have a Docker Swarm setup in which we run HAProxy (version 2.0) as a service. For our MariaDB database services we use an external health check script, because it needs to check the replication status of a Galera cluster. When testing failover, we notice that the arguments HAProxy passes to the health check script (https://cbonte.github.io/haproxy-dconv/2.0/configuration.html#external-check%20command) are not updated with the new IP address, which changes when the MariaDB service is scaled down and back up. The stats page and the show servers state command, however, ARE updated immediately to reflect the new IP address. After a very long time the IP arguments do seem to be updated for the external health check as well, but I'm wondering whether this is a bug in HAProxy or the effect of some particular timeout setting. To be clear, we use a resolvers section named docker so that HAProxy can query Docker Swarm for the correct IPs. Below is our HAProxy configuration:

global
    log          fd@2 local2
    pidfile      /var/run/haproxy.pid
    maxconn      4000
    user         haproxy
    group        haproxy
    stats socket /var/lib/haproxy/stats expose-fd listeners
    external-check

resolvers docker
    nameserver dns 127.0.0.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry   1s
    hold other      10s
    hold refused    10s
    hold nx         10s
    hold timeout    10s
    hold valid      10s
    hold obsolete   10s

defaults
    timeout connect 10s
    timeout client 30s
    timeout server 30s
    
    timeout tunnel 1h
    load-server-state-from-file none
    option httplog

    default-server init-addr libc,none


frontend fe_web
    bind *:80
    mode http
    use_backend stat if { path -i /my-stats }
    default_backend be_php
    option http-keep-alive

    acl host_php hdr(host) -i testcluster.cregora2.nmbtz.com
    acl host_phpmyadmin hdr(host) -i testcluster.phpmyadmin.nmbtz.com
    acl host_portainer hdr(host) -i testcluster.nagios.nmbtz.com
    acl host_stats hdr(host) -i proxystats.simbuka.nmbtz.com

    use_backend be_php if host_php
    use_backend stat if host_stats
    use_backend be_phpmyadmin if host_phpmyadmin
    use_backend be_portainer if host_portainer

frontend fe_mysql
    bind *:3306
    default_backend be_mysql

frontend fe_redis
    bind *:6379
    mode tcp
    default_backend be_redis

backend be_php
    balance leastconn
    balance roundrobin
    mode http
    option httpchk
    stick-table  type binary  len 8  size 100k  expire 10s  store http_req_rate(10s)

    # Track client by base32+src (Host header + URL path + src IP)
    http-request track-sc0 base32+src

    # Check map file to get rate limit for path
    http-request set-var(req.rate_limit)  path,map_beg(/etc/haproxy/rates.map,20)

    # Client's request rate is tracked
    http-request set-var(req.request_rate)  base32+src,table_http_req_rate()

    # Subtract the current request rate from the limit
    # If less than zero, set rate_abuse to true
    acl rate_abuse var(req.rate_limit),sub(req.request_rate) lt 0

    # Deny if rate abuse
    http-request deny deny_status 429 if rate_abuse

    timeout connect 120s
    timeout server 120s

    acl is_api path -i -m beg /api

    errorfile 429 /etc/haproxy/errorfiles/too_many_requests.http if is_api
    errorfile 503 /etc/haproxy/errorfiles/server_not_found.http if is_api

    server-template php- 4 php:80 check resolvers docker init-addr last,libc,none maxconn 50

backend be_phpmyadmin
    balance roundrobin
    mode http
    server-template phpmyadmin- 2 phpmyadmin:80 check resolvers docker init-addr last,libc,none maxconn 20

backend be_portainer
    balance roundrobin
    mode http
    server-template portainer- 2 portainer:9000 check resolvers docker init-addr last,libc,none maxconn 20
    option forwardfor
    option http-keep-alive

backend be_mysql
    balance leastconn
    option external-check
    external-check path "/etc/haproxy:/bin"
    external-check command /var/lib/haproxy/galerahealthcheck.sh
    #option log-health-checks
    timeout check 2s
    server-template mysql-1- 1 mysql1:3306 check inter 10s resolvers docker init-addr libc,none maxconn 200
    server-template mysql-2- 1 mysql2:3306 check inter 10s resolvers docker init-addr libc,none maxconn 200
    server-template mysql-3- 1 mysql3:3306 check inter 10s resolvers docker init-addr libc,none maxconn 200


backend be_redis
    balance roundrobin
    timeout connect     60000
    timeout server      60000
    retries         3
    mode tcp
    option tcp-check
    tcp-check connect port 6379
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    tcp-check send QUIT\r\n
    tcp-check expect string +OK
    server-template redis-1- 1 redis:6379 check port 6379 inter 2s resolvers docker init-addr libc,none maxconn 10
    server-template redis-2- 1 redis-slave:6379 check port 6379 inter 2s resolvers docker init-addr libc,none maxconn 10
    server-template redis-3- 1 redis-slave2:6379 check port 6379 inter 2s resolvers docker init-addr libc,none maxconn 10
    server-template redis-4- 1 redis-slave3:6379 check port 6379 inter 2s resolvers docker init-addr libc,none maxconn 10


backend stat
    stats enable
    stats uri /my-stats
    stats refresh 15s
    stats show-legends
    stats show-node
    mode http

I added logging to the external health check, for example:


The IP address passed in the arguments after the MariaDB service was scaled down and back up is the old IP from before it was taken down. The stats page and the show servers state command, however, immediately show the correct IP address…
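For anyone who wants to reproduce this, here is a minimal sketch of the kind of logging that could be added to an external check script. It relies on the documented calling convention (the script receives `<proxy_address> <proxy_port> <server_address> <server_port>` as arguments, plus environment variables such as HAPROXY_SERVER_NAME, HAPROXY_SERVER_ADDR and HAPROXY_SERVER_PORT). The function names, the LOG_FILE variable and the default log path are assumptions made for illustration, not part of the original script.

```shell
#!/bin/sh
# Hypothetical logging helpers for an external-check script (names are illustrative).
# HAProxy calls the check command as:
#   <command> <proxy_address> <proxy_port> <server_address> <server_port>
# and also sets HAPROXY_SERVER_NAME, HAPROXY_SERVER_ADDR, HAPROXY_SERVER_PORT, etc.

# Build one line showing the positional arguments next to the environment values,
# so a stale address in either place is easy to spot.
format_check_line() {
    printf 'server=%s arg_addr=%s arg_port=%s env_addr=%s env_port=%s\n' \
        "${HAPROXY_SERVER_NAME:-unknown}" "$3" "$4" \
        "${HAPROXY_SERVER_ADDR:-unset}" "${HAPROXY_SERVER_PORT:-unset}"
}

# Append a timestamped line to a log file.
# LOG_FILE and its default path are assumptions; the haproxy user must be able to write there.
log_check() {
    echo "$(date '+%Y-%m-%dT%H:%M:%S%z') $(format_check_line "$@")" \
        >> "${LOG_FILE:-/tmp/galerahealthcheck.log}"
}

# In the real check script one would call, near the top:
#   log_check "$@"
```

Note that the script runs with the PATH given by `external-check path`, so `date` has to be reachable from one of those directories.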

Anybody?

Bumping this up as we also observe the same behaviour. Any suggestions?

EDIT: in our case the HAPROXY_SERVER_ADDR environment variable is still set to the old IP address, even though we can clearly see in the HAProxy logs that it did indeed pick up the change from DNS:

2020-02-24T14:26:33+01:00 localhost haproxy-[REDACTED][26835]: bk_redis/bkredis-621 changed its IP from [REDACTED].90 to [REDACTED].37 by bk_dns/dns1.
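A rough way to cross-check this from the outside is to pull the address HAProxy currently holds for each server from the runtime API and compare it with what the check script receives. The sketch below assumes `socat` is installed and that the stats socket is at the path used in the configuration earlier in this thread. The helper name `server_addrs` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "srv_name srv_addr" pairs from `show servers state`
# output read on stdin. The version line and the '#' header line are skipped.
# In that output, srv_name is the 4th field and srv_addr the 5th.
server_addrs() {
    awk '$1 !~ /^#/ && NF >= 5 { print $4, $5 }'
}

# Usage (socket path taken from the config above; backend name as an example):
#   echo "show servers state be_mysql" | socat stdio /var/lib/haproxy/stats | server_addrs
```

If the address printed here differs from the HAPROXY_SERVER_ADDR value seen by the check script for the same server, that shows the mismatch independently of the log line above.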

Please file a bug at our GitHub issue tracker:

Also please confirm that the issue is also present when running with nbthread 1.