HAProxy 1.8.2 and 1.8.3: DNS auto-discovery stops working

resolvers dns1
nameserver dns 172.30.0.2:53
resolve_retries 3
timeout retry 1s
hold valid 10s
accepted_payload_size 8192

global
maxconn 41666
nbproc 6

pidfile /var/run/haproxy.pid
stats socket /var/run/haproxy-1.sock uid 0 gid 0 mode 0666 level admin process 1
stats socket /var/run/haproxy-2.sock uid 0 gid 0 mode 0666 level admin process 2
stats socket /var/run/haproxy-3.sock uid 0 gid 0 mode 0666 level admin process 3
stats socket /var/run/haproxy-4.sock uid 0 gid 0 mode 0666 level admin process 4
stats socket /var/run/haproxy-5.sock uid 0 gid 0 mode 0666 level admin process 5
stats socket /var/run/haproxy-6.sock uid 0 gid 0 mode 0666 level admin process 6

user haproxy
group haproxy
daemon
quiet
tune.ssl.default-dh-param 2048
spread-checks 5s
log 127.0.0.1 local0 info

tune.bufsize 10000
tune.maxaccept -1

defaults
mode http

log global
option httplog
option dontlog-normal

# Forward request headers from the original client to the backend

option forwardfor

# These make a big difference in the rate at which we can process requests

option tcp-smart-connect
option tcp-smart-accept
option splice-auto

# Set pretty aggressive timeouts

timeout connect 5s
timeout client 30s
timeout server 30s
timeout queue 5s
timeout http-request 30s
timeout http-keep-alive 60s

# Keep backend connections alive

no option http-server-close

# Retry if you can’t connect immediately

option redispatch 1 # redispatch on every retry
retries 3

frontend http-in
bind *:80
mode http

maxconn 31249
backlog 62498

compression algo gzip

acl is_204 status eq 204
acl no_content hdr_val(content-length) lt 1
http-response del-header Content-Length if is_204
http-response del-header Content-Type if is_204
http-response del-header Content-Type if no_content

default_backend *-us-east-1

frontend https-in
bind :443 ssl crt /etc/haproxy/star_.pem
reqadd X-Forwarded-Proto:\ https

maxconn 10416
backlog 20832

acl is_204 status eq 204
acl no_content hdr_val(content-length) lt 1
http-response del-header Content-Length if is_204
http-response del-header Content-Type if no_content

default_backend *-us-east-1

backend *-us-east-1
balance leastconn

http-reuse aggressive

http-check expect status 200
option httpchk GET /status_server

server-template -rr 70 _http._tcp.us-east-1.com resolvers dns1 resolve-prefer ipv4 check inter 1s downinter 60s fall 3 weight 1

listen stats
bind *:1936
stats enable
bind-process 1
stats uri /stats
stats hide-version
stats auth :
stats refresh 30s

  1. Enabled; it shows nothing interesting, the new servers simply never come up.
  2. Done.
  3. I am not using nbthread.
  4. No reload and no restart.
  5. I was wrong: it recovers on restart (I still need to check reload).
  6. It gets worse over time; it was better when there were no more than 25-30 servers.
  7. The old servers had just been terminated.
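
Since the config runs `nbproc 6` with one admin socket per process, each process has its own resolver state, so it may help to compare them side by side. Below is a minimal sketch, assuming the socket paths from the `global` section above and the runtime API command `show resolvers dns1`. The helper names (`socket_paths`, `run_command`) are hypothetical, and the script only prints the raw replies without interpreting them.

```python
import socket

# Assumption: paths mirror the "stats socket" lines in the global section.
SOCKET_TEMPLATE = "/var/run/haproxy-{n}.sock"
NBPROC = 6  # matches "nbproc 6"


def socket_paths(nbproc=NBPROC, template=SOCKET_TEMPLATE):
    """Return one admin socket path per HAProxy process (hypothetical helper)."""
    return [template.format(n=n) for n in range(1, nbproc + 1)]


def run_command(path, command, timeout=5.0):
    """Send one runtime API command to a stats socket and return the raw reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        # In non-interactive mode HAProxy closes the connection after the reply.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def main():
    for path in socket_paths():
        print(f"== {path} ==")
        try:
            # "dns1" is the resolvers section name from the config above.
            print(run_command(path, "show resolvers dns1"))
        except OSError as exc:
            print(f"could not query {path}: {exc}")


if __name__ == "__main__":
    main()
```

Running it as root on the HAProxy host, once while resolution works and once after it stops, should show whether the counters for the nameserver keep moving in every process or only in some of them.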