Haproxy 2.0.4 AWS ECS SRV failure

Hi, I can’t figure out why a server-template backend that resolves SRV records is failing.

ERROR:
/usr/local/etc/haproxy # /docker-entrypoint.sh haproxy -f haproxy.cfg 
Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result FAILED
Total: 3 (2 usable), will use epoll.

Available filters :
	[SPOE] spoe
	[COMP] compression
	[CACHE] cache
	[TRACE] trace
Using epoll() as the polling mechanism.
[WARNING] 223/173344 (46) : [haproxy.main()] Cannot raise FD limit to 8225, limit is 4096.
[ALERT] 223/173344 (46) : sendmsg()/writev() failed in logger #1: No such file or directory (errno=2)
[WARNING] 223/173344 (46) : [haproxy.main()] FD limit (4096) too low for maxconn=4096/maxsock=8225. Please raise 'ulimit-n' to 8225 or more to avoid any trouble.
[NOTICE] 223/173344 (46) : New worker #1 (47) forked
[WARNING] 223/173344 (47) : Server sm2_backend/cr-sm21 is DOWN, reason: Socket error, check duration: 0ms. 3 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 223/173345 (47) : Server sm2_backend/cr-sm2 is DOWN, reason: Socket error, check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 223/173345 (47) : Server sm2_backend/cr-sm23 is DOWN, reason: Socket error, check duration: 0ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 223/173346 (47) : Server sm2_backend/cr-sm24 is DOWN, reason: Socket error, check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 223/173346 (47) : backend 'sm2_backend' has no server available!

Here is the config:

/usr/local/etc/haproxy # cat haproxy.cfg 
global
  daemon
  log /dev/log    local0
  log /dev/log    local1 notice
  maxconn 4096
  tune.ssl.default-dh-param 2048

defaults
  log               global
  retries           3
  maxconn           2000
  timeout connect   5s
  timeout client    50s
  timeout server    50s

resolvers awsvpc
  nameserver vpc 10.0.0.2:53

listen stats
  bind 0.0.0.0:9090
  balance
  mode http
  stats uri /
  stats enable
  stats auth admin:admin

frontend http_in
  bind *:80
  mode http
  default_backend sm2_backend

backend sm2_backend
  mode http
  #balance hdr(X-User-ID)
  option httpchk GET / HTTP/1.1
  http-check expect status 404
  server-template srv 4 _ostest._tcp.ostest-sm2.awsvpc-private check resolvers awsvpc resolve-opts allow-dup-ip

Dig results:
/usr/local/etc/haproxy # dig srv _ostest._tcp.ostest-sm2.awsvpc-private @10.0.0.2

; <<>> DiG 9.14.3 <<>> srv _ostest._tcp.ostest-sm2.awsvpc-private
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26799
;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_ostest._tcp.ostest-sm2.awsvpc-private. IN SRV

;; ANSWER SECTION:
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 36166 77115130-2b78-404b-83d2-902e0fa6a90b._ostest._tcp.ostest-sm2.awsvpc-private.
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 35929 f560ab63-596b-4957-a04c-9723e3429f5c._ostest._tcp.ostest-sm2.awsvpc-private.
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 36033 817750e0-186a-4df5-9355-dd5c1a89445d._ostest._tcp.ostest-sm2.awsvpc-private.
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 36334 fa2408b8-e5f6-439d-b3a6-23069f6965b8._ostest._tcp.ostest-sm2.awsvpc-private.
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 36166 3176554e-a119-4582-9adf-db934a24739b._ostest._tcp.ostest-sm2.awsvpc-private.
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 36079 c69c5da8-37cc-4d3c-b77d-377f86faa8f8._ostest._tcp.ostest-sm2.awsvpc-private.
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 35981 572a7129-9797-4342-8280-e96a1e34c0a9._ostest._tcp.ostest-sm2.awsvpc-private.
_ostest._tcp.ostest-sm2.awsvpc-private. 5 IN SRV 1 1 34334 e1114679-6e68-4397-af68-1d50eba559c4._ostest._tcp.ostest-sm2.awsvpc-private.

;; Query time: 2 msec
;; SERVER: 10.0.0.2#53(10.0.0.2)
;; WHEN: Mon Aug 12 17:20:56 UTC 2019
;; MSG SIZE  rcvd: 906

CURL to service endpoint:

/usr/local/etc/haproxy # curl 77115130-2b78-404b-83d2-902e0fa6a90b._ostest._tcp.ostest-sm2.awsvpc-private:36166 -I
HTTP/1.1 404 NOT FOUND
Content-Type: application/problem+json
Content-Length: 206
X-App-Version: None
Connection: Keep-Alive

netcat from haproxy to upstream target:

/usr/local/etc/haproxy # nc -vz 77115130-2b78-404b-83d2-902e0fa6a90b._ostest._tcp.ostest-sm.awsvpc-private 36166
77115130-2b78-404b-83d2-902e0fa6a90b._ostest._tcp.ostest-sm2.awsvpc-private (10.0.0.84:36166) open

You most likely need to raise accepted_payload_size. The default is 512 and your response seems to be at least 906 bytes, so that’s why DNS resolution cannot work.
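
As a rough sketch, the resolvers section from your config could be extended like this. The value 8192 is my assumption (it is the largest value HAProxy accepts for this setting); anything comfortably above the 906 bytes that dig reported should work.

```haproxy
resolvers awsvpc
  nameserver vpc 10.0.0.2:53
  # Announce a larger DNS payload size than the 512-byte default so the
  # full SRV answer (906 bytes in the dig output) is accepted.
  # 8192 is an assumed value, not taken from the original config.
  accepted_payload_size 8192
```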

However, you should also fix the startup warnings. If you configure maxconn 4096, also start haproxy as root so it can raise ulimit, or raise ulimit yourself. Otherwise, lower maxconn so that it isn’t needed.
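
A minimal sketch of the "lower maxconn" option, assuming the 4096 FD limit stays as it is. The warning reports maxsock=8225 for maxconn=4096, which works out to two sockets per connection plus 33 for this particular config, so 2000 should land around 4033. Both the value 2000 and that overhead figure are estimates; check the startup output after changing it.

```haproxy
global
  # Assumed value: roughly (4096 - 33) / 2, rounded down, so that the
  # computed maxsock stays below the 4096 FD limit from the warning.
  maxconn 2000
```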

Additionally, logging appears to fail, so that is something you also need to look at (in Docker you probably want to configure stdout/stderr logging; 1.9 and later support this, check the log directive documentation).
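
For illustration, logging to stdout could look roughly like this in the global section. The `format raw` and `local0` choices are assumptions on my side; pick whatever format and facility suit your log collector.

```haproxy
global
  # Write logs to stdout instead of the /dev/log socket, which the
  # "No such file or directory" alert suggests is missing in the container.
  log stdout format raw local0
```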


Thanks for the feedback! Actually, if I remove that log line, it logs to stdout and shows up in CloudWatch. I only added it for the troubleshooting period in an attempt to see a local file, but that didn’t seem to work.

The DNS packet size did the trick. Many thanks!
