Trouble with DNS Resolvers sticking to single IP address

If anyone wants to try out the 1.8-dev2 DNS resolvers, I have set up a Docker demo:

You would probably need to reconfigure the hard-coded IP addresses; I will get around to making it env-based soon.
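As an aside, HAProxy can expand environment variables inside double-quoted configuration strings, so a hard-coded address could become something like the line below (APP_ADDR is a made-up variable name):

server s1 "${APP_ADDR}:80" check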

Asking @willy to release 1.8-dev3, so that all those changes can be tested more easily in the field.

Asking myself the same question.

From the Docker Docs:

To bypass the routing mesh, you can start a service using DNS Round Robin (DNSRR) mode, by setting the --endpoint-mode flag to dnsrr. You must run your own load balancer in front of the service. A DNS query for the service name on the Docker host returns a list of IP addresses for the nodes running the service. Configure your load balancer to consume this list and balance the traffic across the nodes.
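For reference, a minimal sketch of starting such a service (the network, service name, and image are placeholders):

# assumes swarm mode is already active (docker swarm init)
docker network create --driver overlay mynet
docker service create --name myapp --network mynet \
  --endpoint-mode dnsrr --replicas 3 nginx

A DNS query for “myapp” from any container attached to mynet then returns one A record per task.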

How do I actually do that with HAProxy?

Well, either you set multiple server lines with the same hostname:

backend myapp
  […]
  server s1 myapp.domain.com:80 check resolvers mydns
  server s2 myapp.domain.com:80 check resolvers mydns
  server s3 myapp.domain.com:80 check resolvers mydns

Or you use the server-template directive:

backend myapp
  […]
  server-template s 3 myapp.domain.com:80 check resolvers mydns

Adjust the number of servers to your needs.

In each case, the resolver should enable one server per IP address found in the DNS response.
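Both snippets assume a resolvers section named “mydns” defined elsewhere in the configuration. A minimal sketch, pointing at Docker’s embedded DNS server on 127.0.0.11 (the timings are illustrative; adjust to your environment):

resolvers mydns
  nameserver docker 127.0.0.11:53
  resolve_retries 3
  timeout resolve 1s
  timeout retry   1s
  hold valid     10s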

Baptiste, I tried using multiple server lines, but the requests are not balanced evenly across the servers.

I built an example using Docker and docker-compose. The backend servers count each request they receive and print the number of requests received after 60 seconds.
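For illustration only (this is not the linked code), the load-generation side could be a shell loop like this, assuming the proxy listens on localhost:8080:

#!/bin/bash
# hit the proxy for 60 seconds
end=$((SECONDS + 60))
while [ "$SECONDS" -lt "$end" ]; do
  curl -s http://localhost:8080/ > /dev/null
done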

Code:

The output is:

api_4      | f8e1414ea551 0
api_1      | 860eb040651c 0
api_2      | 6af96d901ea8 179
api_5      | 1f15abd0d461 60
api_3      | 271ae04ff5cc 60

One API server receives 179 requests, two receive 60 each, and the other two receive none.

Is it possible to round-robin across servers that were resolved via DNS?

Did you start 5 instances of the ‘api’ service?

Yes, five instances of the API service, using:

docker-compose up --scale api=5

Hi,

You’re missing the “resolvers docker” statement on your server-template line.
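That is, the server-template line should end up looking something like this (the service name, port, and slot count here are illustrative, not taken from your compose file):

server-template api 5 api:8080 check resolvers docker init-addr none

where “docker” is a resolvers section pointing at Docker’s embedded DNS server (127.0.0.11:53).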

With this enabled, I have the following result:

docker-compose up --scale api=5
Starting debug_api_1 …
Starting debug_api_1 … done
Starting debug_api_2 … done
Starting debug_api_3 … done
Starting debug_api_4 … done
Starting debug_api_5 … done
Attaching to debug_haproxy_1, debug_api_1, debug_api_2, debug_api_3, debug_api_4, debug_api_5
api_3 | cd51492adab9 63
api_4 | f5532b40ea80 61
debug_api_3 exited with code 0
debug_api_4 exited with code 0
api_2 | a81602129221 61
api_1 | f8202d903d1b 61
debug_api_2 exited with code 0
debug_api_1 exited with code 0
api_5 | 02f56990bbd4 62
debug_api_5 exited with code 0

You are correct @Baptiste. Apologies!

Thank you so much @z0mb1ek and @baptiste for explaining this! I never would have guessed I needed multiple server lines!

I’ve spent all weekend trying to figure out why HAProxy wouldn’t load-balance my Docker service, even though simple curls show it cycling among the different IPs that my service name resolves to.

In my case I don’t know how large my service will be scaled. How do I know how many duplicate server lines to use? Is there any downside to using whatever maximum I expect?

What is the upcoming better way to do this? Is there a GitHub issue I can follow or a blog post explaining it?

I’m currently implementing something similar on our infrastructure.

In our case, we have HAProxy pointing at an AWS ALB. Since the ALB’s IPs can change, we don’t want HAProxy to hold onto a stale IP forever.

The final configuration looks something like this:

resolvers default
  parse-resolv-conf
  timeout resolve 1m

backend be_gw
  mode http
  http-request set-header host aws-gw.contoso.com
  option httpchk GET /srv/status HTTP/1.1
  http-check send hdr "host" "aws-gw.contoso.com"

  default-server init-addr none resolvers default resolve-opts prevent-dup-ip check downinter 2s fastinter 2s inter 3m ssl ca-file /usr/local/etc/haproxy/ca-certificates.crt
  server-template gw4- 1-3 aws-gw.contoso.com:443 backup resolve-prefer ipv4
  server-template gw6- 1-3 aws-gw.contoso.com:443 resolve-prefer ipv6

Here are the options applied to these servers, and why:

  • init-addr none: the server starts off blank and then gets populated by the DNS lookup
  • resolvers default: use the “default” resolver group defined at the top of the file
  • ssl ca-file …: use TLS and validate with the specified CA file
  • resolve-opts prevent-dup-ip: one IP per server
  • backup: only use IPv4 as backup; this is the year of BOTH the Linux desktop and IPv6
  • resolve-prefer ipv[46]: use these addresses for these servers

The combination of these options gives us the following internal configuration:

echo 'show servers state be_gw' | socat STDIO UNIX-CONNECT:/tmp/haproxy.sock
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port srv_check_addr srv_agent_addr srv_agent_port
5 be_gw 1 gw4-1 192.0.2.53 2 0 1 1 5 1 0 0 0 0 0 0 aws-gw.contoso.com 443 - 1 0 - - 0
5 be_gw 2 gw4-2 198.51.100.42 2 0 1 1 5 1 0 0 0 0 0 0 aws-gw.contoso.com 443 - 1 0 - - 0
5 be_gw 3 gw4-3 203.0.113.60 2 0 1 1 5 1 0 0 0 0 0 0 aws-gw.contoso.com 443 - 1 0 - - 0
5 be_gw 4 gw6-1 2001:db8:0:1:bf56:d653:4d5f:5254 2 0 1 1 5 1 0 0 0 0 0 0 aws-gw.contoso.com 443 - 1 0 - - 0
5 be_gw 5 gw6-2 2001:db8:0:2:37ce:beb1:dbd1:78a6 2 0 1 1 5 1 0 0 0 0 0 0 aws-gw.contoso.com 443 - 1 0 - - 0
5 be_gw 6 gw6-3 2001:db8:0:3:92b5:98b9:a514:9f8e 2 0 1 1 5 1 0 0 0 0 0 0 aws-gw.contoso.com 443 - 1 0 - - 0

Note that we’re using two separate server-template lines for IPv4 and IPv6 addresses: with a single template, resolve-prefer (which defaults to ipv6) causes only IPv6 addresses to be used (tested on 2.5.7).
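To double-check what each template can pick up, you can query each address family for the name directly (assuming dig is available on the host):

dig +short A    aws-gw.contoso.com
dig +short AAAA aws-gw.contoso.com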