HAProxy fails to start if backend server names don't resolve


#1

I’m trying to use HAProxy 1.6.5 with Docker 1.11.1 to proxy to MySQL servers. The config file has the following:

resolvers docker
  nameserver dns 127.0.0.11:53

listen mysql-global
  bind :3306
  server db-global db-global:3306 check resolvers docker resolve-prefer ipv4

When I start HAProxy before db-global is running (so its name does not resolve yet), HAProxy exits with this error:

[ALERT] 148/192216 (225) : parsing [/etc/haproxy.cfg:42] : 'server db-global' : invalid address: 'db-global' in 'db-global:3306'

How do I configure HAProxy to not error out, but to just keep checking the docker internal DNS until ‘db-global’ resolves to an IP and then use that IP to do the check to see if the backend server is UP?

I want to avoid having to start all my containers in some strict order to bring the app up with Docker. My app’s dependencies are statically defined so I don’t want to bother starting HAProxy with an empty config and then reloading a config after the dependencies are UP and running.

Or, is reloading the config my only option?

It would be easier if DNS names that don’t resolve simply mark that server DOWN and keep checking until it does resolve and the check indicates that the server is UP.


#2

While I don’t have an answer to your real question right now, be advised that DNS resolution is pretty much broken in HAProxy 1.6.5. Either use 1.6.4 or something more recent (a current git or tar snapshot containing the fix “BUG/MEDIUM: dns: unbreak DNS resolver after header fix”).


#3

I tried 1.6.4, 1.6.3, 1.6.2, 1.6.1 (out of the Docker Hub) and no version supports DNS names that don’t resolve on haproxy config parsing.

Without a fix for this, it is much harder to use haproxy in production with a static configuration file where the proxied containers might not have been started yet when haproxy launches.

Are the maintainers of HAProxy aware of this problem? It doesn’t seem to be the same issue as you pointed out with the 1.6.5 regression.


#4

Not sure if this is expected behavior, @Baptiste?


#5

Any answer to this? Is this expected behavior? I’ll need to figure out a work-around, if so. Hope it is just a bug that will be resolved shortly…


#6

Just tried 1.6.6 and no change in behavior… HAProxy still errors out if a DNS name doesn’t resolve when parsing the config file. This means I need to control the startup order of my containers in Docker to make sure that all the proxied servers have started before starting HAProxy.

Here is the parse error I get with 1.6.6:

$ haproxy -f /etc/haproxy.cfg

[ALERT] 182/230151 (323) : parsing [/etc/haproxy.cfg:48] : 'server proxy-sql3' : invalid address: 'proxy-sql3' in 'proxy-sql3:3306'

[ALERT] 182/230151 (323) : Error(s) found in configuration file : /etc/haproxy.cfg
[ALERT] 182/230151 (323) : Fatal errors found in configuration.


#7

Hi,

This is expected behavior for now.
We’ll improve this soon by adding an “init-addr” feature that will allow forcing an IP address at start-up. More about it in August.


#8

Okay. Not really sure why this is “expected behavior” when the backend servers have status checks that will mark the server as down if it fails a check. If the DNS name lookup fails, I would think the server should just be marked down until the DNS name lookup succeeds and the subsequent check succeeds.

Anyway, hopefully your “init-addr” feature will allow specifying an invalid IP so the checks will all fail until the DNS name lookup succeeds and a real set of IP addresses is returned. Seems like a real kludge to me and not a long term solution. But, maybe I don’t understand the proposed feature. In Docker, these DNS names are “service” names and the “service” may have many IPs that are load balanced by Docker’s internal DNS resolver. So, the set of IPs that are returned on a lookup may be empty if all backend servers have failed and no longer exist. In this case, I hope HAProxy checks still consider the backend server to be down and keep checking until the service comes back online. Seems to me the fatal error about the name not resolving should be changed to log a warning, with an invalid IP assigned to the server internally rather than exposed to the user in the form of an “init-addr” feature.

I’ve worked around this in my Docker containers that use HAProxy, for now, by checking that the config file parses before starting HAProxy. I simply loop in a script for 5 minutes checking the config file every 10 seconds until it parses. This is less than ideal, but it might solve my problem for HA deployment of HAProxy (at least for proxy startup).


#9

I have the exact same problem. I want to use HAProxy to do routing by URL-path into my microservices running in a Docker 1.12.x swarm.

So www.example.com/service1/* will route to one Docker swarm service and www.example.com/service2/* will route to another service.

I will potentially have a lot of different microservices, and of course it is not acceptable that I can’t start HAProxy if one of these many services is down.

I think the best solution is simply to add a configuration option that makes it possible to disable the check for an invalid DNS lookup. At runtime HAProxy could then return e.g. a 500 if it is unable to route to the unavailable service.

Why is this not an option? It should of course validate the address by default.


#10

It looks like nginx can do what I need:
https://gist.github.com/soheilhy/8b94347ff8336d971ad0

I will try to set it up and use nginx instead of HAProxy


#11

Hi Peer,

It’s your choice and as long as it matches your needs, I’m happy for you…
You’ll hit other limits with nginx, but I’ll let you discover them :wink:

Note that this feature should already have been available in HAProxy (dev branch)
by now; I have to take a day off to find some time to develop it.

Baptiste


#12

Hi Baptiste,

We just ran into the same problem: the DNS name cannot be resolved at HAProxy startup, so parsing the config file fails with that invalid address error. (Also in a Docker setup, like the one “bech” mentioned above.)

We would need the feature you described in your previous post. I had a look at the dev branch of the HAProxy Git repository, but unfortunately I was not able to find the commit that includes that specific feature.

Could you point me to the commit(s)? How long do you think it will take for this feature to be in an HAProxy release?

Kind regards,
bitmover


#13

Hi,

Willy is about to push it today or tomorrow.
Look for init-addr.

Baptiste


#14

Actually, it’s already pushed!

You can enjoy it.

Baptiste


#15

Is this perhaps the same issue I have? I use AWS OpsWorks with HAProxy as a load balancer.

In HAProxy 1.7 I have this:
backend api
  description test api
  balance roundrobin
  cookie JSESSIONID prefix nocache
  option forwardfor
  option httpchk get /api/ping
  http-check expect status 200
  http-response add-header x-backend %s
  http-response set-header Access-Control-Allow-Origin "*"
  server api1 v-api-s1.xxxx.aws:8080 check cookie s1
  server api2 v-api-s2.xxxx.aws:8080 check cookie s2
  server api3 v-api-s3.xxxx.aws:8080 check cookie s3
  server api4 v-api-s4.xxxx.aws:8080 check cookie s4
  server api5 v-api-s5.xxxx.aws:8080 check cookie s5

We have 50 servers in total in that list, and OpsWorks scales up and down at random times.
It is so automated that when it shuts down a server it also removes it from the DNS (Route 53), and adds it back on startup.

I keep getting the error:
[ALERT] 336/141922 (1565) : parsing [/etc/haproxy1.7/haproxy.cfg:58] : 'server api5' : could not resolve address 'v-api-s5.xxxx.aws'.
[ALERT] 336/141922 (1565) : Failed to initialize server(s) addr.

How can I resolve this situation? Shouldn’t HAProxy simply start, since I provide a health check, and rely only on that?


#16

Hi,

You need to start with an init-addr parameter set to none and enable
runtime DNS resolution.
https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#init-addr

Baptiste
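
As a minimal sketch of that advice, applied to the configuration from the first post: this assumes HAProxy 1.7 or later, and the only change to the original lines is the added init-addr none on the server line. The section and server names are the ones the original poster used. It is an illustration, not a tested production config.

```haproxy
resolvers docker
  nameserver dns 127.0.0.11:53

listen mysql-global
  bind :3306
  # init-addr none: do not try to resolve the name at startup, so an
  # unresolvable name no longer aborts config parsing. The server starts
  # without an address and the "docker" resolvers section supplies one at
  # runtime once the name resolves.
  server db-global db-global:3306 check resolvers docker resolve-prefer ipv4 init-addr none
```

With this in place the server should simply be reported as down until db-global appears in DNS and passes its check.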


#17

Thank you so much, that is what I was missing!

I added:
resolvers binddns
  nameserver dns1 x.x.x.x:53
  nameserver dns2 x.x.x.x:53
  resolve_retries 3
  timeout retry 1s
  hold other 30s
  hold refused 30s
  hold nx 30s
  hold timeout 30s
  hold valid 10s

global
defaults
  default-server init-addr none

and that resolved the problem. It is all working very well now!
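
The fragment in this post does not show how the servers are tied to the resolvers section. Here is a minimal sketch of one way to do it, assuming each server line carries the resolvers keyword pointing at binddns. The thread does not confirm that this is the exact configuration used. The two server lines are copied from the earlier backend with only that keyword added.

```haproxy
defaults
  # Do not fail at startup when a name does not resolve;
  # the server starts without an address instead.
  default-server init-addr none

backend api
  # Assumption: "resolvers binddns" is set on each server line so that
  # runtime DNS resolution can fill in or update the address later.
  server api1 v-api-s1.xxxx.aws:8080 check cookie s1 resolvers binddns
  server api2 v-api-s2.xxxx.aws:8080 check cookie s2 resolvers binddns
```

Without a resolvers reference on the server, init-addr none alone would leave the server with no address and nothing to update it at runtime.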


#18

Which release in the 1.6.x line contains the “init-addr” feature?
Which release in the 1.7.x line contains the “init-addr” feature?
Thanks


#19

All 1.7 releases support init-addr.

It’s not supported in 1.6 though.


#20

I tried this in 1.7.9:

default-server init-addr libc,none

While I definitely see a change in behavior, HAProxy still fails to connect when the server it is trying to reach comes up. Is it because it is “disabling server”?

[WARNING] 304/124345 (8) : parsing [/usr/local/etc/haproxy/haproxy.cfg:39] : 'server q1' : could not resolve address 'gporev-queue1', disabling server.
gporev_gporev-queue.1.9umcjoy7k86e@c3cueat2 | [WARNING] 304/124345 (8) : parsing [/usr/local/etc/haproxy/haproxy.cfg:40] : 'server q2' : could not resolve address 'gporev-queue2', disabling server.
gporev_gporev-queue.1.9umcjoy7k86e@c3cueat2 | [WARNING] 304/124345 (8) : parsing [/usr/local/etc/haproxy/haproxy.cfg:41] : 'server q3' : could not resolve address 'gporev-queue3', disabling server.
gporev_gporev-queue.1.9umcjoy7k86e@c3cueat2 | Nov 1 12:43:45 921a578ad78f local0.notice 921a578ad78f haproxy[8]: Proxy stats started.
gporev_gporev-queue.1.9umcjoy7k86e@c3cueat2 | Nov 1 12:43:45 921a578ad78f local0.notice 921a578ad78f haproxy[8]: Proxy gporev-queue started.
gporev_gporev-queue.1.9umcjoy7k86e@c3cueat2 | Nov 1 12:44:54 921a578ad78f local0.info 921a578ad78f haproxy[9]: 10.0.2.27:52202 [01/Nov/2017:12:44:54.922] gporev-queue gporev-queue/ -1/-1/0 0 SC 0/0/0/0/0 0/0
gporev_gporev-queue.1.9umcjoy7k86e@c3cueat2 | Nov 1 12:44:54 921a578ad78f local0.info 921a578ad78f haproxy[9]: 10.0.2.27:52206 [01/Nov/2017:12:44:54.953] gporev-queue gporev-queue/ -1/-1/0 0 SC 0/0/0/0/0 0/0