Server-template and randomized DNS responses


#1

Hi!

I use Consul as a service discovery tool and HAProxy as a load balancer.

A service registered in Consul runs on a number of servers. It can be scaled by adding and removing nodes and by moving nodes from one server to another.

Consul has a DNS service that randomizes the order of the records in its responses for a service, like this:

[bux] michep@bux:~$ dig +short mfm-monitor-opentsdb.service.mfmconsul
10.182.161.239
10.182.161.152
10.182.161.240
10.182.161.92
[bux] michep@bux:~$ dig +short mfm-monitor-opentsdb.service.mfmconsul
10.182.161.92
10.182.161.152
10.182.161.240
10.182.161.239
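
To make the point explicit: the two answers above contain the same four addresses and differ only in order. A minimal Python sketch of that check (the variable names are only illustrative):

```python
# The two answers copied from the dig output above.
first = ["10.182.161.239", "10.182.161.152", "10.182.161.240", "10.182.161.92"]
second = ["10.182.161.92", "10.182.161.152", "10.182.161.240", "10.182.161.239"]

same_members = set(first) == set(second)  # same servers in both answers
same_order = first == second              # but returned in a different order

print("same members:", same_members)
print("same order:", same_order)
```

So the set of backends is stable between queries; only the ordering of the records changes.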

In HAProxy 1.8.3 I'm using a server-template configuration like this:

resolvers dns
  nameserver dns1 ${HAPROXY_NAMESERVER}
  hold valid 2s

backend tsdb_backend_query
  server-template tsdb_query 5 mfm-monitor-opentsdb.service.mfmconsul:4242 check resolvers dns inter 1000

In that case I get a lot of warnings in the HAProxy log:

time="2018-02-02T15:44:32+03:00" level=info msg="[WARNING] 032/154432 (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.240 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:44:42+03:00" level=info msg="[WARNING] 032/154442 (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.239 to 10.182.161.240 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:44:46+03:00" level=info msg="[WARNING] 032/154446 (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.152 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:44:50+03:00" level=info msg="[WARNING] 032/154450 (32983) : tsdb_backend_query/tsdb_query2 changed its IP from 10.182.161.92 to 10.182.161.152 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:44:52+03:00" level=info msg="[WARNING] 032/154452 (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.239 to 10.182.161.92 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:44:56+03:00" level=info msg="[WARNING] 032/154456 (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.240 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:45:00+03:00" level=info msg="[WARNING] 032/154500 (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.92 to 10.182.161.240 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:45:02+03:00" level=info msg="[WARNING] 032/154502 (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.240 to 10.182.161.92 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:45:04+03:00" level=info msg="[WARNING] 032/154504 (32983) : tsdb_backend_query/tsdb_query2 changed its IP from 10.182.161.152 to 10.182.161.240 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:45:06+03:00" level=info msg="[WARNING] 032/154506 (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.239 to 10.182.161.152 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:45:10+03:00" level=info msg="[WARNING] 032/154510 (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.92 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:45:18+03:00" level=info msg="[WARNING] 032/154518 (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.239 to 10.182.161.92 by DNS cache." job=mfm-monitor-haproxy pid=32983 
time="2018-02-02T15:45:20+03:00" level=info msg="[WARNING] 032/154520 (32983) : tsdb_backend_query/tsdb_query2 changed its IP from 10.182.161.240 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy pid=32983 
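
To quantify the flapping, the warnings can be summarized per server slot. The following is a rough Python sketch, not an official tool: the regular expression is written against the message format shown above, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the "changed its IP" warning emitted for a server slot.
CHANGE_RE = re.compile(
    r"(?P<backend>[\w.-]+)/(?P<server>[\w.-]+) changed its IP "
    r"from (?P<old>[\d.]+) to (?P<new>[\d.]+) by DNS cache"
)

def summarize(lines):
    """Return (number of IP changes per server slot, set of addresses seen)."""
    changes = Counter()
    addresses = set()
    for line in lines:
        m = CHANGE_RE.search(line)
        if m:
            changes[m["server"]] += 1
            addresses.update((m["old"], m["new"]))
    return changes, addresses

# Three messages copied from the log above (timestamp prefix omitted).
sample = [
    "tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.240 to 10.182.161.239 by DNS cache.",
    "tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.152 to 10.182.161.239 by DNS cache.",
    "tsdb_backend_query/tsdb_query2 changed its IP from 10.182.161.92 to 10.182.161.152 by DNS cache.",
]
changes, addresses = summarize(sample)
print(dict(changes))
print(sorted(addresses))
```

In the full log above, every address that appears is one of the four returned by dig, so the slots are swapping addresses among the same set of servers rather than picking up new ones.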

This doesn’t really break the service, but I don’t think it is quite normal.

Any advice on how to resolve this issue?


#2

Hi

You’re not using SRV records and that may be the root cause of your issue.
Please try something like this:

backend tsdb_backend_query
  server-template tsdb_query 5 _mfm-monitor-opentsdb._tcp.service.mfmconsul:4242 check resolvers dns inter 1000

if “mfm-monitor-opentsdb” is your service name in Consul.
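
The name in the suggestion follows the RFC 2782 style that Consul accepts for SRV lookups: underscore-prefixed service and protocol labels, followed by "service" and the Consul domain. A small Python sketch of how that name is put together; `srv_name` is a hypothetical helper, and the domain "mfmconsul" is taken from the configuration earlier in the thread:

```python
def srv_name(service: str, domain: str, protocol: str = "tcp") -> str:
    """Build an RFC 2782 style SRV name like the one in the suggestion above."""
    return f"_{service}._{protocol}.service.{domain}"

name = srv_name("mfm-monitor-opentsdb", "mfmconsul")
print(name)
```

Querying that name with dig and the SRV record type should show whether Consul answers with SRV records for the service.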


#3

Hi

I’ve changed the configuration as you suggested:

backend tsdb_backend_query
  server-template tsdb_query 5 _mfm-monitor-opentsdb._tcp.service.mfmconsul:4242 check resolvers dns inter 1000

The logs are somewhat different: backend servers now go DOWN and UP, but the behavior seems the same, with IP addresses changing in the same way:

time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 (18208) : Server tsdb_backend_query/tsdb_query1 is going DOWN for maintenance (No IP for server ). 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 (18208) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.223 to 10.182.161.211 by DNS cache." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 (18208) : Server tsdb_backend_query/tsdb_query1 administratively READY thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 (18208) : Server tsdb_backend_query/tsdb_query1 ('0ab6a1d3.addr.dc1.mfmconsul') is UP/READY (resolves again)." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 (18208) : Server tsdb_backend_query/tsdb_query3 is going DOWN for maintenance (No IP for server ). 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 (18208) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.98 to 10.182.161.223 by DNS cache." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 (18208) : Server tsdb_backend_query/tsdb_query3 administratively READY thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 (18208) : Server tsdb_backend_query/tsdb_query3 ('0ab6a1df.addr.dc1.mfmconsul') is UP/READY (resolves again)." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 (18208) : Server tsdb_backend_query/tsdb_query3 is going DOWN for maintenance (No IP for server ). 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 (18208) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.223 to 10.182.161.98 by DNS cache." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 (18208) : Server tsdb_backend_query/tsdb_query3 administratively READY thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 (18208) : Server tsdb_backend_query/tsdb_query3 ('0ab6a162.addr.dc1.mfmconsul') is UP/READY (resolves again)." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 (18208) : Server tsdb_backend_query/tsdb_query1 is going DOWN for maintenance (No IP for server ). 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 (18208) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.211 to 10.182.161.223 by DNS cache." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 (18208) : Server tsdb_backend_query/tsdb_query1 administratively READY thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 (18208) : Server tsdb_backend_query/tsdb_query1 ('0ab6a1df.addr.dc1.mfmconsul') is UP/READY (resolves again)." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:13:05+03:00" level=info msg="[WARNING] 038/021305 (18208) : Server tsdb_backend_query/tsdb_query2 is going DOWN for maintenance (No IP for server ). 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy pid=18208
time="2018-02-08T02:13:05+03:00" level=info msg="[WARNING] 038/021305 (18208) : tsdb_backend_query/tsdb_query2 changed its IP from 10.182.161.163 to 10.182.161.211 by DNS cache." job=mfm-monitor-haproxy pid=18208
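
A note on reading these lines: the quoted names in the UP/READY messages (for example '0ab6a1d3.addr.dc1.mfmconsul') appear to carry the node's IPv4 address as a hex label. That reading is an assumption drawn from the log itself, where each decoded value matches the address in the preceding "changed its IP ... to" line for the same server. A short Python sketch of the decoding; `decode_addr_label` is a hypothetical helper:

```python
import ipaddress

def decode_addr_label(hostname: str) -> str:
    """Decode the leading hex label of '<hex>.addr.<dc>.<domain>' to an IPv4 address."""
    hex_label = hostname.split(".", 1)[0]
    # Four bytes of hex are interpreted as a packed IPv4 address.
    return str(ipaddress.IPv4Address(bytes.fromhex(hex_label)))

# Names copied from the UP/READY lines above.
hosts = (
    "0ab6a1d3.addr.dc1.mfmconsul",
    "0ab6a1df.addr.dc1.mfmconsul",
    "0ab6a162.addr.dc1.mfmconsul",
)
decoded = {host: decode_addr_label(host) for host in hosts}
for host, ip in decoded.items():
    print(host, "->", ip)
```

This makes it easier to see that each server slot keeps being re-pointed at a different target of the same service.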

Any thoughts?