I am trying to implement cookie-based session stickiness with HAProxy inside a K8S cluster. I am using the 2.0.2-alpine image. This is the backend configuration:
    backend dummy-api
        mode http
        option log-health-checks
        option httpchk GET /isalive
        dynamic-cookie-key XXXXX
        cookie SESSION_COOKIE rewrite nocache dynamic
        balance roundrobin
        option httpclose
        server-template srv-ns 8 _http-api-port._tcp.dummy-api-service.default.svc.cluster.local resolvers k8s check check inter 10s downinter 20s fastinter 5s resolve-opts allow-dup-ip
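For reference, the `resolvers k8s` keyword on the server-template line points to a separate `resolvers` section that is not shown here. A minimal sketch of what such a section might look like follows. The nameserver name and address and every timing value are assumptions for illustration only, not the settings actually in use.

```
# Hypothetical resolvers section; every value below is an assumption.
resolvers k8s
    # Address of the cluster DNS service (placeholder, varies per cluster)
    nameserver dns1 10.96.0.10:53
    # Allow larger UDP answers so a full SRV response fits in one packet
    accepted_payload_size 8192
    resolve_retries 3
    timeout resolve 1s
    timeout retry   1s
    # How long a valid answer is trusted before it is resolved again
    hold valid    10s
    # How long a record missing from the latest answer is kept before removal
    hold obsolete 30s
```

The `hold` and `accepted_payload_size` values are the ones most likely to matter with SRV records, since they decide how much of the answer is received and how quickly a server loses its address when a record is missing from one response.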
I am observing really odd behavior from HAProxy: it constantly reevaluates the state of this backend and brings the servers up and down every couple of seconds. No server stays up for more than 1-2 minutes. On top of that, I have 8 pods (all alive and well), but HAProxy sees only 5 or 6 of them. I see the following pattern in the debug logs:
    srv-ns3 changed its FQDN from (null) to api-4.dummy-api-service.default.svc.cluster.local by 'SRV record'
    srv-ns4 changed its FQDN from (null) to api-4.dummy-api-service.default.svc.cluster.local by 'SRV record'
    srv-ns3 is going DOWN for maintenance (No IP for server ). 5 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
    ....
    srv-ns3 changed its IP from 10.1.1.108 to 10.1.1.109 by DNS cache.
    Server dummy-api-service-nosession/srv-ns3 ('api-4.dummy-api-service.default.svc.cluster.local') is UP/READY (resolves again).
    Server dummy-api-service-nosession/srv-ns3 administratively READY thanks to valid DNS answer.
    dummy-api-service-nosession/srv-ns3 changed its IP from 10.1.1.108 to 10.1.1.109 by DNS cache.
    ...
I can assure you that the pods themselves are alive and well. In fact, I have tried an alternative configuration using just 8 “server” lines: all 8 are green 100% of the time and never go down.
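To illustrate the comparison, a static alternative of that kind could look roughly like the sketch below. The backend name, the server names, the per-pod hostnames `api-0` through `api-7` (extrapolated from the `api-4` name in the log) and port 8080 are all assumptions, not the exact lines that were tested.

```
# Hypothetical static variant; names, hostnames and port are assumptions.
backend dummy-api-static
    mode http
    option httpchk GET /isalive
    dynamic-cookie-key XXXXX
    cookie SESSION_COOKIE rewrite nocache dynamic
    balance roundrobin
    # One explicit server line per pod instead of a server-template
    server api-0 api-0.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
    server api-1 api-1.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
    server api-2 api-2.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
    server api-3 api-3.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
    server api-4 api-4.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
    server api-5 api-5.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
    server api-6 api-6.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
    server api-7 api-7.dummy-api-service.default.svc.cluster.local:8080 check inter 10s downinter 20s fastinter 5s
```

The only structural difference from the failing setup is that each server is bound to one fixed hostname rather than being filled in from the SRV answer, which is why the contrast in behavior points at SRV handling rather than at the pods or the health check.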
Something odd is going on here. I have noticed that the order of the SRV records returned in K8S constantly changes, but this is expected: the order is not guaranteed in DNS anyway.
P.S. I tried the same configuration with version 1.9 and got the same result. When using server-template, the list of available servers constantly changes, the servers go up and down, and some of them never reach the UP state.