Balance issue: redis failover. How to stay on the master with the longest uptime?

Hi,

I am using redis with one master and one slave. I want to use haproxy to do the failover.
I must only forward data to the current master.

This is my current config:

frontend ft_redis
    bind 0.0.0.0:16380 name redis ssl crt /etc/haproxy/certs/loadbalancer_all_in_one.pem
    default_backend bk_redis

# Specifies the backend with the Redis TCP health check settings.
# Ensure incoming connections are only forwarded to the server that currently reports role:master.
backend bk_redis
    option tcp-check
    tcp-check connect
    tcp-check send AUTH\ qay\r\n
    tcp-check expect string +OK
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    tcp-check send QUIT\r\n
    tcp-check expect string +OK

    server elastic01.internal.dtpublic.de_7000 localhost4:7000 check inter 2s
    server elastic02.internal.dtpublic.de_7001 localhost4:7001 check inter 2s

So only the server that reports itself as master should get the traffic.

I have the following issue:

  • server1 is master, server2 is slave
  • stop redis on server 1
  • server1 is unavailable
  • both servers are marked red in haproxy
  • sentinel is doing the failover
  • server 2 is master
  • haproxy shows green for server2.
  • starting redis on server1
  • server1 is starting as master (because it was master before the crash / shutdown)
  • haproxy now shows both servers as up, because both report role:master
  • when I send requests through haproxy, they are distributed across both servers, apparently round robin.
  • after a few seconds sentinel demotes server1 to slave
  • server1 is shown as red again.
  • server1 resynchronizes its data from server2
  • all writes that went to server1 during the time with two masters are lost.

I have searched a lot, but I did not find a way to improve this behavior (i.e. to have haproxy always ask sentinel who the current master is).

Is there a way to tell haproxy to forward traffic only to the server with the longest uptime? That would work around this behavior.

Thanks, Andreas

No, that’s not possible. Even if you could get an uptime indication from the backend servers, vanilla haproxy health checks cannot compare protocol-level data returned by different servers’ health checks.

You could write an external health check and implement this logic on your own (see the external-check documentation).
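As a rough, untested sketch of how that could look (everything here is an assumption on my part: the sentinel address, the master name "mymaster", the script path, and the use of the redis-py package), the backend would switch from tcp-check to an external check:

global
    # depending on your haproxy version, external checks must first be allowed globally
    external-check

backend bk_redis
    option external-check
    external-check command /usr/local/bin/check_redis_master.py
    server elastic01.internal.dtpublic.de_7000 localhost4:7000 check inter 2s
    server elastic02.internal.dtpublic.de_7001 localhost4:7001 check inter 2s

The script would then ask sentinel which node is the current master and only report the checked server as healthy if it matches:

#!/usr/bin/env python3
# /usr/local/bin/check_redis_master.py (hypothetical path)
# haproxy invokes external checks as:
#   <command> <proxy_addr> <proxy_port> <server_addr> <server_port>
# exit 0 -> this server is the master announced by sentinel, anything else -> down
import socket
import sys

from redis.sentinel import Sentinel  # assumption: redis-py is installed on the LB

SENTINELS = [("127.0.0.1", 26379)]   # assumption: your sentinel endpoint(s)
MASTER_NAME = "mymaster"             # assumption: your sentinel master name


def main() -> int:
    server_addr, server_port = sys.argv[3], int(sys.argv[4])
    try:
        # ask sentinel, not the node itself, who the current master is
        master_ip, master_port = Sentinel(SENTINELS, socket_timeout=1).discover_master(MASTER_NAME)
        # resolve the configured server name so entries like "localhost4"
        # can be compared with the IP sentinel reports
        server_ip = socket.gethostbyname(server_addr)
    except Exception:
        return 2  # sentinel unreachable or no known master: mark the server down
    return 0 if (server_ip == master_ip and server_port == int(master_port)) else 1


if __name__ == "__main__":
    sys.exit(main())

Keep in mind that haproxy runs external checks in a restricted environment, so the script should be self-contained, executable and referenced by an absolute path. It also makes sentinel a dependency of the load balancer's health checking.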

But I think that would be very wrong. I don’t know anything about redis, but I assume you are not the first one to run a redis cluster, and I do not believe for one second that this is a problem that needs to be solved externally on a load balancer. There must be a better way to solve this at the redis level itself.
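For example (treat this as a pointer to verify rather than a tested recommendation, since I have not run redis myself): the redis replication documentation describes a min-replicas-to-write setting that makes a master refuse writes unless it has at least that many connected replicas. With something like the following in redis.conf on both nodes, a restarted server1 that still believes it is master would reject writes until a replica attaches again, which should close exactly the window in which you are losing data:

# redis.conf on both nodes (Redis 5+ naming; older versions call these min-slaves-*)
min-replicas-to-write 1
min-replicas-max-lag 10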