HAProxy HA to Redis: "server closed the connection"

I have an HAProxy instance acting as a frontend to Redis. Connections are established fine, but once a connection has sat idle for a minute or so, the next Redis command returns "server closed the connection". If I connect directly to Redis this does not happen. Could you suggest HAProxy configuration changes that would help?

Example redis-cli connection through HAProxy: redis-cli -p 7777
Example redis-cli connection directly to Redis: redis-cli -p 17777
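
What the failure looks like through HAProxy (roughly reconstructed; the key name is just an example):

redis-cli -p 7777
127.0.0.1:7777> set foo bar
OK
(idle for a minute or so)
127.0.0.1:7777> get foo
Error: Server closed the connection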

haproxy -vv
HA-Proxy version 1.8.24 2020/02/15
Copyright 2000-2020 Willy Tarreau willy@haproxy.org

Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-unused-label
OPTIONS = USE_LINUX_TPROXY=1 USE_CRYPT_H=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_SYSTEMD=1 USE_PCRE=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2k-fips 26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-fips 26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace

/etc/haproxy/haproxy.cfg

global
log 127.0.0.1 local1
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
daemon
nbproc 1
nbthread 4
cpu-map auto:1/1-4 0-3
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
ssl-default-bind-options no-sslv3
maxconn 40000

defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http

frontend http
bind :8080
default_backend stats

backend stats
mode http
stats enable
stats uri /
stats refresh 10s
stats show-legends
stats admin if TRUE

listen redis7777
bind *:7777
maxconn 40000
mode tcp
balance first
option tcplog
option tcp-check
timeout connect 10s
timeout client 1800s
tcp-check send PING\r\n
tcp-check expect string +PONG
tcp-check send info\ replication\r\n
tcp-check expect string role:master
tcp-check send QUIT\r\n
tcp-check expect string +OK
server REDIS1A 10.1.1.1:17777 maxconn 20000 check inter 2s
server REDIS1B 10.1.1.2:17777 maxconn 20000 check inter 2s
server REDIS1C 10.1.1.3:17777 maxconn 20000 check inter 2s
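
For reference, the probes from the tcp-check lines can be reproduced by hand against one of the backends (redis-cli strips the RESP "+" prefix, so PING shows PONG rather than +PONG):

redis-cli -p 17777 ping
PONG
redis-cli -p 17777 info replication
# Replication
role:master
...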

Hello,

I had the same issue. The explanation I found is the following:
The timeout client setting exists to detect a dead client application even when the client OS itself is still responsive: an application can hold a connection open without ever sending anything, and that is a problem because the number of available connections is not infinite (maxconn).

So basically you need to set timeout client to a value that is appropriate for your use case, and the issue should be solved.

Hope this helps.

Manu

I found out some more information on this.

From version 3.2 onwards, Redis has TCP keepalive (the SO_KEEPALIVE socket option) enabled by default, with an interval of 300 seconds.
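
You can confirm the value directly on a Redis instance (using the direct-connection port from my setup, and assuming the default has not been overridden):

redis-cli -p 17777 config get tcp-keepalive
1) "tcp-keepalive"
2) "300"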

So I am now setting both of these timeouts to 360 seconds in the redis7777 section (timeout server was previously only set in defaults, at 50000ms, which lines up with the disconnects after roughly a minute of idle time):
timeout client 360s
timeout server 360s

Now the problem no longer occurs, at least for clients whose keepalive interval stays under 300 seconds.
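
For completeness, the relevant part of the redis7777 section now looks roughly like this (everything else is unchanged from the config in my original post):

listen redis7777
bind *:7777
maxconn 40000
mode tcp
balance first
option tcplog
option tcp-check
timeout connect 10s
timeout client 360s
timeout server 360s
(tcp-check and server lines as before)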