We recently replaced an old haproxy system running CentOS 7 and haproxy 1.8.26 (I know… ancient) with new hardware running RHEL 9 and haproxy 2.4.22-f8e3218, which is the version the repo provides.
Since the upgrade, we have seen errors on some VIPs increase sharply. The errors come from a TCP RST that the haproxy system sends to the client, which breaks the connection. There is no evidence or log entry showing that the backend had any issue during this time, and we have verified in a tcpdump that the backend server doesn’t send a TCP RST.
We have pored over sysctl settings and haproxy settings trying to find what is causing this but have yet to discover the root cause.
A reload of haproxy causes the errors to decrease for a time and then as time continues, the errors become more frequent. The issue is prevalent on mysql VIPs as well as HTTP VIPs. I turned up the logging to info and still didn’t find any useful logs to tell me anything.
If anyone has thoughts on what it could be, I will go investigate whatever it is. At this point, I’m at a loss as to how to determine the underlying cause.
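For anyone who wants to check a capture the same way, here is a rough sketch of how the RSTs could be tallied by sender from the text output of `tcpdump -nn`. It assumes IPv4 and the usual `IP a.b.c.d.port > a.b.c.d.port: Flags [...]` line layout; the function name `count_rsts` and the client address in the sample line are made up for illustration, not taken from our setup.

```python
import re
from collections import Counter

# Matches the text form printed by `tcpdump -nn` for IPv4 TCP packets, e.g.
# 12:00:05.000001 IP 10.3.5.192.3306 > 10.1.2.3.51000: Flags [R.], seq 1, ack 2, win 0, length 0
# (10.1.2.3 is a hypothetical client address.)
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]*)\]"
)


def count_rsts(lines):
    """Count packets with the RST flag set, keyed by (source IP, source port)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        # tcpdump prints RST as "R" inside the Flags brackets ("R" or "R.").
        if match and "R" in match.group("flags"):
            counts[(match.group("src"), int(match.group("sport")))] += 1
    return counts
```

Feeding it the lines produced by `tcpdump -nn -r` on a capture file should show whether the resets come from the VIP address, the `source` address used toward the backends, or one of the backend servers.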
haproxy -vv
HAProxy version 2.4.22-f8e3218 2023/02/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-2.4.22.html
Running on: Linux 5.14.0-570.17.1.el9_6.x86_64 #1 SMP PREEMPT_DYNAMIC Fri May 23 22:47:01 UTC 2025 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = cc
CFLAGS = -O2 -g -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
OPTIONS = USE_PCRE2=1 USE_LINUX_TPROXY=1 USE_CRYPT_H=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_SYSTEMD=1 USE_PROMEX=1
DEBUG =
Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL +EPOLL -EVPORTS +FUTEX +GETADDRINFO -KQUEUE +LIBCRYPT +LINUX_SPLICE +LINUX_TPROXY +LUA -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OT -PCRE +PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PRIVATE_CACHE -PROCCTL +PROMEX -PTHREAD_PSHARED -QUIC +RT +SLZ -STATIC_PCRE -STATIC_PCRE2 +SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_THREADS=64, default=20).
Built with OpenSSL version : OpenSSL 3.2.2 4 Jun 2024
Running on OpenSSL version : OpenSSL 3.2.2 4 Jun 2024
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.4.4
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.40 2022-04-14
PCRE2 library supports JIT : no (USE_PCRE2_JIT not set)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 11.5.0 20240719 (Red Hat 11.5.0-5)
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|CLEAN_ABRT|HOL_RISK|NO_UPG
fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
<default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG
<default> : mode=TCP side=FE|BE mux=PASS flags=
Available services : prometheus-exporter
Available filters :
[SPOE] spoe
[CACHE] cache
[FCGI] fcgi-app
[COMP] compression
[TRACE] trace
haproxy config snippets:
Ansible managed: This file is being managed via Ansible and should not be modified directly.
global
daemon
maxconn 2000000
user haproxy
group haproxy
log 127.0.0.1 local0 info
external-check
insecure-fork-wanted
nbproc 1
nbthread 16
cpu-map auto:1/1-16 4-19
tune.ssl.default-dh-param 2048
ssl-default-server-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS:!LOW:!MEDIUM:!EXP:!DES:!3DES
stats socket /var/run/haproxy.sock mode 4775 group haproxy level admin
defaults
log global
timeout connect 3000ms
timeout client 50000ms
timeout http-request 5000ms
timeout server 9000ms
errorfile 408 /dev/null
### Select default load balancing algorithm
balance leastconn
option redispatch
option tcpka
fullconn 60000
maxconn 2000000
retries 1
frontend f_avance_other_secondary
mode tcp
option tcplog
timeout client 180000ms
log 127.0.0.1 local2 info
bind 10.3.5.192:3306
default_backend b_avance_other_secondary
backend b_avance_other_secondary
mode tcp
option mysql-check
timeout server 180000ms
source 10.3.5.196
server pushdb01 10.3.69.247:3306 check
server pushdb02 10.3.6.227:3306 check
server pushdb03 10.3.6.228:3306 check
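Since this frontend uses `option tcplog`, each log line should carry a two-character termination state, where the first character indicates what ended the session (for example `C` for a client-side abort, `S` for a server-side abort, `c` or `s` for a client or server timeout, `-` for a normal close). Below is a rough sketch of how those states could be tallied from the log file. It assumes the default tcplog layout, in which the state is followed by the `actconn/feconn/beconn/srv_conn/retries` counters; the function name and the sample values in the comment are hypothetical.

```python
import re
from collections import Counter

# In the default tcplog format the termination state sits right before the
# five slash-separated connection counters, e.g. (values are made up):
# ... f_avance_other_secondary b_avance_other_secondary/pushdb01 1/0/180001 512 sD 10/5/5/2/0 0/0
# The retries counter may carry a leading "+" when the session was redispatched.
STATE_RE = re.compile(r" (?P<state>[-A-Za-z]{2}) \d+/\d+/\d+/\d+/\+?\d+ ")


def tally_termination_states(lines):
    """Count how often each two-character termination state appears."""
    counts = Counter()
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            counts[match.group("state")] += 1
    return counts
```

Running this over the log before and after a reload might show whether the growing error rate lines up with one particular state rather than being spread across all of them.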
Merged pcap, frontend/backend, for a mysql test where we can recreate the issue in varying lengths of time.
