HAProxy 2.5.4 arbitrarily dropping responses (backend: 200, haproxy: 500 PH--)

We’ve been using HAProxy 2.5 stable as a replacement for an older version. Since switching to the new version, we have been seeing a strange issue: some resources, both static (served by an nginx web server) and dynamic (served by a REST microservice), fail at random. The requests usually work, but intermittently fail with an error 500, where HAProxy logs the termination state PH--. I’ve checked for invalid headers, a wrong Content-Length and more, and found nothing. “show errors” is not showing any errors at all. Bypassing the load balancer makes the issue go away (but obviously presents a different problem).

I was wondering how to determine the root cause of this. For the dynamic resource, using ‘replay XHR’ in the browser always succeeds ;-). I have the impression it only happens when the request is part of a series of requests…

There is no 2.5 release. Do you mean you are using 2.5.0? In that case, first of all, upgrade to 2.5.4, considering that 2.5.0 contains 76 known bugs, 7 of which are major.

I’m sorry, I should have been clearer. I have the latest release, 2.5.4:

HAProxy version 2.5.4-1ppa1~focal 2022/02/25 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2023.
Known bugs: http://www.haproxy.org/bugs/bugs-2.5.4.html
Running on: Linux 5.4.0-100-generic #113-Ubuntu SMP Thu Feb 3 18:43:29 UTC 2022 x86_64
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = cc
  CFLAGS  = -O2 -g -O2 -fdebug-prefix-map=/build/haproxy-7oXbgr/haproxy-2.5.4=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wall -Wextra -Wundef -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
  OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_SYSTEMD=1 USE_PROMEX=1
  DEBUG   = 

Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT +PCRE2 +PCRE2_JIT +POLL +THREAD +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL +LUA +ACCEPT4 -CLOSEFROM -ZLIB +SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL -PROCCTL +THREAD_DUMP -EVPORTS -OT -QUIC +PROMEX -MEMORY_PROFILING

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=2).
Built with OpenSSL version : OpenSSL 1.1.1f  31 Mar 2020
Running on OpenSSL version : OpenSSL 1.1.1f  31 Mar 2020
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Support for malloc_trim() is enabled.
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.34 2019-11-21
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 9.3.0

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTTP       side=FE|BE     mux=H2       flags=HTX|CLEAN_ABRT|HOL_RISK|NO_UPG
            fcgi : mode=HTTP       side=BE        mux=FCGI     flags=HTX|HOL_RISK|NO_UPG
       <default> : mode=HTTP       side=FE|BE     mux=H1       flags=HTX
              h1 : mode=HTTP       side=FE|BE     mux=H1       flags=HTX|NO_UPG
       <default> : mode=TCP        side=FE|BE     mux=PASS     flags=
            none : mode=TCP        side=FE|BE     mux=PASS     flags=NO_UPG

Available services : prometheus-exporter
Available filters :
	[SPOE] spoe
	[CACHE] cache
	[FCGI] fcgi-app
	[COMP] compression
	[TRACE] trace

Some more info. With debug mode enabled, the headers appear as well: there are no duplicate headers, no caching headers and no invalid Content-Length. I see the issue both with static chunked resources and with a couple of REST endpoints. This is what appears in the log:

Mar  4 13:58:19 load-balancer-1 haproxy[1491]: 2a02:<redacted>:57887 [04/Mar/2022:13:58:19.181] https~ dashboard-tst/dashboard-tst 0/0/0/-1/47 500 15424 - - PH-- 5/5/2/2/0 0/0 "GET /js/polyfills.c86610ad4dcc15e804f1.js?c86610ad4dcc15e804f1 HTTP/1.1"
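
For anyone reading along, here is a minimal Python sketch that splits a log line like the one above into named fields, assuming the default HTTP log format ("option httplog") behind a syslog prefix. The regular expression, the parse_http_log helper and the short descriptions of the termination flags are my own illustration (the descriptions are paraphrased and incomplete), not something taken from HAProxy itself, so check the logging section of the HAProxy documentation before relying on them.

```python
import re

# Assumed layout (default "option httplog" format):
# client [date] frontend backend/server TR/Tw/Tc/Tr/Ta status bytes
# req_cookie res_cookie term_state conns queues "request"
LOG_RE = re.compile(
    r'haproxy\[\d+\]: (?P<client>\S+) \[(?P<accept_date>[^\]]+)\] '
    r'(?P<frontend>\S+) (?P<backend>[^/\s]+)/(?P<server>\S+) '
    r'(?P<timers>[-\d/+]+) (?P<status>\d{3}) (?P<bytes>\+?\d+) '
    r'(?P<req_cookie>\S+) (?P<res_cookie>\S+) (?P<term_state>\S{4}) '
    r'(?P<conns>[\d/+]+) (?P<queues>[\d/]+) "(?P<request>[^"]*)"'
)

TIMER_NAMES = ("TR", "Tw", "Tc", "Tr", "Ta")

# Partial, paraphrased meanings of the first two termination flags.
FIRST_FLAG = {
    "C": "aborted by the client",
    "S": "aborted or refused by the server",
    "P": "aborted by the proxy itself",
    "-": "normal completion",
}
SECOND_FLAG = {
    "R": "waiting for the client request",
    "H": "waiting for the server response headers",
    "D": "transferring data",
    "-": "normal completion",
}


def parse_http_log(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Timers are in milliseconds; -1 marks a phase that did not complete.
    record["timers"] = dict(
        zip(TIMER_NAMES, (int(t) for t in record["timers"].split("/")))
    )
    state = record["term_state"]
    record["term_cause"] = FIRST_FLAG.get(state[0], "other, see the docs")
    record["term_phase"] = SECOND_FLAG.get(state[1], "other, see the docs")
    return record


SAMPLE = (
    'Mar  4 13:58:19 load-balancer-1 haproxy[1491]: 2a02:<redacted>:57887 '
    '[04/Mar/2022:13:58:19.181] https~ dashboard-tst/dashboard-tst '
    '0/0/0/-1/47 500 15424 - - PH-- 5/5/2/2/0 0/0 '
    '"GET /js/polyfills.c86610ad4dcc15e804f1.js?c86610ad4dcc15e804f1 HTTP/1.1"'
)

record = parse_http_log(SAMPLE)
print(record["status"], record["term_state"], record["timers"])
print(record["term_cause"], "/", record["term_phase"])
```

Run over a whole log file, the same helper could be used to filter for entries where term_state starts with "PH" and compare their timers and request paths with the successful requests around them.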

“show errors” shows an invalid frontend request (which at least confirms that the command works), but nothing about the response in question…

 
[04/Mar/2022:18:05:37.829] frontend http (#2): invalid request
  backend <NONE> (#-1), server <NONE> (#-1), event #0, src ::ffff:66.240.205.34:53678
  buffer starts at 0 (including 0 out), 16308 free,
  len 76, wraps at 16336, error at position 1
  H1 connection flags 0x00000100, H1 stream flags 0x00000810
  H1 msg state MSG_RQMETH(2), H1 msg flags 0x00001400
  H1 chunk len 0 bytes, H1 body len 0 bytes :
  
  00000  H\x00\x00\x00tj\xA8\x9E#D\x98+\xCA\xF0\xA7\xBBl\xC5\x19\xD7\x8D\xB6
  00022+ \x18\xEDJ\x1En\xC1\xF9xu[l\xF0E\x1D-j\xEC\xD4xL\xC9r\xC9\x15\x10u\xE0%
  00050+ \x86Rtg\x05fv\x86]%\xCC\x80\x0C\xE8\xCF\xAE\x00\xB5\xC0f\xC8\x8DD\xC5
  00074+ \t\xF4

This is what happens in the browser. The first request is triggered by JavaScript, the next by clicking ‘replay XHR’. Nothing was changed on the reverse proxy in between…

Screen Shot 2022-03-04 at 19.09.57

Please file a bug on GitHub:

Resolved: the rewrite buffer was too small. Adding tune.maxrewrite 1024 resolved it. What remains is that HAProxy reports no error for this condition.
