HAProxy community

HAProxy returns 408 or 504 errors when the timeout client value falls in alternating 25-day ranges

Hello,
I am using HAProxy version 1.5.18.
There is a strange pattern in the timeout client value: depending on the value set in the frontend section, HAProxy either works or returns a 408/504 error.

frontend http_vip
mode http
bind x.x.x.x:80 transparent
option httplog clf
log global

timeout client 25d
default_backend http-pool

For example:
Values equal to or above 1us and below 25d are working (the application behind HAProxy can be accessed).
Values equal to or above 25d and below 50d are NOT working (the application behind HAProxy can NOT be accessed).
Values equal to or above 50d and below 75d are working.
Values equal to or above 75d and below 100d are NOT working.
Values equal to or above 100d and below 125d are working.
Values equal to or above 125d and below 150d are NOT working.
This pattern continues indefinitely, alternating every 25 days.

When the application cannot be accessed, a 408 or 504 error is returned.
In the access log, the session is terminated with the "cC" state, as follows:
"GET /index.html HTTP/1.1" 504 194 "" "" 58568 614 "http_vip" "http-pool" "m1" 0 0 0 -1 0 cC-- 0 0 0 0 0 0 0 "" ""

I have also tried tcp mode, with the same result.

Please help me debug this.

Thanks.

Can you provide the output of haproxy -vv?

HA-Proxy version 1.5.18 2016/05/10
Copyright 2000-2016 Willy Tarreau willy@haproxy.org

Build options :
TARGET = linux26
CPU = generic
CC = gcc
CFLAGS = -O2 -g -fno-strict-aliasing
OPTIONS = USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_OPENSSL=1 USE_STATIC_PCRE=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built without zlib support (USE_ZLIB not set)
Compression algorithms supported : identity
Built with OpenSSL version : OpenSSL 1.0.2p-fips 14 Aug 2018
Running on OpenSSL version : OpenSSL 1.0.2p-fips 14 Aug 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.42 2018-03-20
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Timeouts with huge values like this are certainly not something that was expected. Tick handling will wrap at 24.85 days, see:

Maybe we can document this maximum tick value and reject such invalid configurations, so there is no doubt about this.
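
The 24.85-day figure and the alternating pattern reported in the question can be reproduced with a little arithmetic. The sketch below is only a simplified model, not HAProxy source code: it assumes the timeout is converted to milliseconds, truncated to 32 bits, and then interpreted as a signed value, so anything at or beyond 2^31 ms looks invalid. The names `wrapped_timeout_ms`, `timeout_is_usable` and `WRAP_DAYS` are made up for illustration.

```python
# Simplified model (an assumption, not HAProxy source code): the timeout is
# converted to milliseconds, truncated to 32 bits, and read as a signed value.
MS_PER_DAY = 24 * 60 * 60 * 1000
TICK_BITS = 32


def wrapped_timeout_ms(days: int) -> int:
    """Return the timeout in ms after truncation to an unsigned 32-bit value."""
    return (days * MS_PER_DAY) % (1 << TICK_BITS)


def timeout_is_usable(days: int) -> bool:
    """True if the wrapped value is a positive signed 32-bit number."""
    wrapped = wrapped_timeout_ms(days)
    return 0 < wrapped < (1 << (TICK_BITS - 1))


# Largest span that still fits in the positive half of a 32-bit tick, in days.
WRAP_DAYS = (1 << (TICK_BITS - 1)) / MS_PER_DAY

if __name__ == "__main__":
    print(f"wrap point: {WRAP_DAYS:.3f} days")
    for days in (24, 25, 49, 50, 74, 75, 100, 125, 150):
        state = "works" if timeout_is_usable(days) else "fails"
        print(f"{days:>3}d -> {state}")
```

Under this model the boundaries sit at multiples of roughly 24.855 days rather than exactly 25, which is indistinguishable from the reported pattern when only whole multiples of 25 days are tested.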

Any thoughts on this @willy ?

Hi Lukas!
Well, I’m embarrassed: I was absolutely certain it was documented… and I was wrong! The only place where I find such a mention in the doc is on the stick-table expire keyword. Not only do we need to add it to the configuration doc, but we also need to check it during parsing (and yes, I was pretty sure it was tested as well). Let’s fix this before 2.0. Ideally we need to report an error if a non-null timeout is null after conversion, and if a timeout overflows. We can probably have parse_time_err() report two dummy pointers for “underflow” and “overflow” so that callers can report errors.
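
The two checks described above can be illustrated with a short sketch. This is not the parse_time_err() implementation and does not reflect how the fix was actually written: the function `parse_timeout_ms`, the exception classes, and the `MAX_TIMEOUT_MS` limit of 2^31 - 1 ms are all hypothetical, and the sketch assumes truncating conversion to milliseconds with a default unit of ms.

```python
import re

# Hypothetical limit: the largest value that fits a positive signed 32-bit tick.
MAX_TIMEOUT_MS = 2**31 - 1

# Unit suffixes expressed in microseconds so that "us" can be represented.
UNIT_TO_US = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "m": 60_000_000,
    "h": 3_600_000_000,
    "d": 86_400_000_000,
}


class TimeUnderflow(ValueError):
    """A non-zero input became zero after conversion."""


class TimeOverflow(ValueError):
    """The converted value exceeds MAX_TIMEOUT_MS."""


def parse_timeout_ms(text: str) -> int:
    """Parse strings such as '30s' or '25d' into milliseconds, with checks."""
    match = re.fullmatch(r"(\d+)(us|ms|s|m|h|d)?", text)
    if match is None:
        raise ValueError(f"unparsable time value: {text!r}")
    value = int(match.group(1))
    unit = match.group(2) or "ms"  # assumed default unit
    millis = (value * UNIT_TO_US[unit]) // 1000  # truncating conversion
    if value != 0 and millis == 0:
        raise TimeUnderflow(f"{text!r} is too small to express in ms")
    if millis > MAX_TIMEOUT_MS:
        raise TimeOverflow(f"{text!r} exceeds {MAX_TIMEOUT_MS} ms")
    return millis
```

With these assumptions, `parse_timeout_ms("24d")` is accepted, while `"25d"` raises `TimeOverflow` and `"1us"` raises `TimeUnderflow`, so a caller can report a configuration error instead of silently starting with a wrapped timeout.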

I created issue #109 with this.

This is now fixed in 2.0-dev.