Backend connect latency spikes on backend queue

Hi, we’ve observed an increase in backend connect time on all backends under the same frontend as soon as one starts to queue.

Just wondering whether this is expected behaviour, or whether our configuration would benefit from some meaningful tuning?
For example, bck1 queues up to 10k+ requests during traffic spikes, which is not an issue in itself. At the same time, however, other backends such as bck2 (no queue, same load as before) start to show larger backend connect times, and as a consequence they process less traffic (waiting for requests). Latency can spike from a nominal 10ms up to 3s…

Some examples of the current configuration:
We are using HAProxy version 1.8.12-8a200c7 (released 2018/06/27), running with 1 process and 1 thread.

pid = 14332 (process #1, nbproc = 1, nbthread = 1)
system limits: memmax = unlimited; ulimit-n = 200039
maxsock = 200039; maxconn = 100000; maxpipes = 0

The total number of established connections stays well below the configured limits (25k requests at peak), and the queue reaches 10–15k.

  defaults
    mode http
    timeout client 10s
    timeout server 1800s
    timeout connect 10s
    timeout queue 60s
    timeout check 1s
    load-server-state-from-file global
    option forwardfor
    no option forceclose
    no option tcp-smart-accept
    option http-no-delay
    log global
    log 127.0.0.1 len 5120 local0
    option dontlognull
    balance leastconn

  frontend api
    bind :80
    bind :443 ssl crt ./
    maxconn 100000
    option http-buffer-request

  backend bck1
    fullconn 30000
    option httpchk HEAD /rmi/status
    http-check disable-on-404
    server : check slowstart 20s maxconn 20

  backend bck2
    fullconn 30000
    option httpchk HEAD /rmi/status
    http-check disable-on-404
    server : check slowstart 20s maxconn 500
    server : check slowstart 20s maxconn 500
    server : check slowstart 20s maxconn 500
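
For clarity, the queueing on bck1 follows directly from the per-server maxconn: once 20 requests are in flight on that server, the rest wait in the backend queue for up to timeout queue. A minimal sketch of the knobs involved (server name, address and numbers are placeholders, not our real values):

  backend bck1
    fullconn 30000                 # only affects servers that also set minconn
    timeout queue 60s              # queued requests wait up to 60s before a 503
    # with a hard maxconn and no minconn, the 21st concurrent request queues;
    # a minconn/maxconn pair would let the effective limit scale with load
    server app1 192.0.2.10:8080 check slowstart 20s minconn 20 maxconn 200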

Example of metrics from stress tests (not as large a spike as with production traffic, but notable):


During the queue load, CPU usage on the haproxy servers themselves hits 100% and memory usage jumps by ~400MB, but there is still room to spare on the instances. Do note that CPU usage in production did not get above 30% on any node.

Here are the metrics from the production traffic, where you can clearly see (in red) that when the queuing occurs, backend connect latency spikes on all backends. So is this expected behaviour, or can we tune the configuration further to mitigate it? All of this traffic is HTTP/HTTPS, and the haproxies sit behind keepalived instances using the least-connection algorithm. We haven't seen any difference with the round-robin algorithm.

Thanks and any help appreciated!

EDIT:
adding the verbose version output (all haproxies are built on the same hardware with the same build flags)

HA-Proxy version 1.8.12-8a200c7 2018/06/27
Copyright 2000-2018 Willy Tarreau willy@haproxy.org

Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-unused-label
OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_SYSTEMD=1 USE_PCRE=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2k-fips 26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-fips 26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace

Provide haproxy logs. This will make it clearer where the latency is spent.
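
For reference, the timer fields in the 1.8 log format separate queue time from connect time, which is exactly where the delay should show up. A short annotated excerpt of the log-format already used in the config below (field meanings per the 1.8 documentation; the line is abridged):

    # %TR  time to receive the full client request
    # %Tw  total time spent waiting in queues for a connection slot
    # %Tc  time to establish the TCP connection to the server
    # %Tr  server response time (HTTP mode)
    # %Ta  total active time of the request
    log-format %ci:%cp\ [%t]\ %ft\ %b/%s\ %TR/%Tw/%Tc/%Tr/%Ta\ %ST\ %B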


Here is the log file (6mb compressed, 148mb uncompressed) https://filebin.net/bguucs8wftnlb4dp/haproxy.log.tar.bz2?t=mnklfo8v
which matches this Grafana graph at the time of the larger queue and latency spike:

Once more, here is the redacted configuration with the options we use.
Note: we use option http-buffer-request since a slowloris attack was easy to reproduce and this helps avoid it. Thank you, and if anyone has any other ideas, shoot!

  global
    stats socket /var/run/haproxy_socket level admin
    server-state-file /var/lib/haproxy/haproxy_server_state
    user haproxy
    group haproxy
    chroot /var/empty/haproxy
    maxconn 100000
    ca-base /etc/ssl
    crt-base /etc/ssl
    tune.ssl.default-dh-param 2048
    tune.ssl.cachesize 1000000
    tune.http.logurilen 5120
    ssl-default-bind-options no-sslv3
    ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA

  defaults
    mode http
    timeout client 10s
    timeout server 1800s
    timeout connect 10s
    timeout queue 60s
    timeout check 1s
    load-server-state-from-file global
    option forwardfor
    no option forceclose
    no option tcp-smart-accept
    option http-no-delay
    log global
    log 127.0.0.1 len 5120 local0
    option dontlognull
    balance leastconn
    log-format %ci:%cp\ [%t]\ %ft\ %b/%s\ %TR/%Tw/%Tc/%Tr/%Ta\ %ST\ %B\ %CC\ %CS\ %tsc\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %hr\ %hs\ %{+Q}r\ %sslv\ %sslc
    default-server init-addr libc,last,none

  resolvers dns
    nameserver dns_balancer_dc1 X.X.X.X:53
    nameserver dns_balancer_dc2 X.X.X.X:53
    resolve_retries 3
    hold valid 10s

  frontend API
    bind X.X.X.X:80
    bind X.X.X.X:443
    ...
    maxconn 100000
    option http-buffer-request
    tcp-request connection track-sc1 src table haproxy-client-metrics
    ....
    acl ....
    use_backend ...

  backend backend-1
    fullconn 30000
    server ....

  backend backend-2
    fullconn 30000
    server ....

  backend backend-3
    fullconn 30000
    server ....

  backend backend-4
    fullconn 30000
    server ....

  backend haproxy-client-metrics
    stick-table type ip size 1m expire 30s store conn_cur,conn_rate(5s),http_req_rate(5s),http_err_rate(5s),bytes_in_rate(30s)

Both the log and the config are so heavily anonymized that I have no chance of understanding how they relate to each other.

If haproxy saturates an entire CPU core, which I believe you are saying, then the connect time will definitely go up, as haproxy cannot handle the events in time.
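
If a single thread really does saturate a core during the spikes (as the stress tests suggest), one option on 1.8 is to let haproxy run several threads. A minimal sketch, assuming there are spare cores on the box; the thread count is a placeholder, and since 1.8 threading is still young this is only worth trying on the latest 1.8 stable:

    global
        nbproc 1
        nbthread 4    # introduced in 1.8; size to the cores you can spare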

With the information provided I can give some generic advice: don't use option http-no-delay, don't use option http-buffer-request, and upgrade to the latest 1.8 stable, as there are over 200 known bugs in 1.8.12.
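
On the http-buffer-request point: the usual slowloris mitigation is a request timeout rather than request buffering. A minimal sketch, assuming a 10 second budget for receiving the request headers is acceptable (the value is a placeholder):

    defaults
        mode http
        # aborts connections whose request headers are not complete within 10s,
        # which is the standard defence against slowloris-style clients
        timeout http-request 10s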
