'mode tcp' performance degradation with two different backends

I’m trying to test the rate of new connections in ‘mode tcp’ with the following config:

frontend default_fe
  mode tcp
  bind *:443
  default_backend default_be

backend default_be
  mode tcp
  balance roundrobin
  # three autonomous apache servers
  server s3_v4   127.0.0.1:8443
  server s1_v4   172.16.37.1:8443
  server s2_v4   172.16.37.2:8443

using the command:

wrk -t12 -c2500 -d10s -H "Connection: close" https://my_site:443/

and the problem is:

  • when I’m using a single backend (either local or remote), RPS is about 10-11k
  • when I’m using two or more different backends (in any combination), performance degrades to 2-3k RPS

Backend performance is fine: when I run three simultaneous wrk instances, one against each of the three backends, each of them gives ~10-11k RPS.
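For anyone who wants to repeat the direct check of the backends, a rough sketch follows. It is not the exact set of commands used: the `run_direct` function name and the `WRK` override variable are made up for illustration, the addresses and port are taken from the first config above, and it assumes the Apache servers answer HTTPS directly on those addresses.

```shell
# Hypothetical helper: start one wrk per backend in parallel, bypassing haproxy.
# Set WRK=echo to print the commands instead of running them.
run_direct() {
  local wrk_bin="${WRK:-wrk}"
  local backend
  # Addresses and port copied from the backend section of the first config.
  for backend in 127.0.0.1:8443 172.16.37.1:8443 172.16.37.2:8443; do
    "$wrk_bin" -t12 -c2500 -d10s -H "Connection: close" "https://${backend}/" &
  done
  wait  # block until all three runs have finished
}
```

Calling `run_direct` starts the three runs at once; `WRK=echo run_direct` only prints the arguments each run would get.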

The problem appears with leastconn, roundrobin, and even the random algorithm, and is not present with source or first. So as soon as haproxy has to choose between destinations in the balance algorithm, performance degrades.

I’d appreciate it if anyone could point out anything I’m doing or thinking wrong. Thank you.

Examples

Single backend:

root@px:~# wrk -t12 -c2500 -d10s -H "Connection: close" https://xxxxxxxxxxx.com:443/
Running 10s test @ https://xxxxxxxxxxx.com:443/
  12 threads and 2500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    24.96ms   36.87ms   1.91s    95.62%
    Req/Sec     0.90k   293.26     1.58k    73.94%
  100147 requests in 10.09s, 42.98MB read
Requests/sec:   9926.99
Transfer/sec:      4.26MB

Two backends:

root@px:~# wrk -t12 -c2500 -d10s -H "Connection: close" https://xxxxxxxxxxx.com:443/
Running 10s test @ https://xxxxxxxxxxx.com:443/
  12 threads and 2500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    93.16ms  122.31ms   1.09s    87.70%
    Req/Sec   353.96    548.95     2.07k    84.66%
  24908 requests in 10.17s, 10.69MB read
  Socket errors: connect 0, read 0, write 0, timeout 1
Requests/sec:   2449.70
Transfer/sec:      1.05MB
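To put a number on the difference between the two runs above, the `Requests/sec` lines can be pulled out of saved wrk output and compared. This is only a sketch: the helper names `wrk_rps` and `rps_ratio` and the file names in the usage note are hypothetical, and it assumes each run's output was saved to a text file.

```shell
# Hypothetical helper: print the "Requests/sec" value from a saved wrk output file.
wrk_rps() {
  awk '/^Requests\/sec:/ { print $2 }' "$1"
}

# Hypothetical helper: ratio of two RPS values, rounded to two decimals.
rps_ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}
```

With the outputs saved as, say, `single.txt` and `two.txt`, `rps_ratio "$(wrk_rps single.txt)" "$(wrk_rps two.txt)"` prints `4.05` for the numbers shown here (9926.99 vs 2449.70), so the two-backend run is roughly four times slower.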

The HAProxy configuration (skipping the general parts) looks like nothing special:

global
    hard-stop-after 1m
    daemon
    # see also `systemctl edit haproxy`
    maxconn 256000
    cpu-policy performance
    # update shards below in listener to == nbthread
    nbthread 16
    cpu-map 1/1  4
    cpu-map 1/2  32
    ...

defaults
    option  dontlognull
    mode    http
    http-reuse always
    backlog 65536
    retries 3
    option  redispatch

    option  nolinger
    timeout connect     5s
    timeout client      35s
    timeout client-fin  40s
    timeout server      35s
    timeout server-fin  40s

    timeout http-request 5s
    timeout http-keep-alive 15s

frontend ident_fe
    mode tcp
    bind *:443 shards 16
    default_backend default_ident_be

backend default_ident_be
    mode tcp
    fullconn 120000
    option tcp-check
    default-server check inter 3s rise 2 fall 2
    balance roundrobin
    server ident3_v4   127.0.0.1:13443
    server ident1_v4   172.16.37.1:13443
    server ident2_v4   172.16.37.2:13443

You have already filed a bug on GitHub about this:

Let’s use a single support channel, not multiple. Thank you.