HAProxy tuning for higher TPS (12K-15K)

Hi,

I have set up HAProxy (layer 4) in TCP mode in front of my 4 nginx servers (layer 7). HAProxy load balances traffic to the nginx servers in round-robin fashion.

We are running performance tests against this setup, targeting ~10K-15K TPS, but at around 8K TPS we start observing high response times for our APIs. Along with this, the connect time to the backend servers jumps instantly from ~0 ms to 80 ms once the TPS hits 8K, and the SYN-SENT connection count increases when this happens.
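
For reference, this is roughly how I am watching the half-open connections and listen-queue drops while the test runs (commands assume a Linux host with iproute2/net-tools installed; adjust to your distro):

# rough count of half-open outbound connections from the HAProxy box to the backends
ss -nt state syn-sent | wc -l

# listen queue overflows / dropped SYNs (run on the nginx servers)
netstat -s | grep -i -E 'listen|overflow'

# current accept queue depth vs. its configured limit, per listening socket
ss -lnt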

The nginx servers are serving static content. There are four APIs configured with different payloads:
1) /test1 - 1 KB payload with a 250 ms delay at nginx, to replicate prod APIs that take 250 ms to process a request (traffic distribution: 5%)
2) /test2 - 2 KB payload with a 150 ms delay at nginx, to replicate prod APIs that take 150 ms to process a request (traffic distribution: 40%)
3) /test40 - 40 KB payload with a 200 ms delay at nginx, to replicate prod APIs that take 200 ms to process a request (traffic distribution: 15%)
4) /testbytes - 450-byte payload with a 350 ms delay at nginx, to replicate prod APIs that take 350 ms to process a request (traffic distribution: 40%)
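
For sizing: with this mix the average processing delay works out to roughly 0.05*250 + 0.40*150 + 0.15*200 + 0.40*350 ≈ 242 ms, so by Little's law 8K TPS already means about 2,000 requests in flight and 15K TPS closer to 3,600 concurrent connections through HAProxy. Because of that I have also been reviewing the kernel defaults for connection backlogs and ephemeral ports on the HAProxy and nginx hosts. This is only a sketch of the settings I am checking (example values, not recommendations):

# /etc/sysctl.d/99-lb-tuning.conf (example values, still to be validated)
net.core.somaxconn = 4096                   # accept (listen) queue limit; distro default is often 128 or 4096
net.ipv4.tcp_max_syn_backlog = 8192         # half-open (SYN_RECV) queue on the servers
net.ipv4.ip_local_port_range = 1024 65000   # ephemeral ports for the haproxy -> nginx connections
net.core.netdev_max_backlog = 16384         # per-CPU packet backlog before the kernel processes them

If I understand correctly, nginx also caps its own listen backlog at 511 unless backlog= is set on its listen directive, so raising somaxconn alone may not be enough on the nginx side.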

I need urgent help to resolve this issue!

Below is my HAProxy config:

#########################################
global
    log 127.0.0.1 local0
    chroot /var/lib/haproxy
    maxconn 100000
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    daemon

    stats socket /var/lib/haproxy/stats
    nbthread 15
    cpu-map auto:1/1-15 0-14

defaults
    mode tcp
    log global
    option tcplog
    option dontlognull
    maxconn 100000

    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m

    timeout check 10s

listen stats
    bind *:3000
    mode http
    log 127.0.0.1 local1
    stats enable
    stats realm Haproxy\ Statistics
    stats uri /stats
    stats hide-version
    stats auth haproxy:haproxy@197

frontend haproxy
    bind *:443
    log 127.0.0.1 local2

    default_backend wall-nginx

backend wall-nginx
    mode tcp
    balance roundrobin
    server hostname-1 x.x.x.x:443 check
    server hostname-2 x.x.x.x:443 check
    server hostname-3 x.x.x.x:443 check
    server hostname-4 x.x.x.x:443 check

How are you generating the load?

How many concurrent connections? Are they using keep-alive? How many requests per connection until you close and open a new one?
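
For example, if you are generating load with something like wrk, the invocation itself answers most of that (purely an illustration, your tool and numbers may differ):

# 8 threads, 2000 concurrent connections, 60 s run; wrk reuses connections
# with HTTP/1.1 keep-alive by default
wrk -t8 -c2000 -d60s --latency https://your-haproxy-address/test2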


BTW, given how powerful HAProxy is with regard to HTTP routing (and given that I assume at some point you'll want to enable HTTPS for your API endpoints), why aren't you using HAProxy in HTTP mode?

(I would expect some additional load given the HTTP parsing and handling, but I think it opens a lot of possibilities, especially regarding path-based routing, perhaps rate limiting, etc.)
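
As a rough idea of what that could look like (an untested sketch: the certificate path and hostnames are placeholders, and TLS is terminated on HAProxy here instead of being passed through to nginx):

frontend haproxy
    mode http
    bind *:443 ssl crt /etc/haproxy/certs/your-site.pem   # placeholder certificate
    option httplog
    log 127.0.0.1 local2
    default_backend wall-nginx

    # path-based routing / rate limiting become possible here, e.g.:
    # use_backend some-other-backend if { path_beg /test40 }

backend wall-nginx
    mode http
    balance roundrobin
    option forwardfor      # pass the client IP through to nginx
    http-reuse safe        # allow idle server-side connections to be reused
    server hostname-1 x.x.x.x:443 ssl verify none check
    server hostname-2 x.x.x.x:443 ssl verify none check
    server hostname-3 x.x.x.x:443 ssl verify none check
    server hostname-4 x.x.x.x:443 ssl verify none check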