HAProxy maxconn not working or maxconn exceeded

Hi everyone, we are running HAProxy server. Our haproxy configuration file has

global maxconn 16384
defaults max-keep-alive-queue 300
server varnish01 yyy.yyy.yyy.yy:80 check maxconn 1000
server varnish02 zzz.zzz.zzz.zz:80 check maxconn 1000

Note that zzz.zzz.zzz.zz and yyy.yyy.yyy.yy are placeholder IP addresses.

backend bk_shop_cluster
timeout queue 5s
mode http
balance leastconn
server nginx01 10.10.10.215:443 ssl verify none check maxconn 50
server nginx02 10.10.10.216:443 ssl verify none check maxconn 50

I want at most 50 connections per server. But every time my maxconn is exceeded, which causes high CPU utilization and memory usage. How can I get rid of this problem?

Provide the ENTIRE configuration. Make sure you don’t use nbproc and that only a single instance of haproxy is running.

How exactly did you come to the conclusion that maxconn is exceeded?

Through the request count in the access log within 1 second.

This is confirmed. Only one HAProxy instance/process is running in the VM.

Then you are misunderstanding what maxconn is.

maxconn means the maximum number of concurrent connections to this server.
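If you want to verify this on a running instance, current concurrency can be inspected through the runtime API instead of counting log lines. A minimal sketch; the socket path and permissions below are assumptions about your setup:

```haproxy
global
    maxconn 16384
    # Expose the runtime API so concurrency can be inspected at runtime
    # (path and mode are an assumed example)
    stats socket /var/run/haproxy.sock mode 660 level admin
```

Then `echo "show stat" | socat stdio /var/run/haproxy.sock` prints per-server CSV stats: the `scur` column is the current number of concurrent sessions, and `slim` is the configured maxconn limit.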

This has nothing to do with requests per second at all.

You can have maxconn 1 on a server and forward 1000 requests per second.
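The arithmetic behind this claim is Little's Law (concurrency = throughput × mean latency). A short sketch; the 1 ms and 50 ms service times below are assumed figures for illustration:

```python
# Little's Law: concurrency L = arrival rate lambda * mean time in system W,
# so the throughput sustainable at a concurrency limit is lambda = L / W.

def sustainable_rps(maxconn: int, mean_latency_s: float) -> float:
    """Requests per second achievable at a given concurrency limit."""
    return maxconn / mean_latency_s

# 1 concurrent connection, 1 ms per request -> 1000 requests per second
print(sustainable_rps(1, 0.001))   # 1000.0
# 50 concurrent connections, 50 ms per request -> also 1000 requests per second
print(sustainable_rps(50, 0.050))  # 1000.0
```

So a low maxconn does not cap requests per second by itself; it caps how many requests are being worked on at once.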

In that case, how can we ensure the concurrent requests per second to backend servers?
Let’s say I want to allow a maximum of 100 requests/second to land at the backend application server through HAProxy.

There is no “concurrent requests per second”. Either we are talking about concurrency, or we are talking about per-second numbers.

Your use case is to limit CPU and memory usage. Then stop thinking about requests per second, because that is the wrong approach to this.

You want to limit the actual number of in-flight (that is, concurrent) requests, and that is what maxconn does.

You just need to think about it differently.

If haproxy only allows 50 connections, this means you can never have more than 50 requests in-flight (concurrency), and this is what helps you reduce CPU and memory usage.

The faster the server, the more requests per second it will handle, but it will never exceed 50 in-flight transactions.

And this is the correct metric to limit CPU and memory on your backend servers.

Thinking about requests-per-second numbers is wrong.

Is there any way to calculate the number of requests under each connection?

The maximum number of concurrent requests is the maxconn value.

The number of requests or transactions per second depends on the duration of a transaction. Once you know that, it’s just basic math.
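That basic math can be written out. A sketch under an assumed figure: with the maxconn 50 from the config above and a hypothetical average transaction duration of 200 ms:

```python
# Assumed illustration figures: maxconn 50, 200 ms average transaction.
maxconn = 50
transaction_duration_s = 0.2

# Each connection completes 1 / 0.2 = 5 transactions per second,
# so 50 connections sustain at most 50 * 5 = 250 transactions per second.
max_tps = maxconn / transaction_duration_s
print(max_tps)  # 250.0
```

Shorter transactions raise the per-second ceiling; the concurrency ceiling stays fixed at 50 either way.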