The real use case is a migration from nginx to haproxy to be able to use dynamic reconfiguration via the runtime API.
We run thousands of containers which come and go, and nginx requires a config reload after each change. This leaves the “old” process running until all the connections it served are closed. We’re often running out of memory because of this.
In our tests we run the same config for both nginx and haproxy.
The config has roughly 500 backends, each with 3-5 servers on average.
To be able to add new servers when a new container is created, each backend creates up to 100 servers via server-template. These are disabled until needed.
The nginx config (without spare servers from server-template) uses roughly 70MB of memory.
The haproxy config (with server-template) uses close to 3.5GB of memory.
If I read the docs correctly, it’s not possible to create a new server via the runtime API, only to update an existing one. That’s why we use server-template, but the high memory usage was unexpected.
I could decrease the number of spare servers from 100 to something lower, but I’m wondering if this is the right way to go given the memory requirements.
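For reference, this is roughly how we plan to bring a pre-allocated template slot into service from the runtime API when a new container appears (the socket path, address and names are just examples):

    echo "set server bk_1/server3 addr 10.0.0.42 port 8080" | socat stdio /var/run/haproxy.sock
    echo "set server bk_1/server3 state ready" | socat stdio /var/run/haproxy.sock
    # and when the container goes away again:
    echo "set server bk_1/server3 state maint" | socat stdio /var/run/haproxy.sock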
You always have to account for a service restart anyway - whatever the software, what if you need to install a bug fix or a security fix?
Make sure you limit the amount of time a session can stay open (haproxy’s hard-stop-after) and don’t reload again before hard-stop-after has closed the old process. This way you can achieve predictable RAM usage, even while reloading.
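For example, a minimal sketch of the idea in the global section (the 30m value is just an illustration - pick whatever matches the longest session you are willing to keep alive across a reload):

    global
        # force the old worker to stop at most 30 minutes after a reload,
        # even if some connections (e.g. websockets) are still open
        hard-stop-after 30m

Combined with not reloading more often than that window, you get at most one old process alive at any time, and therefore a bounded amount of RAM.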
Lowering the number of spare servers to what you actually need seems like a good idea to me.
I don’t see how we would be able to lower the amount of memory allocated in haproxy for those servers, given the current architecture.
Service restarts are fine if it’s only a couple times per hour.
Right now we have to reload multiple times per minute, which often leaves old processes with 1-2 connections open, and these eat up memory.
Thanks for getting back to me though, I appreciate it.
If you can live with the haproxy memory usage, maybe with a lower number of preconfigured servers per backend, that’s the only thing that comes to my mind right now.
Truth be told, I can see that your use case is something that the current haproxy architecture does not cover as well as we would like, and I think there is room for improvement.
Right now, at this specific moment, there are probably solutions more flexible than haproxy, like traefik or nginx unit (the latter if your backends are just applications).
Well, if you spawned enough servers in the HAProxy backend, why would you still need to reload? (what is the trigger?)
(taking into account that you have software which is able to set server IPs / ports / weights, etc. on HAProxy’s socket at runtime)
Another point: HAProxy cleans up almost everything in the “old” process. I mean that your old processes should not take 3 GB of RAM for only 1 or 2 connections…
Second point: the old process of haproxy is supposed to close each connection after the next HTTP response. So over HTTP, the old process should not stay around forever (unless you have websockets, long polling, or raw TCP).
To my knowledge, backends and SSL certificates cannot be managed via the runtime API.
Doing only a couple of reloads per day is better than reloading with every server change.
I’m going to do more tests to see what the memory usage of the old process is.
Since we have plenty of websockets, the old processes stick around for some time.
I started haproxy and opened a websocket connection after each reload.
With 3 websocket connections haproxy keeps 4 worker processes running - 3 old, 1 new.
Each worker process consumes roughly 1.4GB of memory with just one connection open.
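(For reference, I checked the per-process usage with something like

    ps -o pid,ppid,rss,etime,args -C haproxy

which lists each haproxy process with its resident memory and how long it has been running, so the old workers are easy to spot.)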
There’s something odd with your configuration (though I don’t know what). I reported some time ago starting 1 million servers with 2.7 GB of RAM, which would be around 2.7 kB per server. In your case you’re reporting 20 times more memory. I don’t really know what can make the difference at this point. It would help if you could share your configuration.
Also another point, but if you’re using 3-5 servers per backend, I hardly see the benefit of configuring 100, except for purposely using more RAM. Your description makes me think you’re not very sure what you want to do and you’re checking if you can cover for the worst possible case just to be sure. At the very least we should be certain that you’re not needlessly wasting resources.
# list of all backends
backend bk_1
    server server1 127.0.0.1:1024
    server server2 127.0.0.1:1025
    server-template server 3-100 127.0.0.1:8080 check disabled

backend bk_2
    server server1 127.0.0.1:1026
    server-template server 2-100 127.0.0.1:8080 check disabled

backend bk_3
...
The use case is rolling restarts of apps. Each app runs on n servers and requires another n servers for a rolling restart (we need to do a rolling restart of all servers at once). In total, 2*n servers are needed per backend for such a restart.
At first I didn’t expect haproxy to consume so much memory per server (coming from an nginx background). Our maximum number of servers is 50 per backend, so I created 100 servers by default.
Regardless of the number of servers - the memory consumption per server is higher than in nginx, and since we need to do a config reload a couple of times per hour, the memory consumption would skyrocket as long as there are open connections in the old processes.
OK, I reproduce the same here: it’s the check which adds the check buffers to each server. We once planned to use dynamic buffer allocations for checks; I’ll check how it is in 1.9, maybe we already did it there. With the checks I get 400MB for 10k servers, without the checks it’s only 38MB.
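That difference (roughly 36kB per server) is consistent with a couple of 16kB buffers - the default tune.bufsize / tune.chksize - being preallocated per server for the checks, plus some per-server overhead.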
So I’ve found what we had done for this, it was only in one of the experimental branches with the connection layers rework and checks were not yet ported. I looked a bit, technically it’s not very hard, just a little bit tricky. Depending on how things go for 1.9 that could be something we integrate there. I also noticed that something is using much more RAM per server in 1.9 than 1.8, we’ll have to spot what and ensure it’s not permanent.
Now if you or someone you know is interested in working on this for 1.9, just send a mail on the mailing list so that we can discuss what needs to be done.
@willy Hi. Thanks for sharing your expertise. Could I ask you something?
I am trying to use HAProxy with 200 servers. (Currently these servers sit behind a hardware load balancer and I am trying to migrate them to HAProxy with DNS round robin.)
I will run HAProxy on 40 machines (2x Xeon Silver 4210 (2.2GHz/10core), 128GB RAM, 25G NIC)
and add the 200 servers to the backend section of each HAProxy config.
To avoid unexpected failures when using HAProxy, I want to estimate how many resources this HAProxy configuration will consume.
From reading the discussion above, memory is not an issue because each server takes only a few kB.
Is there any other issue, such as CPU usage? And how can I check it without testing directly?
I want to test it directly, but I can’t stop these 200 servers that are in service.
Well, your sizing will essentially depend on the traffic. The number of concurrent connections will directly impact the memory usage (count roughly between 17 and 35 kB per active client connection), and the CPU usage will depend on the data rate and the number of TLS handshakes per second.
What I can say already is that you must absolutely not use the two physical CPUs in your machine for the same process. Either you start haproxy on one of them and pin the network cards to the other one (if you plan to saturate your 25G NICs for example), or, if you’re expecting massive TLS handshake rates (e.g. you’re serving ads), you should start one process per physical CPU. In that case, if you have two NICs, arrange them on the PCIe slots so that each one is physically connected to a different CPU, and run each instance on its own CPU and NIC; this will work almost like two machines. The number of servers is mostly irrelevant here. Also please note that the comment above is 4 years old and that since then the check buffer has been changed to a dynamic allocation, so servers no longer waste a buffer when the checks are not running.
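If it helps, a minimal sketch of the single-socket case in the global section (the core numbers are just an example for the first 10-core CPU - check your actual topology with lscpu):

    global
        nbthread 10
        # pin the 10 threads of the first process to cores 0-9 (first physical CPU)
        cpu-map auto:1/1-10 0-9

and then steer the NIC’s interrupts to the other socket (or to the same one, depending on which of the two layouts above you choose), e.g. via /proc/irq/<n>/smp_affinity_list or the NIC vendor’s IRQ affinity script.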
I really appreciate your advice, because I have been wandering around Google trying to understand how to assess the effect of the number of servers in a backend. Thank you so much.
Let me summarize your advice:
Rather than the number of servers, it is the number of concurrent connections (memory usage), the traffic rate, and the number of TLS handshakes per second (CPU usage) that will consume most of the computing resources, CPU and RAM.
Servers do not waste a buffer anymore when the checks are not running.
And the NIC and HAProxy should run on the same CPU.
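As a rough sanity check with these numbers: 200 servers at a few kB each is well under 1 MB, while, say, 100,000 concurrent client connections at 17-35 kB each would already be roughly 1.7-3.5 GB, so the connections clearly dominate.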
Could I ask a few more questions?
As I’m a newbie in the software and networking area, I don’t know how to go about estimating the consumed resources (such as 2.7kB of memory per server, or 17 to 35 kB of memory per active client connection). How can I estimate these things by myself? For example, to estimate the effect of the number of servers in a HAProxy backend, we can’t buy a larger number of servers just for testing. I tried to estimate it by reading the HAProxy source code, but that doesn’t seem to help.
You said the data rate and TLS handshakes burden the CPU. Does that come from establishing connections towards the backends and forwarding the data over those connections?
To make the NIC and HAProxy run on the same CPU, do I need to set something in haproxy.cfg or some Linux kernel config? (I checked your other post, Architectural limitation for nbproc? - #6 by willy. It mentioned numactl.)
I’m sorry if these questions are too basic, and thank you for your time.
Stop focusing so much on haproxy and start focusing on your load and current numbers. How much traffic are you currently moving? How many concurrent connections do you have, and how many new sessions per second are you turning up, with and without SSL termination?