The real use case is a migration from nginx to haproxy to be able to use dynamic reconfiguration via the runtime API.
We run thousands of containers which come and go, and nginx requires a config reload after each change. This leaves the “old” process running until all the connections it served are closed. We’re often running out of memory because of this.
In our tests we run the same config for both nginx and haproxy.
The config has roughly 500 backends, each with 3-5 servers on average.
To be able to add new servers when a new container is created, each backend creates up to 100 servers via server-template. These are disabled until needed.
The nginx config (without spare servers from server-template) uses roughly 70MB of memory.
The haproxy config (with server-template) uses close to 3.5GB of memory.
If I read the docs correctly, it’s not possible to create a new server via the runtime API, only to update an existing one. That’s why we use server-template, but the high memory usage was unexpected.
I could decrease the number of spare servers from 100 to something lower, but I’m wondering if this is the right way to go given the memory requirements.
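For reference, this is roughly how we plan to bring a pre-allocated template slot into service from the runtime API when a new container appears (the socket path, address and names are just examples):

    echo "set server bk_1/server3 addr 10.0.0.42 port 8080" | socat stdio /var/run/haproxy.sock
    echo "set server bk_1/server3 state ready" | socat stdio /var/run/haproxy.sock
    # and when the container goes away again:
    echo "set server bk_1/server3 state maint" | socat stdio /var/run/haproxy.sock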
You always have to account for a service restart anyway - whatever the software, what if you need to install a bug fix or a security fix?
Make sure you limit the amount of time a session can stay open (haproxy’s hard-stop-after) and don’t reload again before hard-stop-after has closed the old process. This way you can achieve predictable RAM usage, even while reloading.
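For example, a minimal sketch of the idea in the global section (the 30m value is just an illustration - pick whatever matches the longest session you are willing to keep alive across a reload):

    global
        # force the old worker to stop at most 30 minutes after a reload,
        # even if some connections (e.g. websockets) are still open
        hard-stop-after 30m

Combined with not reloading more often than that window, you get at most one old process alive at any time, and therefore a bounded amount of RAM.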
Lowering the number of spare servers to what you actually need seems like a good idea to me.
I don’t see how we would be able to lower the amount of memory allocated in haproxy for those servers, given the current architecture.
Service restarts are fine if it’s only a couple times per hour.
Right now we have to reload multiple times per minute, which often leaves old processes with 1-2 connections open, and these eat up memory.
Thanks for getting back to me though, I appreciate it.
If you can live with the haproxy memory usage, maybe with a lower number of preconfigured servers per backend, that’s the only thing that comes to my mind right now.
Truth be told, I can see that your use case is something that the current haproxy architecture does not cover as well as we would like, and I think there is room for improvement.
Right now, at this specific moment, there are probably solutions more flexible than haproxy, like traefik or nginx unit (the latter if your backends are just applications).
Well, if you spawned enough servers in the HAProxy backend, why would you still need to reload? (what is the trigger?)
(taking into account that you have software which is able to set server IPs / ports / weights, etc. on HAProxy’s socket at runtime)
Another point: HAProxy cleans up almost everything in the “old” process. I mean that your old processes should not take 3 GB of RAM for only 1 or 2 connections…
Second point: the old process of haproxy is supposed to close each connection after the next HTTP response. So over HTTP, the old process should not stay around forever (unless you have websockets, long polling, or raw TCP).
To my knowledge, backends and SSL certificates cannot be managed via the runtime API.
Doing only a couple of reloads per day is better than reloading with every server change.
I’m going to do more tests to see what the memory usage of the old process is.
Since we have plenty of websockets, the old processes stick around for some time.
I started haproxy and opened a websocket connection after each reload.
With 3 websocket connections haproxy keeps 4 worker processes running - 3 old, 1 new.
Each worker process consumes roughly 1.4GB of memory with just one connection open.
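(For reference, I checked the per-process usage with something like

    ps -o pid,ppid,rss,etime,args -C haproxy

which lists each haproxy process with its resident memory and how long it has been running, so the old workers are easy to spot.)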
There’s something odd with your configuration (though I don’t know what). I reported some time ago starting 1 million servers with 2.7 GB of RAM, which would be around 2.7 kB per server. In your case you’re reporting 20 times more memory. I don’t really know what can make the difference at this point. It would help if you could share your configuration.
Also another point, but if you’re using 3-5 servers per backend, I hardly see the benefit of configuring 100, except for purposely using more RAM. Your description makes me think you’re not very sure what you want to do and you’re checking if you can cover for the worst possible case just to be sure. At the very least we should be certain that you’re not needlessly wasting resources.
# list of all backends
backend bk_1
    server server1 127.0.0.1:1024
    server server2 127.0.0.1:1025
    server-template server 3-100 127.0.0.1:8080 check disabled

backend bk_2
    server server1 127.0.0.1:1026
    server-template server 2-100 127.0.0.1:8080 check disabled

backend bk_3
...
The use case is rolling restarts of apps. Each app runs on n servers and requires another n servers for a rolling restart (we need to do a rolling restart of all servers at once). In total, 2*n servers are needed per backend for such a restart.
At first I didn’t expect haproxy to consume so much memory per server (coming from an nginx background). Our maximum number of servers is 50 per backend, so I created 100 servers by default.
Regardless of the number of servers - the memory consumption per server is higher than in nginx, and since we need to do a config reload a couple of times per hour, the memory consumption would skyrocket as long as there are open connections in the old processes.
OK, I reproduce the same here: it’s the check which adds the check buffers to each server. We once planned to use dynamic buffer allocations for checks; I’ll check how it is in 1.9, maybe we already did it there. With the checks I get 400MB for 10k servers, without the checks it’s only 38MB.
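That difference (roughly 36kB per server) is consistent with a couple of 16kB buffers - the default tune.bufsize / tune.chksize - being preallocated per server for the checks, plus some per-server overhead.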
So I’ve found what we had done for this, it was only in one of the experimental branches with the connection layers rework and checks were not yet ported. I looked a bit, technically it’s not very hard, just a little bit tricky. Depending on how things go for 1.9 that could be something we integrate there. I also noticed that something is using much more RAM per server in 1.9 than 1.8, we’ll have to spot what and ensure it’s not permanent.
Now if you or someone you know is interested in working on this for 1.9, just send a mail on the mailing list so that we can discuss what needs to be done.
@willy Hi. Thanks for sharing your expertise. Could I ask you something?
I am trying to use HAProxy with 200 servers. (Currently these servers sit behind a hardware load balancer and I am trying to migrate them to HAProxy with DNS round robin.)
I will run HAProxy on 40 machines (2x Xeon Silver 4210 (2.2GHz/10core), 128GB RAM, 25G NIC)
and add the 200 servers to the backend section of each HAProxy config.
To avoid unexpected failures when using HAProxy, I want to estimate how many resources this HAProxy configuration will consume.
From reading the discussion above, memory is not an issue because each server takes only a few kB.
Is there any other issue, such as CPU usage? And how can I check it without testing directly?
I want to test it directly, but I can’t stop these 200 servers that are in service.
Well, your sizing will essentially depend on the traffic. The number of concurrent connections will directly impact the memory usage (count roughly between 17 and 35 kB per active client connection), and the CPU usage will depend on the data rate and the number of TLS handshakes per second.
What I can say already is that you must absolutely not use the two physical CPUs in your machine for the same process. Either you start haproxy on one of them and pin the network cards to the other one (if you plan to saturate your 25G NICs for example), or, if you’re expecting massive TLS handshake rates (e.g. you’re serving ads), you should start one process per physical CPU. In that case, if you have two NICs, arrange them on the PCIe slots so that each one is physically connected to a different CPU, and run each instance on its own CPU and NIC; this will work almost like two machines. The number of servers is mostly irrelevant here. Also please note that the comment above is 4 years old and that since then the check buffer has been changed to a dynamic allocation, so servers no longer waste a buffer when the checks are not running.
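If it helps, a minimal sketch of the single-socket case in the global section (the core numbers are just an example for the first 10-core CPU - check your actual topology with lscpu):

    global
        nbthread 10
        # pin the 10 threads of the first process to cores 0-9 (first physical CPU)
        cpu-map auto:1/1-10 0-9

and then steer the NIC’s interrupts to the other socket (or to the same one, depending on which of the two layouts above you choose), e.g. via /proc/irq/<n>/smp_affinity_list or the NIC vendor’s IRQ affinity script.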
I really appreciate your advice, because I have been wandering around Google trying to understand how to assess the effect of the number of servers in a backend. Thank you so much.
Let me summarize your advice:
Rather than the number of servers, it is the number of concurrent connections (memory usage), the traffic rate, and the number of TLS handshakes per second (CPU usage) that will consume most of the computing resources, CPU and RAM.
Servers do not waste a buffer anymore when the checks are not running.
And the NIC and HAProxy should run on the same CPU.
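As a rough sanity check with these numbers: 200 servers at a few kB each is well under 1 MB, while, say, 100,000 concurrent client connections at 17-35 kB each would already be roughly 1.7-3.5 GB, so the connections clearly dominate.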
Could I ask a few more questions?
As I’m a newbie in the software and networking area, I don’t know how to go about estimating the consumed resources (such as 2.7kB of memory per server, or 17 to 35 kB of memory per active client connection). How can I estimate these things by myself? For example, to estimate the effect of the number of servers in a HAProxy backend, we can’t buy a larger number of servers just for testing. I tried to estimate it by reading the HAProxy source code, but that doesn’t seem to help.
You said the data rate and TLS handshakes burden the CPU. Does that come from establishing connections towards the backends and forwarding the data over those connections?
To make the NIC and HAProxy run on the same CPU, do I need to set something in haproxy.cfg or some Linux kernel config? (I checked your other post, Architectural limitation for nbproc? - #6 by willy. It mentioned numactl.)
I’m sorry if these questions are too basic, and thank you for your time.
Stop focusing so much on haproxy and start focusing on your load and current numbers. How much traffic are you currently moving? How many concurrent connections do you have, and how many new sessions per second are you turning up, with and without SSL termination?