We have HAProxy installed on a server, where it is used primarily for TLS termination. After session establishment it sets certain headers in the HTTP request and forwards the request to the backend application. The backend application is a tftp server, so it can receive requests from a large number of clients.
What we observe is that with a large number of clients HAProxy gets quite busy and CPU usage climbs pretty high. Since both HAProxy and our backend application run on the same server, their combined CPU usage can get close to the limit.
What we'd like to know is whether there is a way to throttle the number of requests per second. All our searches so far seem to indicate that we could rate limit based on source IP or an HTTP header. However, since our client IPs will all be different in the real world, we won't be able to use that (the same address rarely recurs).
Could you please help? Is this possible?
Since you are seeing high CPU utilization under high load, the very first thing I would recommend is to make sure your CPU cores are not over- or under-utilized. When you run HAProxy on a machine with multiple cores, you need to tune it to use all of them by forking multiple HAProxy processes. You can do this with the nbproc parameter in the global section.
For example, if you have a system with 4 CPU cores, it is recommended to run one HAProxy process per core.
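A minimal sketch of what that tuning could look like for the 4-core case. The cpu-map lines, which pin each process to its own core, are my assumption and not something the original setup is known to use. Also note that newer HAProxy releases have removed nbproc in favor of threads (nbthread), so check which one applies to your version.

```haproxy
global
    # One HAProxy process per CPU core (4-core machine assumed)
    nbproc 4
    # Assumption: pin process N to core N-1 so the processes do not compete for one core
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
```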
Once you have tuned HAProxy for the number of cores, the next step is to throttle the rate of requests. There are two ways to do this:
1. Rate limiting requests per second using request rate counters in stick tables and ACLs.
2. Limiting the number of concurrent connections using the maxconn parameter.
Of these two options, I would recommend the second one. The reason is that once the number of concurrent connections reaches the maxconn limit, HAProxy queues further requests instead of rejecting them. With the first option, by contrast, new requests are rejected once the request rate limit is reached, which could lead to a bad user experience.
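For reference, the stick-table approach could be sketched roughly as follows. The frontend and backend names, the certificate path, the 10-second window and the threshold of 100 requests are all placeholders of mine, not values from the question. The table here is keyed on the source IP, which is exactly the case the question says does not fit well when every client address is different.

```haproxy
frontend fe_tls
    # Hypothetical bind line and certificate path
    bind :443 ssl crt /etc/haproxy/certs/site.pem
    # Track the HTTP request rate per source IP over a 10s sliding window
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    # Reject (rather than queue) requests from clients above the threshold
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
    default_backend be_app
```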
You can define maxconn limits at three locations:
maxconn in global section: to limit the number of concurrent connections allowed per haproxy process.
maxconn in frontend section: to limit the number of concurrent connections allowed per frontend.
server maxconn: to limit the number of concurrent connections allowed per server in a backend.
You can set appropriate values for these parameters depending on your request rate and desired throughput.
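As an illustration, the three maxconn settings might be combined like this. All names, addresses and numbers are placeholders I picked for the sketch and would need to be sized for your own load; the timeout queue value (how long a request may wait in the queue) is likewise an assumption.

```haproxy
global
    # Upper bound on concurrent connections per HAProxy process
    maxconn 4000

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # How long a request may wait in the queue for a free server slot
    timeout queue   10s

frontend fe_tls
    bind :443 ssl crt /etc/haproxy/certs/site.pem
    # Upper bound on concurrent connections accepted by this frontend
    maxconn 2000
    default_backend be_app

backend be_app
    # At most 200 concurrent requests are sent to the application;
    # anything beyond that waits in HAProxy's queue instead of hitting the app
    server app1 127.0.0.1:8080 maxconn 200
```

Because the backend application shares the machine with HAProxy, the server maxconn value is the setting that most directly caps how much work the application is handed at once.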
Refer to the post below for further details on the maxconn parameter: