Hello!
I have a custom piece of code that does HTTP proxying for me. It has the following logic: every server processes at most one request at a time. All servers are equal. When all of the servers are busy, incoming requests are queued up. The first server to finish its work receives the next request from the queue. When X requests are already queued up, respond immediately with a 503 error. The idea behind limiting the queue size is to predict whether the current pod will be able to process an incoming request in a reasonable amount of time.
The task is to have a single queue for incoming service requests, since it works much better than individual per-server queues. Request sizes range from 1 kB to 10 MB. Request processing usually takes from 2–3 ms to 1 s. The protocol is gRPC over HTTP/2 over cleartext TCP/IP.
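To make the intended queueing discipline unambiguous, here is a minimal sketch of the logic described above (illustrative model only, not my actual code; all names are made up):

```python
from collections import deque

class Dispatcher:
    """One shared queue in front of N single-request servers.

    Every server handles at most one request at a time; when all are
    busy, requests wait in a single shared queue; when the queue
    already holds max_queue requests, new ones are rejected with 503.
    """

    def __init__(self, n_servers, max_queue):
        self.idle = deque(range(n_servers))  # servers ready for work
        self.queue = deque()                 # the single shared queue
        self.max_queue = max_queue

    def submit(self, request):
        """Dispatch immediately, queue, or reject.

        Returns (server, request) if dispatched right away, None if
        the request was queued, or 503 if the queue is already full.
        """
        if self.idle:
            return (self.idle.popleft(), request)
        if len(self.queue) >= self.max_queue:
            return 503
        self.queue.append(request)
        return None

    def finish(self, server):
        """A server finished: it takes the next queued request,
        or goes idle if the queue is empty."""
        if self.queue:
            return (server, self.queue.popleft())
        self.idle.append(server)
        return None
```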
I’m trying to replicate the custom code’s behavior with HAProxy, but I could not find an option to limit the maximum queue size. Is there a way to do that?
Right now I’m using the following config. It limits the maximum queue size via an intermediate backend, but I’m not happy with the additional hop I had to introduce.
global
    nbthread 5
    tune.bufsize 524288
    tune.h2.initial-window-size 524288

defaults
    mode http
    log global
    log /tmp/unified-agent.sock len 65535 format raw daemon debug
    log-format '{"t":"%t","HM":"%HM","HU":"%{json(utf8s)}HU","HV":"%HV","ST":%ST,"B":%B,"H":"%{json(utf8s)}H","Ta":%Ta,"Tc":%Tc,"Td":%Td,"Th":%Th,"Ti":%Ti,"Tq":%Tq,"TR":%TR,"Tr":%Tr,"Ts":%Ts,"Tt":%Tt,"Tu":%Tu,"Tw":%Tw,"U":%U,"ac":%ac,"b":"%b","bc":%bc,"bi":"%bi","bp":"%bp","bq":%bq,"ci":"%ci","cp":%cp,"f":"%f","fc":%fc,"fi":"%fi","fp":%fp,"ft":"%ft","lc":%lc,"ms":"%ms","pid":%pid,"rc":%rc,"rt":%rt,"s":"%s","sc":%sc,"si":"%si","sp":"%sp","sq":%sq,"tr":"%tr","ts":"%ts"}'
    option redispatch
    timeout connect 3ms
    timeout http-request 10s
    timeout http-keep-alive 1h
    timeout tunnel 1h
    timeout queue 10s
    timeout client 10s
    timeout client-fin 10s
    timeout server 10s
    timeout server-fin 10s
    errorfile 503 fast-error.http # Respond with 200 OK since gRPC does not allow other codes

frontend renderer_frontend
    bind :::86 proto h2
    default_backend renderer_queue_backend
    maxconn 10000

backend renderer_queue_backend
    http-reuse always
    retries 0
    timeout queue 1us # 0 means default, in which case the timeout connect value would be used
    server queue_limit_single localhost:9001 proto h2 maxconn 10 # No more than 5 requests in progress plus 5 queued requests

frontend renderer_queue_frontend
    bind :::9001 proto h2
    default_backend renderer_workers
    maxconn 10000

backend renderer_workers
    balance leastconn
    http-reuse always
    retries 5
    server worker_renderer_0 localhost:2002 proto h2 maxconn 1 check inter 1s fall 1 rise 1 observe layer4 error-limit 1
    server worker_renderer_1 localhost:2004 proto h2 maxconn 1 check inter 1s fall 1 rise 1 observe layer4 error-limit 1
    server worker_renderer_2 localhost:2006 proto h2 maxconn 1 check inter 1s fall 1 rise 1 observe layer4 error-limit 1
    server worker_renderer_3 localhost:2008 proto h2 maxconn 1 check inter 1s fall 1 rise 1 observe layer4 error-limit 1
    server worker_renderer_4 localhost:2010 proto h2 maxconn 1 check inter 1s fall 1 rise 1 observe layer4 error-limit 1
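One direction I have considered but not verified: rejecting at the frontend based on the backend's current queue length, using the queue() sample fetch, which would avoid the intermediate hop entirely. I am not sure this accounting exactly matches the maxconn-based queue, or how it behaves under concurrent request arrival. An untested sketch (the threshold 5 mirrors the 5-queued intent above):

    frontend renderer_frontend
        bind :::86 proto h2
        # Reject up front when the shared queue for the workers backend
        # already holds 5 requests; deny_status 503 should pick up the
        # fast-error.http errorfile from defaults.
        http-request deny deny_status 503 if { queue(renderer_workers) ge 5 }
        default_backend renderer_workers
        maxconn 10000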
I’m using HAProxy 3.0.4, and I can change it to any version I need.
$ haproxy -v
HAProxy version 3.0.4-1ppa1~jammy 2024/09/03 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2029.
Known bugs: http://www.haproxy.org/bugs/bugs-3.0.4.html
Running on: Linux 5.15.160-9.2