Limit concurrent requests per session

Hello, haproxy users!

Is it possible to limit the number of currently running HTTP requests per client (as opposed to the request rate), with “client” being represented by a session cookie?

For example, if I have 10 backend servers with maxconn 30 each, one client behind a slow line can issue 300 concurrent requests for something big, and exhaust all the backend connections.

I am fine with a client issuing requests as fast as possible, provided that it issues the next request only after the previous one has finished. Moreover, I don’t want the requests of a malfunctioning client to be dropped, only queued somewhere.

I tried the following approach:

frontend myserver
        ...
        capture cookie session= len 15 # for logging
        stick-table type string len 8 size 100k store conn_cur expire 10m
        http-request track-sc0 cookie(session)
        default_backend trash-be

        use_backend prime-be if { cookie("session") -m found } { src_conn_cur lt 5 }

backend trash-be
        stick-table type string len 6 size 16k expire 10m
        stick on cookie(session)

        server worker01 10.0.0.101:443 maxconn 2
        ...
        server worker10 10.0.0.110:443 maxconn 2

backend prime-be # same workers with higher maxconn:
        server worker01 10.0.0.101:443 maxconn 30
        ...
        server worker10 10.0.0.110:443 maxconn 30

I expect one session to have at most 4 concurrent requests routed to prime-be, with the rest (should there be any) sent to trash-be, where they would stick to the same backend server (at most two additional requests) and be queued otherwise.

The above apparently does not work: today I saw one client issuing many parallel download requests, each lasting 900+ seconds, all of them forwarded to prime-be for some reason and exhausting all 300 connections.
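To see which counter value each request saw when the routing decision was made, the values could be captured into the log. The following is only an untested sketch: it assumes the frontend uses the standard HTTP log format (where captured values appear in braces), and the capture length of 4 is an arbitrary choice.

frontend myserver
        ...
        http-request track-sc0 cookie(session)
        # Placed after track-sc0 so that the sc0 entry already exists.
        # Value of the entry tracked by sc0 (keyed by the session cookie):
        http-request capture sc0_conn_cur len 4
        # Value of the entry looked up by the client's source address:
        http-request capture src_conn_cur len 4

Comparing the two logged numbers for the long-running downloads should show whether the ACL was looking at the counter it was meant to look at.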

One problem with my configuration is that it counts connections, not requests. So if the backends speak HTTP/1.1 only (as mine do) and the malfunctioning client issues many requests inside a single HTTP/2 connection, it still fills all the slots on prime-be, because on the frontend side it is a single connection.

However, I have checked the logs, and I see this also with HTTP/1.1 requests from one IP address and different source ports, so it is not only a problem of HTTP/2 requests inside a single connection.
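One detail of the frontend may be relevant here. The track-sc0 rule stores conn_cur under the cookie value, but src_conn_cur looks up the client's source address in the table, so the ACL may never find the tracked entry and the "lt 5" condition would then presumably always hold. In addition, a string key with len 8 is truncated to 8 characters, so sessions sharing a prefix would share one entry. A possible variant, as an untested sketch: the key length of 15 is an assumption taken from the capture length, and whether conn_cur then reflects individual HTTP/2 streams is not something this sketch settles.

frontend myserver
        ...
        capture cookie session= len 15 # for logging
        # Assumption: session IDs fit into 15 characters.
        stick-table type string len 15 size 100k store conn_cur expire 10m
        http-request track-sc0 cookie(session)
        # sc0_conn_cur reads the entry tracked by sc0 (the cookie key),
        # not an entry keyed by the source address.
        use_backend prime-be if { cookie("session") -m found } { sc0_conn_cur lt 5 }
        default_backend trash-be

The same length consideration would apply to the "len 6" stick-table in trash-be.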

Is it possible to track running requests instead of src_conn_cur? Is there another way to limit in-flight requests per session?

Thanks,

-Yenya