HAProxy causing greatly reduced tcp performance across system

Hey guys, I’ve run into an interesting problem as I’ve started scaling up my smallish video streaming cdn. It seems that haproxy is causing my tcp stack to run into a bottleneck, resulting in incredibly poor single threaded tcp performance across the entire linux stack, not just haproxy. Because I don’t wish to spam iperf3 tests or logs, I’m going to upload these to pastebin and link them. Let’s start with what my sysctls look like:

Some highlights here: max open files set way up, using bbr congestion control, somaxconn cranked up, larger tcp send/transmit buffers, and increased tw bucket size. This sysctl is huge, but basically this is the result of me desperately turning nobs and dials. I experience this same behavior under stock sysctls. I have additionally tried increasing txqueuelen and ring buffers on the ethernet card, tried turning off some of the hardware offloading that was happening, pretty much anything I could think of.

Here is what an iperf3 test SHOULD look like on this machine, this is from Chicago to LA

The retries are normal for bbr, not worried about those

Here is what an iperf3 test looks like while haproxy is running, pushing ~600mbit of traffic, same 10gig connection, yes I have rerun this test many many many times to get similar results

Now here’s what’s interesting, look what happens when you increase parallel streams. I’ve only included the summary page, but this is also between Chicago and LA. The bandwidth is there, my single tcp threads are just not speeding up and I don’t know why

Here is my haproxy -vv output:

And here is part of my haproxy config:

Please note that I have about 1000 of these such entries. To speed up initial load and subsequent reloads, I’m caching the dns with a local bind9 server so that it doesn’t take forever to start and resolve. Additionally, I’m only running about 2000 connections and 1600 sockets, as it’s a video streaming cdn, there aren’t a ton of connections, mostly just raw throughput for high bitrate videos

Some additional steps I have tried are completely disabling healthchecks, dramatically increasing conntrack size, disabling it completely, and considered flying out to chicago and hitting the server with a hammer

Any help would be greatly appreciated, I’m a little bit at my whit’s end here and have no idea what my next steps should be