Load Balancing Speed Issues

We are trying to see why the speed of load balancing is slower than connecting to the server itself.

Version 2.4.17 and any other version we tested

We have a simple go http server setup on 3 different VMs each with a URL request of /ping and a reply of text with pong. We also with a url to insert an entry to MariaDB to test speed for inserts.

wrk -c 200 -t 2 -d 5s http://10.x.x.x/ping

Hitting /ping on each server individually gives us ~290,000 requests per second.

wrk -c 200 -t 2 -d 5s http://10.x.x.x/test

Hitting /test to insert to the DB gives us ~50,000 requests per second.

When hitting the HAProxy IP address we only get at most ~70,000 rps to /ping with 3 servers on the backend and only about 25,000 rps to /text

Is there any configuration guidance for AlmaLinux 9.1 with HaProxy 2.4.17 and/or tuning the OS?

We have setup various tuning we have found online however the CPUs never max out and we cannot get any increase via HAProxy.

HAProxy server is 20gig memory and 20 cpus.

The VMs are on the same machine and ping to each is <.5ms from the HAProxy server

We would assume that with about 300k per back end, we should see around 900k with HAProxy

Same with 55k per DB test server to give us ~150k rps

Any help would be appreciated.

EDIT:

Changing mode from HTTP to TCP jumped the RPS to 120k on /ping but still no where near what 1 server can do on its own.