HAProxy community

CPU Affinity - nbproc>1 and nbthread>1

When running HAProxy on a server with more than one CPU socket, it is [recommended](https://www.haproxy.com/documentation/hapee/2-1r1/configuration/system-tuning/#pin-network-interrupts-to-cores) to run HAProxy with CPU affinity, i.e. nbproc 2 with each process pinned to the cores of its own NUMA node:

:~# lscpu 
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
CPU(s):              40
On-line CPU(s) list: 0-39
Thread(s) per core:  2
Core(s) per socket:  10
Socket(s):           2
NUMA node(s):        2
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39

When trying to run my HAProxy with the following config:

nbproc 2
nbthread 32
cpu-map auto:1/1-16 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31
cpu-map auto:2/17-32 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30

I get this error:

config : cannot enable multiple processes if multiple threads are configured. Please use either nbproc or nbthread but not both.

According to the source code, it is not possible to use nbproc > 1 together with nbthread > 1:

    if (global.nbproc > 1 && global.nbthread > 1) {
        ha_alert("config : cannot enable multiple processes if multiple threads are configured. Please use either nbproc or nbthread but not both.\n");
        err_code |= ERR_ALERT | ERR_FATAL;
        goto out;
    }

The HAProxy documentation for nbthread says the same thing: This is exclusive with “nbproc”.

I want to optimise HAProxy on a dual-socket server by running it as 2 processes, each bound to the cores of its own NUMA node so that it benefits from that CPU's L3 cache. It seems this is no longer possible in newer HAProxy versions.

To get better performance, I am forced to use only one of the two CPUs:

nbproc 1
nbthread 16
cpu-map auto:1/1-16 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31

How can I optimise HAProxy on a dual-socket server and bind each process to the cores of its own NUMA node?

You already accepted the disadvantages of running nbproc > 1, so there is no point in running nbthread as well.

Just use nbproc 40 with nbthread 1.
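A minimal sketch of what that could look like in the global section, assuming an HAProxy version that still supports nbproc and the CPU numbering 0-39 from the lscpu output above. The cpu-map line is my assumption of how one would pin the processes; it was not part of the original suggestion.

```
global
    nbproc 40
    nbthread 1
    # With "auto:", process N is bound to the N-th CPU of the set:
    # process 1 -> CPU 0, process 2 -> CPU 1, ..., process 40 -> CPU 39.
    cpu-map auto:1-40 0-39
```

With the numbering shown by lscpu (even CPUs on NUMA node0, odd CPUs on node1), odd-numbered processes would end up on node0 and even-numbered processes on node1.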

Thank you for your answer.
Using that many processes with nbproc is not recommended, as stated in the HAProxy documentation:

is strongly encouraged
to migrate away from “nbproc” to “nbthread”.

If you meant using nbproc 1 and nbthread 40 without CPU affinity: it does indeed utilize all CPU cores, but it performs worse than running on a single CPU with 16 cores…

Yes, but you are specifically asking to run with nbproc.

nbproc is not recommended because nbthread has advantages over it. But there is no difference between using nbproc 2 and nbproc 40. The recommendation would be not to use nbproc at all, but you are specifically insisting on nbproc because of NUMA, so it doesn’t make sense not to go all the way.