Connection imbalance with multi-process haproxy

Hello,
I have haproxy running on an Intel® Xeon® CPU E5-2407 v2 @ 2.40GHz, a single CPU with 4 cores.
The machine has two network interfaces, eth0 and eth1, bonded as bond0 in active/passive mode. These are the interrupt stats:

# grep eth /proc/interrupts
  81:  212387530          0          0          0  IR-PCI-MSI-edge      eth0-tx-0
  82:  853666386          0          0          0  IR-PCI-MSI-edge      eth0-rx-1
  83:  144560404          0          0          0  IR-PCI-MSI-edge      eth0-rx-2
  84:  144618581          0          0          0  IR-PCI-MSI-edge      eth0-rx-3
  85:  157761343          0          0          0  IR-PCI-MSI-edge      eth0-rx-4
  86:     193216          0          0          0  IR-PCI-MSI-edge      eth1-tx-0
  87:  707629860          0          0          0  IR-PCI-MSI-edge      eth1-rx-1
  88:      85622          0          0          0  IR-PCI-MSI-edge      eth1-rx-2
  89:       8121          0          0          0  IR-PCI-MSI-edge      eth1-rx-3
  90:   12673724          0          0          0  IR-PCI-MSI-edge      eth1-rx-4

As you can see, all IRQs land on CPU 0; there’s no irqbalance running and I haven’t done any manual IRQ pinning, it was like that by default. So I chose to put 3 haproxy processes on CPUs 1, 2 and 3 like this:

nbproc 3       # three worker processes
cpu-map 1 1    # pin process 1 to CPU core 1
cpu-map 2 2    # pin process 2 to CPU core 2
cpu-map 3 3    # pin process 3 to CPU core 3

I’ve checked that the haproxy processes are actually bound to CPUs 1, 2 and 3 by reading /proc/$pid/task/*/status.
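
For reference, this is roughly the check (just a sketch, using one of the PIDs from my box; taskset -cp would show the same thing):

# grep Cpus_allowed_list /proc/23183/task/*/status
# taskset -cp 23183

Cpus_allowed_list comes back as 1, 2 and 3 for the three processes respectively.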

Problem

The number of connections, and hence the CPU usage, is not balanced among the three haproxy processes:

# ls /proc/23183/fd|wc -l
2381
# ls /proc/23184/fd|wc -l
636
# ls /proc/23185/fd|wc -l
626
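
In case it helps, here’s a one-liner to collect the same counts for all processes at once (assuming pidof returns the three workers):

# for pid in $(pidof haproxy); do echo "$pid: $(ls /proc/$pid/fd | wc -l)"; done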

What could I possibly be doing wrong? What could be the problem? I’ve also tried with 4 processes pinned to CPUs 0, 1, 2, 3 with a similar result: one process gets most of the connections compared to the others, and it’s always the one with the lowest PID (the first child forked, I guess).

By default all processes share a single listening socket: the kernel wakes them all up and they more or less randomly pick up the connections. Since kernel 3.9, the kernel can instead distribute incoming connections across individual per-process listening sockets (SO_REUSEPORT), each with its own queue. But for this you need to add the “process” statement on your “bind” lines, which I’m pretty sure you don’t have :slight_smile:

Thanks a lot. I did it this way and it seems to work:

frontend www
    bind :80 process 1    # one listening socket per process
    bind :80 process 2
    bind :80 process 3
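
In case it’s useful to others, one way to sanity-check this (assuming a reasonably recent iproute2; -p needs root to resolve process names):

# ss -ltnp 'sport = :80'

With one bind line per process there should be three LISTEN sockets on :80, one owned by each haproxy process.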

Is that correct? Also, do you think it’s a valid setup in general to have the IRQs pinned to one core and the three haproxy processes on the other cores? To me the load seems quite evenly distributed, at least for now.
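
By the way, I haven’t pinned the IRQs explicitly yet. If I wanted to make the current layout permanent, I guess something like this would do it (untested sketch; the IRQ numbers come from the /proc/interrupts output above, and the mask 1 means CPU 0 only):

# for irq in $(awk -F: '/eth/ {print $1}' /proc/interrupts); do echo 1 > /proc/irq/$irq/smp_affinity; done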