HAProxy community

Source IP from haproxy > server wrong


#1

I have a small Haproxy server set up with 2 NICs. The OS is CentOS7 and I have configured both NICs on the same subnet per CentOS documentation.

The addresses are 192.168.0.1 and 192.168.0.2, both on 192.168.0.0/24.
192.168.0.1 is used for management and the web GUI; 192.168.0.2 is used for the LB traffic.

Traffic coming into the LB hits the 192.168.0.2 address, but the connections to the backend servers seem to egress from the 192.168.0.1 address. I’ve tried specifying “source IP” in the config, to no avail.

Version is 1.8.16.

If I telnet to the backend from the Linux shell on the LB, the routing seems to work correctly, so it looks like the LB application itself isn’t going out on the correct interface/IP.


#2

It’s a bad idea to configure both NICs in the same subnet, because the egress NIC and source IP are not deterministically chosen by the kernel in this case.

Your expectation that 192.168.0.2 is always used for egress traffic is something that your OS/kernel doesn’t know about.

Now specifying the source IP in haproxy should theoretically alleviate that. Can you provide the actual, complete haproxy configuration you used for this? Also provide the output of running haproxy through strace -tt -p<PID> while using the source IP option (and running traffic through it).

I strongly suggest you rethink your configuration though, and put your management IP into a different subnet.

The configuration is non-deterministic, so the source IP and interface the sockets end up using are basically random.
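
To see which source address the kernel picks on its own for a given destination, something like the following Python sketch can help. It assumes a Linux-like socket API; `kernel_chosen_source` is a hypothetical helper name, not part of haproxy. Connecting a UDP socket sends no packet; it only makes the kernel do its route and source-address lookup for an unbound socket.

```python
import socket


def kernel_chosen_source(dest_ip, dest_port=9):
    """Return the local IP the kernel selects for an unbound socket to dest_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # connect() on a UDP socket transmits nothing; it only fixes the
        # destination and lets the kernel choose the local address.
        s.connect((dest_ip, dest_port))
        return s.getsockname()[0]
```

On the LB from this thread, `kernel_chosen_source("192.168.0.3")` would show which of the two addresses the kernel prefers when the application does not bind to one explicitly.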


#3

Haproxy config:

global
    daemon
    stats socket /var/run/haproxy.sock mode 777 level admin
    maxconn 4096
    maxcompcpuusage 100
    maxcomprate 0
    nbproc 1
    ssl-server-verify required

defaults
    mode tcp
    option http-server-close
    option redispatch
    retries 3
    timeout connect 5000
    timeout server 50000
    timeout client 50000
    timeout check 50000
    timeout http-keep-alive 50000
    timeout http-request 50000

listen LBPROXY
    bind 192.168.0.2:999
    balance source
    maxconn 9999
    mode tcp
    source 192.168.0.2
    option http-server-close
    server server1 192.168.0.3:999 check fall 3 rise 5 inter 2000 weight 10

I agree, except I’ve created routing tables and rules that should’ve curbed this behavior. (https://access.redhat.com/solutions/30564)

I tried running the strace, but I don’t see any reference to the actual IPs or interfaces being used outbound. Is there a certain line or reference you’re looking for in that output?


#4

Ok, but you have to realize that this has nothing to do with haproxy. Haproxy creates a socket, and uses that socket. If you configure haproxy with a specific source IP, haproxy will tell the kernel to use that source IP. All the complexity of source IP selection is in the kernel.
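
As a rough illustration of what "telling the kernel to use that source IP" means at the socket level, here is a minimal Python sketch. It is not haproxy's code; `connect_from` is a hypothetical helper. The point is only the order of operations: bind the socket to the wanted local address (port 0), then connect.

```python
import socket


def connect_from(source_ip, dest, timeout=5.0):
    """Open a TCP connection to dest=(host, port) using an explicit source IP."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.settimeout(timeout)
        # Port 0 leaves the ephemeral port to the kernel; only the address is pinned.
        s.bind((source_ip, 0))
        s.connect(dest)
    except OSError:
        s.close()
        raise
    return s
```

With the addresses from this thread, that would be `connect_from("192.168.0.2", ("192.168.0.3", 999))`. If the bind succeeds, the connection uses 192.168.0.2 as its source address; which interface the packets leave on is still decided by the kernel's routing.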

If you configured different routing tables, please provide the full output of those.

The strace output I have with your config is the following:

18:32:35.577932 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 14
18:32:35.578074 fcntl(14, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
18:32:35.578188 setsockopt(14, SOL_TCP, TCP_NODELAY, [1], 4) = 0
18:32:35.578326 setsockopt(14, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0
18:32:35.578474 setsockopt(14, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
18:32:35.578603 bind(14, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.0.2")}, 16) = 0
18:32:35.578726 connect(14, {sa_family=AF_INET, sin_port=htons(999), sin_addr=inet_addr("192.168.0.3")}, 16) = -1 EINPROGRESS (Operation now in progress)
18:32:35.579033 sendto(14, "GET / HTTP/1.1\r\nHost: 192.168.0."..., 79, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 ECONNREFUSED (Connection refused)
18:32:35.579237 setsockopt(14, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
18:32:35.579414 close(14)               = 0

You can see the bind call where the socket is bound to the source IP 192.168.0.2 (and the syscall returns success).
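
If the strace output is long, a small script can do the looking for you. The following is a sketch in Python, assuming strace lines in the format shown above; `outgoing_sources` is a hypothetical helper and the regular expressions only cover IPv4 sockets. For each outgoing connect() it reports whether a bind() to a source address happened on that file descriptor first.

```python
import re

SOCKET_RE = re.compile(r'\bsocket\(.*\)\s*=\s*(\d+)')
BIND_RE = re.compile(r'\bbind\((\d+),.*inet_addr\("([^"]+)"\)')
CONNECT_RE = re.compile(
    r'\bconnect\((\d+),.*sin_port=htons\((\d+)\).*inet_addr\("([^"]+)"\)'
)


def outgoing_sources(strace_lines):
    """Return (fd, "dest_ip:port", bound_source_or_None) per outgoing socket."""
    bound = {}      # fd -> source IP passed to bind()
    seen = set()    # fds whose first connect() was already reported
    results = []
    for line in strace_lines:
        m = SOCKET_RE.search(line)
        if m:
            # A new socket() reuses the fd number, so forget old state.
            fd = int(m.group(1))
            bound.pop(fd, None)
            seen.discard(fd)
            continue
        m = BIND_RE.search(line)
        if m:
            bound[int(m.group(1))] = m.group(2)
            continue
        m = CONNECT_RE.search(line)
        if m:
            fd = int(m.group(1))
            if fd in seen:
                continue  # second connect() call on a non-blocking socket
            seen.add(fd)
            dest = f"{m.group(3)}:{m.group(2)}"
            results.append((fd, dest, bound.get(fd)))
    return results
```

A `None` in the third position means the process never bound that socket to a source IP, so the kernel chose it.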


#5

This is all I’m getting, no source IPs anywhere.

11:52:28.164028 epoll_wait(3, [], 200, 923) = 0
11:52:29.088581 epoll_wait(3, [], 200, 0) = 0
11:52:29.089008 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 8
11:52:29.089475 fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
11:52:29.090015 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
11:52:29.090370 setsockopt(8, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
11:52:29.090786 connect(8, {sa_family=AF_INET, sin_port=htons(999), sin_addr=inet_addr("192.168.0.3")}, 16) = -1 EINPROGRESS (Operation now in progress)
11:52:29.091360 epoll_wait(3, [], 200, 0) = 0
11:52:29.091869 connect(8, {sa_family=AF_INET, sin_port=htons(999), sin_addr=inet_addr("192.168.0.3")}, 16) = 0
11:52:29.092317 recvfrom(8, NULL, 2147483647, MSG_TRUNC|MSG_DONTWAIT|MSG_NOSIGNAL, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
11:52:29.092718 setsockopt(8, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
11:52:29.093089 close(8)                = 0
11:52:29.093501 epoll_wait(3, [], 200, 0) = 0
11:52:29.093908 epoll_wait(3, [{EPOLLIN, {u32=5, u64=5}}], 200, 1000) = 1
11:52:29.876320 accept4(5, {sa_family=AF_INET, sin_port=htons(50449), sin_addr=inet_addr("192.168.0.161")}, [16], SOCK_NONBLOCK) = 8
11:52:29.876772 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
11:52:29.877128 accept4(5, 0x7fff9acd8b80, 0x7fff9acd8b78, SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
11:52:29.877596 recvfrom(8, 0xc19994, 15360, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
11:52:29.877986 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 9
11:52:29.878352 fcntl(9, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
11:52:29.878735 setsockopt(9, SOL_TCP, TCP_NODELAY, [1], 4) = 0
11:52:29.879107 connect(9, {sa_family=AF_INET, sin_port=htons(999), sin_addr=inet_addr("192.168.0.3")}, 16) = -1 EINPROGRESS (Operation now in progress)
11:52:29.879681 epoll_ctl(3, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLRDHUP, {u32=8, u64=8}}) = 0
11:52:29.880029 epoll_wait(3, [], 200, 0) = 0
11:52:29.880366 connect(9, {sa_family=AF_INET, sin_port=htons(999), sin_addr=inet_addr("192.168.0.3")}, 16) = 0
11:52:29.880777 epoll_wait(3, [], 200, 0) = 0
11:52:29.881179 recvfrom(9, "220 server.domain.com Mic"..., 16384, 0, NULL, NULL) = 100
11:52:29.881585 recvfrom(9, 0xc199f8, 16284, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
11:52:29.881946 sendto(8, "220 server.domain.com Mic"..., 100, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 100
11:52:29.882354 epoll_ctl(3, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLRDHUP, {u32=9, u64=9}}) = 0
11:52:29.882776 epoll_wait(3, [], 200, 0) = 0
11:52:29.883132 epoll_wait(3, [], 200, 1000) = 0
11:52:30.884648 epoll_wait(3, [], 200, 205) = 0
11:52:31.090235 epoll_wait(3, [], 200, 2) = 0
11:52:31.092606 epoll_wait(3, [], 200, 0) = 0
11:52:31.092948 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 10
11:52:31.093354 fcntl(10, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
11:52:31.093784 setsockopt(10, SOL_TCP, TCP_NODELAY, [1], 4) = 0
11:52:31.094176 setsockopt(10, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
11:52:31.094582 connect(10, {sa_family=AF_INET, sin_port=htons(999), sin_addr=inet_addr("192.168.0.3")}, 16) = -1 EINPROGRESS (Operation now in progress)
11:52:31.095057 epoll_wait(3, [], 200, 0) = 0
11:52:31.095402 connect(10, {sa_family=AF_INET, sin_port=htons(999), sin_addr=inet_addr("192.168.0.3")}, 16) = 0
11:52:31.095819 recvfrom(10, NULL, 2147483647, MSG_TRUNC|MSG_DONTWAIT|MSG_NOSIGNAL, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
11:52:31.096194 setsockopt(10, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
11:52:31.096602 close(10)               = 0
11:52:31.096854 epoll_wait(3, [], 200, 0) = 0
11:52:31.097276 epoll_wait(3, ^Cstrace: Process 6154 detached

# ip route show table eth0table
default via 192.168.0.0 dev eth0 
192.168.0.0/24 dev eth0 scope link src 192.168.0.1
# ip route show table eth1table
default via 192.168.0.0 dev eth1 
192.168.0.0/24 dev eth0 scope link src 192.168.0.2 
# ip rule show
0:  from all lookup local 
32762:  from all to 192.168.0.2 lookup eth1table 
32763:  from 192.168.0.2 lookup eth1table 
32764:  from all to 192.168.0.1 lookup eth0table 
32765:  from 192.168.0.1 lookup eth0table 
32766:  from all lookup main 
32767:  from all lookup default 

Also, I realize this is probably my own lack of understanding of the routing on the Linux box, so I really appreciate your help, @lukastribus


#6

The process with PID 6154 from the strace is not running the configuration you provided. Not only does it not set the source IP, it is also setting options that you didn’t configure (you did not configure option tcp-smart-connect, but haproxy is setting TCP_QUICKACK on the outgoing socket).

I suggest you triple-check which configuration you are running and make sure you don’t have any old haproxy processes running in the background. It’s best to shut down haproxy, confirm that nothing is responding on port 999 anymore, and only then, once you are sure nothing is listening on port 999, start haproxy again.
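
The "confirm that nothing is responding" step can be checked with a quick TCP probe. A minimal Python sketch, where `port_is_answering` is a hypothetical helper; note that it returns False both when the connection is refused and when it times out.

```python
import socket


def port_is_answering(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable or timed out: nothing usable is listening.
        return False
```

After stopping haproxy, `port_is_answering("192.168.0.2", 999)` should return False. If it still returns True, some process (possibly an old haproxy instance) is still holding the listener.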

I don’t see how the ip route configuration is supposed to solve the problem. Those rules pick the routing table, and with it the outgoing interface, based on the local source IP, so they require that IP to be known already; but selecting that IP is the actual problem. Let’s focus on haproxy first.
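
To make that concrete, here is a simplified Python model of how the ip rules from the previous post would be matched. It is an illustration only, not the kernel's implementation: `RULES` is transcribed from the rule listing above, `tables_consulted` is a hypothetical helper, and an unbound socket is modelled as source 0.0.0.0. It returns the routing tables that would be consulted, in order, until one of them has a matching route.

```python
import ipaddress

# (priority, from-prefix or None for "all", to-prefix or None, table)
RULES = [
    (0, None, None, "local"),
    (32762, None, "192.168.0.2/32", "eth1table"),
    (32763, "192.168.0.2/32", None, "eth1table"),
    (32764, None, "192.168.0.1/32", "eth0table"),
    (32765, "192.168.0.1/32", None, "eth0table"),
    (32766, None, None, "main"),
    (32767, None, None, "default"),
]


def tables_consulted(src, dst, rules=RULES):
    """List the tables whose rule matches (src, dst), in priority order.

    src=None models a socket that was not bound to a local address.
    """
    src_ip = ipaddress.ip_address(src or "0.0.0.0")
    dst_ip = ipaddress.ip_address(dst)
    tables = []
    for _prio, frm, to, table in sorted(rules, key=lambda r: r[0]):
        if frm is not None and src_ip not in ipaddress.ip_network(frm):
            continue
        if to is not None and dst_ip not in ipaddress.ip_network(to):
            continue
        tables.append(table)
    return tables
```

In this model, `tables_consulted(None, "192.168.0.3")` never reaches eth0table or eth1table, while `tables_consulted("192.168.0.2", "192.168.0.3")` does consult eth1table. The "from" rules only come into play once the socket already has its source IP.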


#7

I triple-checked the config and, as suspected, an old instance was running and causing the issues. I cleaned up and rebooted, and things are working as expected. Thanks for your time. I’m updating the thread in the hope that it helps another rookie who stumbles upon it down the road.