After some days or weeks of normal operation, HAProxy starts to ignore 30-80% of new connections:
- No kernel-level errors (even local connections are affected)
- No problems with any counters (the “clear counters” CLI command does nothing) or with fd, socket and other limits
- Nothing interesting in haproxy stats
- It is rare
- It happens randomly, but has an obvious connection to significant load and low (but not overloaded) CPU.
- Appears only on multi-process configs (with extremely frequent reloads and huge configs)
- Tested up to 1.8.15 (Debian 8, kernel 4.17+)
- Easily fixed by a restart
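To put a number on the share of ignored connections, a small probe can be run on the affected host. This is only a minimal Python sketch: `probe_once` and `failure_rate` are hypothetical helper names, and treating a completed TLS handshake as "answered" is an assumption made for illustration, not something HAProxy itself provides.

```python
import socket
import ssl


def probe_once(host, port, timeout=5.0, use_tls=True):
    """Return True if the server answers within `timeout`, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if use_tls:
                # Assumption: certificate checks are disabled because only
                # responsiveness is being measured, not identity.
                ctx = ssl.create_default_context()
                ctx.check_hostname = False
                ctx.verify_mode = ssl.CERT_NONE
                with ctx.wrap_socket(sock, server_hostname=host):
                    return True  # handshake completed, so the server replied
            # Plain TCP mode: send one byte and wait for any reply byte.
            sock.sendall(b"\n")
            return bool(sock.recv(1))
    except OSError:
        # Covers timeouts, refused connections and TLS errors.
        return False


def failure_rate(host, port, attempts=20, **kwargs):
    """Fraction of sequential probes that got no answer."""
    failures = sum(not probe_once(host, port, **kwargs) for _ in range(attempts))
    return failures / attempts
```

For example, `failure_rate("127.0.0.1", 443, attempts=50)` returns the fraction of attempts that were not answered; a healthy instance should give a value close to 0.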
TCP dump for normal request:
19:26:30.457523 IP localhost.59792 > localhost.https: Flags [S], seq 765380153, win 43690, options [mss 65495,sackOK,TS val 448339641 ecr 0,nop,wscale 10], length 0
19:26:30.457549 IP localhost.https > localhost.59792: Flags [S.], seq 2803598966, ack 765380154, win 43690, options [mss 65495,sackOK,TS val 448339641 ecr 448339641,nop,wscale 10], length 0
19:26:30.457596 IP localhost.59792 > localhost.https: Flags [.], ack 1, win 43, options [nop,nop,TS val 448339641 ecr 448339641], length 0
19:26:30.457827 IP localhost.59792 > localhost.https: Flags [P.], seq 1:189, ack 1, win 43, options [nop,nop,TS val 448339641 ecr 448339641], length 188
19:26:30.458425 IP localhost.https > localhost.59792: Flags [P.], seq 1:2749, ack 189, win 44, options [nop,nop,TS val 448339642 ecr 448339641], length 2748
19:26:30.458457 IP localhost.59792 > localhost.https: Flags [.], ack 2749, win 171, options [nop,nop,TS val 448339642 ecr 448339642], length 0
19:26:30.460073 IP localhost.59792 > localhost.https: Flags [P.], seq 189:282, ack 2749, win 171, options [nop,nop,TS val 448339643 ecr 448339642], length 93
19:26:30.460602 IP localhost.https > localhost.59792: Flags [P.], seq 2749:2800, ack 282, win 44, options [nop,nop,TS val 448339644 ecr 448339643], length 51
19:26:30.460720 IP localhost.59792 > localhost.https: Flags [P.], seq 282:388, ack 2800, win 171, options [nop,nop,TS val 448339644 ecr 448339644], length 106
19:26:30.460958 IP localhost.https > localhost.59792: Flags [P.], seq 2800:3017, ack 388, win 44, options [nop,nop,TS val 448339644 ecr 448339644], length 217
19:26:30.460994 IP localhost.https > localhost.59792: Flags [P.], seq 3017:3048, ack 388, win 44, options [nop,nop,TS val 448339644 ecr 448339644], length 31
19:26:30.461016 IP localhost.https > localhost.59792: Flags [F.], seq 3048, ack 388, win 44, options [nop,nop,TS val 448339644 ecr 448339644], length 0
19:26:30.461022 IP localhost.59792 > localhost.https: Flags [.], ack 3048, win 176, options [nop,nop,TS val 448339644 ecr 448339644], length 0
19:26:30.461065 IP localhost.59792 > localhost.https: Flags [P.], seq 388:419, ack 3049, win 176, options [nop,nop,TS val 448339644 ecr 448339644], length 31
19:26:30.461137 IP localhost.59792 > localhost.https: Flags [F.], seq 419, ack 3049, win 176, options [nop,nop,TS val 448339644 ecr 448339644], length 0
19:26:30.461142 IP localhost.https > localhost.59792: Flags [R.], seq 3049, ack 419, win 44, options [nop,nop,TS val 448339644 ecr 448339644], length 0
19:26:30.461161 IP localhost.https > localhost.59792: Flags [R], seq 2803602015, win 0, length 0
TCP dump for failed request:
19:27:27.576816 IP localhost.65360 > localhost.https: Flags [S], seq 578543461, win 43690, options [mss 65495,sackOK,TS val 448396759 ecr 0,nop,wscale 10], length 0
19:27:27.576842 IP localhost.https > localhost.65360: Flags [S.], seq 3967261380, ack 578543462, win 43690, options [mss 65495,sackOK,TS val 448396759 ecr 448396759,nop,wscale 10], length 0
19:27:27.576867 IP localhost.65360 > localhost.https: Flags [.], ack 1, win 43, options [nop,nop,TS val 448396759 ecr 448396759], length 0
19:27:27.577082 IP localhost.65360 > localhost.https: Flags [P.], seq 1:189, ack 1, win 43, options [nop,nop,TS val 448396759 ecr 448396759], length 188
19:27:27.617581 IP localhost.https > localhost.65360: Flags [.], ack 189, win 44, options [nop,nop,TS val 448396799 ecr 448396759], length 0
…client-side timeout - 30 seconds…
19:27:57.606808 IP localhost.65360 > localhost.https: Flags [F.], seq 189, ack 1, win 43, options [nop,nop,TS val 448426787 ecr 448396799], length 0
19:27:57.649591 IP localhost.https > localhost.65360: Flags [.], ack 190, win 44, options [nop,nop,TS val 448426831 ecr 448426787], length 0
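
The difference between the two captures can be checked mechanically: in the failed one the server side sends only zero-length ACKs and never any payload, while the client has already sent 188 bytes. A minimal Python sketch, assuming tcpdump's default one-line text output as shown above; `payload_by_sender` and `is_stalled` are hypothetical helper names, not part of any tool:

```python
import re

# Matches lines such as:
# 19:27:27.577082 IP localhost.65360 > localhost.https: Flags [P.], ..., length 188
LINE_RE = re.compile(
    r"^(?P<ts>\S+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]+)\].*length (?P<len>\d+)$"
)


def payload_by_sender(lines):
    """Sum the TCP payload bytes sent by each endpoint in a capture."""
    totals = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip notes and anything that is not a packet line
        totals[m["src"]] = totals.get(m["src"], 0) + int(m["len"])
    return totals


def is_stalled(lines, server="localhost.https"):
    """True if a client sent data but the server never sent any payload."""
    totals = payload_by_sender(lines)
    client_bytes = sum(n for src, n in totals.items() if src != server)
    return client_bytes > 0 and totals.get(server, 0) == 0


failed = [
    "19:27:27.577082 IP localhost.65360 > localhost.https: Flags [P.], seq 1:189, ack 1, win 43, options [nop,nop,TS val 448396759 ecr 448396759], length 188",
    "19:27:27.617581 IP localhost.https > localhost.65360: Flags [.], ack 189, win 44, options [nop,nop,TS val 448396799 ecr 448396759], length 0",
]
print(is_stalled(failed))  # True
```

Feeding it the lines of the normal capture gives `False`, because there the server side sends payload back.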