Hi everyone
I work for a large-ish UK web hosting company, and we’re slowly introducing HTTP/2 now that it’s in haproxy 1.8.
We’re running haproxy 1.8.3 under systemd on CentOS 7.4 (3.10.0-693.el7.x86_64) KVM virtual machines, built with the following make line:
%{__make} CPU="generic" TARGET=linux2628 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_SYSTEMD=1 USE_PCRE=1 USE_PCRE_JIT=1
HTTP/2 is enabled on both the IPv4 and IPv6 “bind” lines of our frontend:
frontend httpexternal
    bind *:80
    bind *:443 ssl crt /etc/haproxy/stackcerts/www.stackssl.com.pem ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
    bind :::80
    bind :::443 ssl crt /etc/haproxy/stackcerts/www.stackssl.com.pem ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
We have observed that after a haproxy reload, the “finishing” PID never actually exits, even though lsof shows that it has no active TCP connections.
If we disable HTTP/2 completely (IPv4 and IPv6), the problem goes away and the finishing PIDs exit once they have no more connections.
After a lot of lsof’ing, we noticed that a child process we expected to have finished had this line in its lsof output:
haproxy 26670 haproxy 429u sock 0,7 0t0 31631114 protocol: TCPv6
but no actual IPv6 connections. We see the same behaviour for IPv4 “finishing” processes that never go away. The socket has a high-numbered file descriptor, which makes me think it came from a connection that was being used to serve HTTP requests.
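In case it helps anyone reproduce this, here is a rough Python sketch of the check we were doing by eye: it scans lsof output for entries whose TYPE is "sock" and whose NAME is only a "protocol: ..." string, like the line above. The function and class names are made up for illustration, and it assumes the default nine-column lsof layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) with no blank columns, so treat it as a starting point rather than a robust parser.

```python
from typing import Iterable, List, NamedTuple, Optional


class LsofEntry(NamedTuple):
    """Subset of the lsof columns we care about (hypothetical helper type)."""
    command: str
    pid: int
    fd: str
    type: str
    name: str


def parse_lsof_line(line: str) -> Optional[LsofEntry]:
    """Parse one line of default lsof output; return None for headers or short lines."""
    # Assumed columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME.
    # NAME may contain spaces, so split into at most 9 fields.
    parts = line.split(None, 8)
    if len(parts) < 9 or not parts[1].isdigit():
        return None
    command, pid, _user, fd, ftype, _device, _size_off, _node, name = parts
    return LsofEntry(command, int(pid), fd, ftype, name.strip())


def unattached_sockets(lines: Iterable[str]) -> List[LsofEntry]:
    """Return entries that look like the leftover socket described above."""
    found = []
    for line in lines:
        entry = parse_lsof_line(line)
        if entry and entry.type == "sock" and entry.name.startswith("protocol:"):
            found.append(entry)
    return found


if __name__ == "__main__":
    sample = [
        "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME",
        "haproxy 26670 haproxy 429u sock 0,7 0t0 31631114 protocol: TCPv6",
    ]
    for entry in unattached_sockets(sample):
        print(entry.pid, entry.fd, entry.name)
```

Feeding it the captured output of lsof for the old PID should list only the descriptors that look like the one above; ordinary listening or established sockets have TYPE IPv4 or IPv6 and are skipped.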
The same behaviour occurs whether we’re in single-threaded or multi-threaded mode (changed through config, not by recompiling).
We think http://git.haproxy.org/?p=haproxy-1.8.git;a=commit;h=4dbce456a223de3d06873828185ba789d5043def might be related in some way.
I hope this report helps.
Please let me know if you require any further information.