Control socket becomes stale on config reload

We’re connecting to the control socket to retrieve stats for the configured proxies / frontends / servers (running on HAProxy 3.3.5).

The current implementation keeps the UNIX socket connection open (via “prompt i”) and periodically requests “show stat -1 5 -1”. Since the socket timeout is greater than the polling interval, this works fine.
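For reference, the polling loop looks roughly like the sketch below (the socket path, response framing, and field handling are simplified assumptions, not the actual code):

```python
# Sketch of the poller described above. Assumptions: the socket path, and
# that "prompt i" keeps the CLI session interactive so the connection can
# be reused across polls.
import socket
import time

SOCK_PATH = "/run/haproxy/haproxy.sock"  # assumed path
POLL_INTERVAL = 5  # seconds; must stay below the CLI socket timeout

def parse_show_stat(text: str) -> list[dict]:
    """Parse the CSV output of `show stat` into a list of row dicts.

    The first line is a header starting with "# "; every data line ends
    with a trailing comma, which we strip before splitting.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].lstrip("# ").rstrip(",").split(",")
    return [dict(zip(header, line.rstrip(",").split(","))) for line in lines[1:]]

def poll_forever() -> None:
    """Keep one CLI connection open and request stats periodically."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
        conn.connect(SOCK_PATH)
        conn.sendall(b"prompt i\n")  # keep the session open between commands
        while True:
            conn.sendall(b"show stat -1 5 -1\n")
            # Naive framing: assume one recv() returns the whole response.
            # A robust client would read until the interactive prompt.
            rows = parse_show_stat(conn.recv(1 << 20).decode())
            print({r["svname"]: r.get("scur") for r in rows if "svname" in r})
            time.sleep(POLL_INTERVAL)
```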

However, once the config is seamlessly reloaded via SIGHUP, the stats are no longer updated. I can see that this is because the old HAProxy workers stay alive to serve the existing socket connections while the new workers handle the traffic.

Example before reload:

# lsof | grep haproxy.sock
...
haproxy   3827521                   cloud-user   81u     unix 0x0000000000000000        0t0   19842593 /run/haproxy/haproxy.sock.1.tmp type=STREAM (LISTEN)
haproxy   3827521                   cloud-user  111u     unix 0x0000000000000000        0t0   19842828 /run/haproxy/haproxy.sock.1.tmp type=STREAM (CONNECTED)
...
haproxy   3827521 3827530 haproxy   cloud-user   81u     unix 0x0000000000000000        0t0   19842593 /run/haproxy/haproxy.sock.1.tmp type=STREAM (LISTEN)
haproxy   3827521 3827530 haproxy   cloud-user  111u     unix 0x0000000000000000        0t0   19842828 /run/haproxy/haproxy.sock.1.tmp type=STREAM (CONNECTED)

After reload:

haproxy   3827521                   cloud-user  111u     unix 0x0000000000000000        0t0   19842828 /run/haproxy/haproxy.sock.1.tmp type=STREAM (CONNECTED)
haproxy   3827521 3827524 haproxy   cloud-user  111u     unix 0x0000000000000000        0t0   19842828 /run/haproxy/haproxy.sock.1.tmp type=STREAM (CONNECTED)
...
haproxy   3828134                   cloud-user   96u     unix 0x0000000000000000        0t0   19842593 /run/haproxy/haproxy.sock.1.tmp type=STREAM (LISTEN)
haproxy   3828134 3828141 haproxy   cloud-user   11u     unix 0x0000000000000000        0t0   19842593 /run/haproxy/haproxy.sock.1.tmp type=STREAM (LISTEN)

Here you can see that the old process (PID 3827521) is still handling the established control socket connection, while the new worker spawned as PID 3828134 has taken over the listening socket.
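The PID swap visible above could in principle be detected programmatically; a rough sketch of such a check (assuming lsof is installed and the socket path matches your setup) might look like this:

```python
# Sketch: watch which PIDs hold the control socket open and trigger a
# reconnect of the stats poller when the set changes after a reload.
# Assumptions: lsof is available, and the socket path below is correct.
import subprocess

SOCK_PATH = "/run/haproxy/haproxy.sock"  # assumed path

def socket_owner_pids(path: str) -> frozenset[str]:
    """Return the set of PIDs that currently have `path` open (via `lsof -t`)."""
    out = subprocess.run(["lsof", "-t", path], capture_output=True, text=True)
    return frozenset(out.stdout.split())

def should_reconnect(prev: frozenset, cur: frozenset) -> bool:
    """Reconnect when the owning PIDs changed, i.e. a reload swapped workers."""
    return bool(prev) and cur != prev
```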

Has anyone experienced something similar?

Is this expected or a bug? Can you think of a workaround (short of dropping the “prompt i” keep-alive)?

Thanks, buzz