HAProxy stops forwarding syslog

I configured syslog forwarding to two servers in round-robin, and now I’m trying to work out why HAProxy stops using a server after the receiving service on it is restarted. The HAProxy logs don’t show any up/down activity for the mode log backend.

The firewall/appliance forwards its logs to HAProxy, which has two TCP servers configured; on each of them Elastic Agent parses the data and ships it to Elasticsearch. Missing data in Elastic != good…

HAProxy config:

global
    log /dev/log        local0
    log /dev/log        local1 notice

defaults
    log                     global
    mode                    http
    retries                 3
    timeout queue           1m
    timeout connect         5s
    timeout client          50s
    timeout server          50s
    timeout check           10s
    maxconn                 3000

log-forward syslog
    bind                10.2.2.5:1514
    log                 backend@panoslog user

backend panoslog
    mode        log
    balance     roundrobin

    default-server inter 1s fastinter 500ms fall 3 rise 2 on-error mark-down log-bufsize 8388608
    server log0 10.2.2.51:9001 log-proto octet-count check
    server log1 10.2.2.52:9001 log-proto octet-count check

This same HAProxy instance has stats enabled and load-balances MySQL without missing a beat.

Nothing appears in the logs about the backend servers. I added log-bufsize and on-error mark-down to see if they would help mitigate the missing data.

HAProxy version is 3.0.12-1ppa1~jammy

Restarting HAProxy fixes it until the receiving server is reloaded again. It appears that HAProxy keeps writing data to the ring buffer, but fails to forward it to the server and then drops it?! I can’t prove this, as my only evidence is that fewer events show up in Elastic when a server is affected.
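One way to get harder evidence than event counts in Elastic would be to point a server line at a small counting receiver for a test run. The following is only a sketch under assumptions: it takes the place of Elastic Agent on port 9001, it expects the octet-count framing selected by log-proto octet-count (a decimal length, a space, then that many bytes), and the names split_octet_counted, CountingHandler and serve are made up for this example.

```python
import socketserver


def split_octet_counted(buffer: bytes):
    """Split octet-counted syslog frames ("<len> <msg>") out of a byte buffer.

    Returns (complete_messages, leftover_bytes). Leftover bytes are an
    incomplete frame that should be prepended to the next chunk.
    """
    messages = []
    while True:
        space = buffer.find(b" ")
        if space == -1:
            break  # length prefix not complete yet
        length_field = buffer[:space]
        if not length_field.isdigit():
            raise ValueError(f"bad length prefix: {length_field!r}")
        end = space + 1 + int(length_field)
        if len(buffer) < end:
            break  # message body not complete yet
        messages.append(buffer[space + 1:end])
        buffer = buffer[end:]
    return messages, buffer


class CountingHandler(socketserver.BaseRequestHandler):
    """Counts the frames received on one TCP connection."""

    def handle(self):
        pending = b""
        count = 0
        while True:
            chunk = self.request.recv(65536)
            if not chunk:
                break  # peer closed the connection
            messages, pending = split_octet_counted(pending + chunk)
            count += len(messages)
        print(f"{self.client_address[0]} closed after {count} messages")


def serve(host: str = "0.0.0.0", port: int = 9001):
    """Run the counting receiver until interrupted (port is an assumption)."""
    with socketserver.ThreadingTCPServer((host, port), CountingHandler) as srv:
        srv.serve_forever()
```

Calling serve() on a test host, stopping and restarting it, and comparing the printed counts with what the firewall sent would show whether HAProxy reconnects and whether messages are lost in between.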

Using tcpdump I can see that the server answers HAProxy’s SYN packets with a TCP RST until HAProxy is restarted. If I add fastinter 500ms observe layer4 error-limit 2, the server is detected as down, but it quickly comes back up even though it is still sending TCP RSTs in response to HAProxy’s SYN packets.

Oct 08 18:17:17 ha.domain.com haproxy[285424]: [WARNING]  (285424) : Server panoslog/log1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 1 active and 0 backup servers left. 1 sessions active, 0 requeued, 0 remaining in queue.
Oct 08 18:17:17 ha.domain.com haproxy[285424]: Server panoslog/log1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 1 active and 0 backup servers left. 1 sessions active, 0 requeued, 0 remaining in queue.
Oct 08 18:17:17 ha.domain.com haproxy[285424]: Server panoslog/log1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 1 active and 0 backup servers left. 1 sessions active, 0 requeued, 0 remaining in queue.
Oct 08 18:17:20 ha.domain.com haproxy[285424]: [WARNING]  (285424) : Server panoslog/log1 is UP, reason: Layer4 check passed, check duration: 0ms. 2 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Oct 08 18:17:20 ha.domain.com haproxy[285424]: Server panoslog/log1 is UP, reason: Layer4 check passed, check duration: 0ms. 2 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Oct 08 18:17:20 ha.domain.com haproxy[285424]: Server panoslog/log1 is UP, reason: Layer4 check passed, check duration: 0ms. 2 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
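To correlate the health-check flapping above with what the receiving port is actually doing, a plain TCP connect probe run from the HAProxy host may help. This is a rough sketch, not what HAProxy does internally: the function name probe is made up, and it only reports whether a connect attempt is accepted, refused, or fails some other way.

```python
import socket


def probe(host: str, port: int, timeout: float = 1.0) -> str:
    """Attempt one TCP connect and report the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"  # peer answered the SYN with a RST
    except OSError as exc:  # timeouts, unreachable hosts, etc.
        return f"error: {exc}"
```

Running something like probe("10.2.2.52", 9001) in a loop with timestamps while the receiving service is restarted would show whether the port really accepts connections at the moment HAProxy logs the server as UP.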

If this is an issue with Elastic Agent, then why does restarting HAProxy fix it? If it’s HAProxy, then why isn’t the server marked down and kept down while it answers with a TCP RST?

This sounds like the TCP backend logging connection is opened once and never recreated, even when the connection fails completely. That is bad.

Please file a bug at:


Thank you, bug filed #3153
