HAProxy httpcheck strange behavior

I have an HAProxy server (version 1.8). Recently I configured an HTTP health check for MS Exchange 2016 (the configuration was taken from https://bidhankhatri.com.np/system/haproxy-configuration-for-windows-exchange-server-2016-and-2019/).

Now I am trying to update HAProxy to the LTS version on Debian 11 (the stable repository ships version 2.2.9).

After updating the configuration file, the validation check passes. But now I constantly get these log warnings:

Dec 12 16:16:00 lb-03 haproxy[100159]: [WARNING]  (100159) : Health check for server B-EXCHANGE-MAPI/mail-03-mapi failed, reason: Layer4 timeout, check duration: 2001ms, status: 2/3 UP.
Dec 12 16:16:02 lb-03 haproxy[100159]: [WARNING]  (100159) : Health check for server B-EXCHANGE-MAPI/mail-03-mapi succeeded, reason: Layer7 check passed, code: 200, check duration: 2ms, status: 3/3 UP.
Dec 12 16:16:05 lb-03 haproxy[100159]: [WARNING]  (100159) : Health check for server B-EXCHANGE-EWS/mail-03-ews failed, reason: Layer4 timeout, check duration: 2000ms, status: 2/3 UP.
Dec 12 16:16:07 lb-03 haproxy[100159]: [WARNING]  (100159) : Health check for server B-EXCHANGE-EWS/mail-03-ews succeeded, reason: Layer7 check passed, code: 200, check duration: 1ms, status: 3/3 UP.
Dec 12 16:18:42 lb-03 haproxy[100159]: [WARNING]  (100159) : Health check for server B-EXCHANGE-ECP/mail-05-ecp failed, reason: Layer4 timeout, check duration: 2001ms, status: 2/3 UP.
Dec 12 16:18:44 lb-03 haproxy[100159]: [WARNING]  (100159) : Health check for server B-EXCHANGE-ECP/mail-05-ecp succeeded, reason: Layer7 check passed, code: 200, check duration: 2ms, status: 3/3 UP.

I tried increasing the check interval – I get fewer messages, but they don't disappear.
Installing version 2.4 from backports changed nothing.
A traffic dump doesn't show any errors.

Backends that use the same HTTP check but have two servers in the backend do not get these errors.

All these backends point to the same two MS Exchange servers, and each of them has an HTTP check:

use_backend B-EXCHANGE-OWA if A-EXCHANGE-HOSTNAME owa
use_backend B-EXCHANGE-AUTODISCOVER if A-EXCHANGE-HOSTNAME autodiscover
use_backend B-EXCHANGE-MAPI if A-EXCHANGE-HOSTNAME mapi
use_backend B-EXCHANGE-ACTIVESYNC if A-EXCHANGE-HOSTNAME eas
use_backend B-EXCHANGE-EWS if A-EXCHANGE-HOSTNAME ews
use_backend B-EXCHANGE-ECP if A-EXCHANGE-HOSTNAME ecp
use_backend B-EXCHANGE-RPC if A-EXCHANGE-HOSTNAME rpc
use_backend B-EXCHANGE-OAB if A-EXCHANGE-HOSTNAME oab
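
For reference, each of these backends looks roughly like this – a hedged sketch based on the referenced blog post; the server name, IP address, and health check URI are illustrative, not my exact values:

```
backend B-EXCHANGE-MAPI
	mode http
	balance roundrobin
	# Exchange exposes a managed-availability probe per virtual
	# directory, e.g. /mapi/healthcheck.htm for MAPI
	option httpchk GET /mapi/healthcheck.htm
	http-check expect status 200
	server mail-03-mapi 192.0.2.3:443 check ssl verify none
```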

I have no idea what else can be done.

At a glance, I would try increasing the timeout allowed by healthchecks and see if that alleviates this error. Something like this:

backend be_ex2019_activesync
	mode http
	timeout check 5s

Source: HAProxy 2.2 Configuration Manual - timeout check

I have a default “timeout check 10s”

When I add a source IP and port range to the "server" option, the warning logs disappear:
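
The change was along these lines (a hedged sketch; the source address and port range are illustrative, not my real values):

```
backend B-EXCHANGE-MAPI
	mode http
	option httpchk GET /mapi/healthcheck.htm
	# Pin outgoing connections (including health checks) to a fixed
	# source address and an explicit ephemeral port range:
	server mail-03-mapi 192.0.2.3:443 check ssl verify none source 198.51.100.10:10240-65000
```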

And I can’t explain why.

Now I have the same problem

By default the health check timeout is configured by the inter option on the server line. That is why you see 2-second timeouts. There is also timeout check – please read its documentation, where the behaviour is explained, because it is not used as a connection timeout. I would recommend increasing inter for testing purposes.
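
For example (hedged, values illustrative) – raising inter on the server line also raises the effective connect timeout of the check, since the check's connect timeout is derived from inter:

```
backend B-EXCHANGE-MAPI
	timeout check 10s	# extra read timeout once the check has connected
	# inter raised from the 2s default for testing; this also lifts
	# the check's connect timeout, which is min("timeout connect", inter)
	server mail-03-mapi 192.0.2.3:443 check inter 15s
```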

I thought that "inter" is the time between requests. I tried increasing the interval to 15 seconds – the errors now appear much less often, once every few hours instead of ten times per hour.

I thought that “inter” is the time between requests.

It is, but it is also used as the connection timeout in some cases:

If set, HAProxy uses min("timeout connect", "inter") as a connect timeout
for check and "timeout check" as an additional read timeout.

However, it sounds pretty weird that your connection times out even with 15 seconds.
Is the server overloaded or something? It doesn't look healthy…
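
For what it's worth, that formula also explains the ~2001ms durations in the original logs: with the default inter of 2s and a typical timeout connect (5s assumed here, illustrative), the check's connect timeout is min(5s, 2s) = 2s, so a check that cannot connect gives up at almost exactly 2000ms:

```
defaults
	timeout connect 5s	# illustrative value
	timeout check 10s	# additional read timeout, not a connect timeout

backend B-EXCHANGE-MAPI
	# no explicit inter, so the default of 2s applies; the check's
	# connect timeout = min("timeout connect", inter) = min(5s, 2s) = 2s,
	# matching the "Layer4 timeout, check duration: 2001ms" log entries
	server mail-03-mapi 192.0.2.3:443 check
```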

Is the server overloaded or so?

It isn't. I have a production HAProxy (version 1.8) with the default timeout configuration and the same health check configuration, and there are no errors at all.

When I add a source IP and port range to the "server" option, the warning logs disappear:

I missed this comment. Is it still the case that source fixes your issue?
Could there be something wrong with the network, the firewall, or IPv4/IPv6?

Is it still the case that source fixes your issue?

Yes, it is. I have a source IP in the global configuration because I use keepalived, and adding the same source IP address in the backend section fixes the issue with the intermittent Layer4 health check failures.

Is it possible that something is wrong with the network, firewall, or IPv4/IPv6?

I haven't disabled IPv6, but I don't use IPv6 at all. The firewall and the network were the first things I checked, and I didn't find any issues.