Packets dropping randomly for some time between HAProxy docker and ubuntu instance?

Hey

This is a weird one and long shot to ask about here but I’m very much at the clutching at straws stage.

For a while now we’ve been seeing issues where randomly communication between our HAProxy node and the webserver node just drops for some time, typically around half an hour, and it just comes back like nothing was wrong.

More details on the situation:

  • All of this is running on GCP
  • HAProxy is running in Docker on it’s own Ubuntu 18.04 box
  • The webserver is running natively on a different Ubuntu 18.04 box

I’ve already gone through a lot of diagnostics with the GCP folks and they’re saying it’s an instance level issue but I can’t see how. They specifically say when the issue is in effect the webserver node “never replies to any of the icmp requests that it receives” from HAProxy.

So as I say this is a long shot but has anyone seen anything like this. It really felt to me like some kind of limit was being reached somewhere that resulted in a block being applied but I’ve not found much at all so far.

Cheers,
gy

Obviously posting this issue gave me luck. Been a problem now for over 2 months and today I finally found the problem - sshguard.

This fuzzy spam blocker tool appears to included by default in GCP’s images and in this case it was taking a disliking to HAProxy’s health check requests. I knew it would be something like this but couldn’t find the smoking gun, now I have. Hope this helps others!