This is a weird one and a long shot to ask about here, but I’m very much at the clutching-at-straws stage.
For a while now we’ve been seeing an issue where communication between our HAProxy node and the webserver node randomly drops for a while, typically around half an hour, and then comes back as if nothing was wrong.
More details on the situation:
- All of this is running on GCP
- HAProxy is running in Docker on its own Ubuntu 18.04 box
- The webserver is running natively on a different Ubuntu 18.04 box
I’ve already gone through a lot of diagnostics with the GCP folks and they’re saying it’s an instance-level issue, but I can’t see how. They specifically say that while the issue is in effect the webserver node “never replies to any of the icmp requests that it receives” from HAProxy.
So, as I say, this is a long shot, but has anyone seen anything like this? It really feels to me like some kind of limit is being reached somewhere, resulting in a block being applied, but I’ve not found much at all so far.
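
In case it helps anyone reproduce this, here is a minimal sketch of a probe that could be left running on the HAProxy box to timestamp exactly when the drops start and end. It assumes Python 3 is available there, and it checks TCP connectivity rather than ICMP (raw ICMP needs root), so it is not the same test the GCP folks ran. The function names, the address and the port are all placeholders I made up for illustration, not our real values.

```python
import socket
import time
from datetime import datetime, timezone


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    An outage that is still ongoing at the last sample gets end=None.
    """
    windows = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                      # first failed probe of an outage
        elif ok and start is not None:
            windows.append((start, ts))     # first successful probe ends it
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def monitor(host, port, interval=5.0):
    """Probe forever, printing a UTC-timestamped line on every state change."""
    was_ok = True
    while True:
        ok = tcp_reachable(host, port)
        if ok != was_ok:
            now = datetime.now(timezone.utc).isoformat()
            state = "recovered" if ok else "DROPPED"
            print(f"{now} {state} {host}:{port}", flush=True)
            was_ok = ok
        time.sleep(interval)


# Hypothetical usage (placeholder address and port):
# monitor("10.0.0.2", 80)
```

Having exact start and end times would at least make it possible to line the drops up against logs on the webserver node and see whether anything there coincides with them.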