Weird state with peers not reloading

lucid_thayne · March 2, 2024, 12:34am

I have a cluster of load balancers, with a peers section that looks like:

peers lbs
  peer 10.0.0.1:29000
  peer 10.0.0.2:29000
  peer 10.0.0.3:29000

and in one of the backends a stick table is defined like:

stick-table type string len 36 size 10k expire 6m peers lbs srvkey addr

Usually this works fine. But I recently ran into an issue where the stick table didn’t seem to be respected across the peers. While troubleshooting, I ran the show peers command on the stats socket CLI, and it returned an empty string. When I tried show peers lbs, it told me that the peer group didn’t exist. I will henceforth refer to this as the “bad state”.
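For anyone who wants to watch for this, here is a rough sketch of checking for the bad state from a script. It sends one command to the stats socket and treats an empty show peers reply as the symptom described above. The helper names and the socket path are hypothetical; use whatever path your stats socket line in the global section actually configures.

```python
import socket


def cli_command(socket_path, command):
    """Send one command to an HAProxy stats socket and return the reply.

    In non-interactive mode HAProxy answers a single command and then
    closes the connection, so we read until EOF.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")


def looks_like_bad_state(show_peers_output):
    """True when 'show peers' came back empty, as in the bad state above."""
    return show_peers_output.strip() == ""


if __name__ == "__main__":
    # Assumed socket path, not taken from the original post.
    reply = cli_command("/var/run/haproxy.sock", "show peers")
    print("bad state" if looks_like_bad_state(reply) else "peers are listed")
```

This only detects the symptom; it says nothing about the cause.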

I tried doing a reload, both by sending SIGUSR2 to the master process (in master-worker mode) and by sending the reload command to the master over the master CLI, but that didn’t change anything. However, if I completely stopped and then restarted the haproxy process (master and worker), it got back into a healthy state, with the peers listed and the stick tables working. Once in that state, if I make changes to the peers section and reload, those changes apply.

I’ve tried to reproduce this with a more minimal configuration, but so far have been unsuccessful. In fact, even with my full configuration I can’t get it into this state again. But it has happened to me on multiple servers, and I’m worried it might happen again.

Does anyone have any ideas about what might have caused this, or how to prevent it from happening again?

lucid_thayne · March 12, 2024, 7:09pm

I don’t entirely understand what was happening here, but I did discover that the script that starts haproxy was initially passing the wrong IP (which we also use as the peer name) to the -L option. It was using the IP of the host that the VM image was created from. The script was later updated with the correct IP, but there was a race condition where that didn’t necessarily happen before haproxy was started.

So I think this might happen if you use the -L option with a peer name that doesn’t match any peer in the peers section.
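One way to guard against that race is to have the start script refuse to launch haproxy unless the name it is about to pass to -L is actually declared in the peers section. Below is a minimal sketch of such a check, not a full config parser. It assumes peers are declared as peer <name> <ip>:<port> lines with the IP doubling as the name, as described above. The function names, the keyword list, and the example config are all made up for illustration.

```python
# Keywords that start a new top-level section. This list is an assumption
# and may be incomplete for your HAProxy version.
SECTION_KEYWORDS = {
    "global", "defaults", "frontend", "backend", "listen", "peers",
    "resolvers", "userlist", "mailers", "cache", "program", "ring",
    "http-errors",
}


def peer_names(config_text, section):
    """Return the peer names declared in 'peers <section>'."""
    names = []
    in_section = False
    for raw in config_text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop comments, tokenize
        if not words:
            continue
        if words[0] in SECTION_KEYWORDS:
            # Entering a new section; remember whether it is the one we want.
            in_section = words[0] == "peers" and words[1:2] == [section]
            continue
        if in_section and words[0] == "peer" and len(words) >= 3:
            names.append(words[1])
    return names


def local_peer_is_declared(config_text, section, local_name):
    """True if the name intended for -L appears in the peers section."""
    return local_name in peer_names(config_text, section)


# Hypothetical config, modelled on the one at the top of this thread.
EXAMPLE_CONFIG = """\
peers lbs
  peer 10.0.0.1 10.0.0.1:29000
  peer 10.0.0.2 10.0.0.2:29000
  peer 10.0.0.3 10.0.0.3:29000

backend app
  stick-table type string len 36 size 10k expire 6m peers lbs srvkey addr
"""

if __name__ == "__main__":
    for name in ("10.0.0.2", "10.0.0.9"):
        ok = local_peer_is_declared(EXAMPLE_CONFIG, "lbs", name)
        print(name, "ok" if ok else "not in peers section, do not start")
```

A start script could run this against the real config file and exit non-zero when the check fails, so haproxy never starts with a stale IP baked into the image.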
