Our HAProxy instance was under heavy load (32 threads and CPU usage was 3000+ for most of the time) and we suspected that it could be due to our clients not using TLS session resumption. After fixing the client side and setting the TLS session lifetime (tune.ssl.lifetime) to 1 day and increasing the cache size to 240 MB (20K clients * 200 bytes per entry = 4 MB << 240 MB), the CPU load reduced dramatically. However, we still see periodic (multiple times a day) CPU spikes and a high “%Tq” in the logs for some of the requests. We suspect that TLS session resumption isn’t happening and that’s why the CPU is spiking and hence the high “%Tq”.
However, we couldn’t say for certain that a TLS renegotiation or a new handshake happened for those requests since I don’t see a config option to log that. Also, we don’t know why a renegotiation happened, if it did, before the lifetime (1 day) expired.
What would be the best way to debug this scenario? Any advice or help is highly appreciated.
HAProxy version: 1.8.20
OpenSSL version: 1.0.2k
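An added example, not part of the original post, showing the two checks I would reach for here. HAProxy 1.8 can log per connection whether the TLS session was resumed (the built-in sample fetch `ssl_fc_is_resumed`, usable in `log-format` as `%[ssl_fc_is_resumed]`), and the stats socket `show info` output carries the cache counters `SslCacheLookups`, `SslCacheMisses`, `SslFrontendSessionReuse_pct` and the per-second `SslFrontendKeyRate` (full handshakes) next to `SslRate` (all TLS connections). The script below documents the assumed `haproxy.cfg` additions in its docstring, then parses constructed log lines in that format to count full versus resumed handshakes per minute and compare their `%Tq`, and parses a constructed `show info` dump. The frontend/backend names, file paths and all sample data are assumptions made for the example; the log-format string is one I chose, not a default.

```python
"""
Assumed haproxy.cfg additions for this example (HAProxy 1.8 syntax). The
field names are real, the frontend/backend names and paths are invented:

    global
        tune.ssl.cachesize 1200000   # number of cache entries, not bytes
        tune.ssl.lifetime  86400     # seconds a cached session stays valid
    frontend fe_https
        bind :443 ssl crt /etc/haproxy/site.pem
        log-format "%ci:%cp [%t] %ft %b/%s %Tq/%Tw/%Tc/%Tr/%Tt %ST %B \
%sslv/%sslc resumed=%[ssl_fc_is_resumed] %r"

Then collect counters with:  echo "show info" | socat stdio /var/run/haproxy.sock
"""
import re
import statistics
from collections import Counter

# Matches the assumed log-format above; only the fields the analysis needs are named.
LOG_RE = re.compile(
    r"^(?P<client>\S+) \[(?P<ts>[^\]]+)\] (?P<fe>\S+) (?P<be>\S+) "
    r"(?P<Tq>-?\d+)/(?P<Tw>-?\d+)/(?P<Tc>-?\d+)/(?P<Tr>-?\d+)/(?P<Tt>-?\d+) "
    r"(?P<status>\S+) (?P<bytes>\d+) (?P<sslv>\S+)/(?P<sslc>\S+) "
    r"resumed=(?P<resumed>\S+)"
)


def parse_line(line):
    """Return the fields of one log line, or None if it is not in the assumed format."""
    m = LOG_RE.match(line)
    if not m:
        return None
    d = m.groupdict()
    return {
        "ts": d["ts"],
        "minute": d["ts"][:17],          # "21/Jun/2019:10:15" from the %t timestamp
        "Tq": int(d["Tq"]),              # request time in ms, includes the TLS handshake in 1.8
        "resumed": d["resumed"] in ("1", "true"),
        "sslv": d["sslv"],
    }


def summarize(lines):
    """Count full vs resumed handshakes, full handshakes per minute, and median Tq per class."""
    parsed = [parse_line(l) for l in lines]
    recs = [r for r in parsed if r is not None]
    full = [r for r in recs if not r["resumed"]]
    resumed = [r for r in recs if r["resumed"]]
    return {
        "total": len(recs),
        "skipped": len(parsed) - len(recs),
        "full": len(full),
        "resumed": len(resumed),
        "full_per_minute": Counter(r["minute"] for r in full),
        "median_tq_full": statistics.median(r["Tq"] for r in full) if full else None,
        "median_tq_resumed": statistics.median(r["Tq"] for r in resumed) if resumed else None,
    }


def cache_stats(show_info):
    """Pull the SSL session cache counters out of a 'show info' dump."""
    kv = dict(l.split(": ", 1) for l in show_info.splitlines() if ": " in l)
    lookups = int(kv.get("SslCacheLookups", 0))
    misses = int(kv.get("SslCacheMisses", 0))
    return {
        "lookups": lookups,
        "misses": misses,
        "miss_pct": (100.0 * misses / lookups) if lookups else 0.0,
        "reuse_pct": int(kv.get("SslFrontendSessionReuse_pct", 0)),
        "key_rate": int(kv.get("SslFrontendKeyRate", 0)),
        "ssl_rate": int(kv.get("SslRate", 0)),
    }


# Constructed sample log lines in the assumed format (the last one is a default
# httplog line, included to show that unrecognised lines are skipped, not counted).
SAMPLE_LOG = """\
10.0.0.1:51234 [21/Jun/2019:10:15:01.100] fe_https be_app/app1 1250/0/1/5/1256 200 512 TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 resumed=0 "GET /api HTTP/1.1"
10.0.0.2:51235 [21/Jun/2019:10:15:02.200] fe_https be_app/app1 12/0/1/5/18 200 512 TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 resumed=1 "GET /api HTTP/1.1"
10.0.0.3:51236 [21/Jun/2019:10:15:03.300] fe_https be_app/app1 980/0/1/5/986 200 512 TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 resumed=0 "GET /api HTTP/1.1"
10.0.0.4:51237 [21/Jun/2019:10:15:04.400] fe_https be_app/app1 1410/0/1/5/1416 200 512 TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 resumed=0 "GET /api HTTP/1.1"
10.0.0.5:51238 [21/Jun/2019:10:15:05.500] fe_https be_app/app1 9/0/1/5/15 200 512 TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 resumed=1 "GET /api HTTP/1.1"
10.0.0.6:51239 [21/Jun/2019:10:16:06.600] fe_https be_app/app1 11/0/1/5/17 200 512 TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 resumed=1 "GET /api HTTP/1.1"
10.0.0.7:51240 [21/Jun/2019:10:16:07.700] fe_https be_app/app1 15/0/1/5/21 200 512 - - ---- 1/1/0/0/0 0/0 "GET /api HTTP/1.1"
"""

# Constructed excerpt of 'show info' output; the counter names are HAProxy's, the values invented.
SAMPLE_SHOW_INFO = """\
Name: HAProxy
Version: 1.8.20
SslRate: 310
SslFrontendKeyRate: 85
SslCacheLookups: 100000
SslCacheMisses: 27000
SslFrontendSessionReuse_pct: 73
"""

if __name__ == "__main__":
    s = summarize(SAMPLE_LOG.splitlines())
    print(f"parsed={s['total']} skipped={s['skipped']} full={s['full']} resumed={s['resumed']}")
    print(f"median Tq: full={s['median_tq_full']}ms resumed={s['median_tq_resumed']}ms")
    for minute, n in sorted(s["full_per_minute"].items()):
        print(f"{minute}: {n} full handshakes")
    c = cache_stats(SAMPLE_SHOW_INFO)
    print(f"cache misses {c['miss_pct']:.1f}%, reuse {c['reuse_pct']}%, "
          f"key rate {c['key_rate']}/s of {c['ssl_rate']}/s connections")
```

With this in place, a CPU spike lines up with a burst of `resumed=0` lines in the same minute, and a `%Tq` that is large only on those lines points at the full handshake rather than at the client being slow to send the request. Two caveats when reading the numbers: clients that present a TLS session ticket never touch the server-side cache, so `SslCacheLookups`/`SslCacheMisses` only describe the session-ID path while `SslFrontendSessionReuse_pct` and `ssl_fc_is_resumed` cover both; and both the session cache and the default randomly generated ticket keys belong to the running process, so every client falls back to a full handshake right after a reload, which is the first thing to correlate periodic spikes against.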