I’d appreciate it if anyone could give me a couple of suggestions for an issue I have with SSL.
I know it sounds like a certificate issue, but it happens only when there is a big spike of new connections.
I am running HAProxy 1.5.14 on Azure with SSL termination.
HAProxy works perfectly well when load rises gradually, but everything goes bad under an instant load.
In a normal situation qmax goes up to 3000 per thread, and each CPU core is loaded no higher than 75%.
But if I restart HAProxy during the daily load, CPU usage can reach 100% and HAProxy is unable to handle more than 700-800 requests per thread.
When it reaches that limit, I see the rate of new requests drop to 2-5.
The HAProxy log becomes mostly filled with
tls/1: SSL handshake failure errors.
If I add more HAProxy instances to the load-balancing pool, everything goes back to normal.
I don’t have issues with entropy:
I tried adding connection rate limits:
That had no effect. Everything stops at about 800 connections and then the whole log fills with SSL handshake failures.
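For reference, rate limits of this kind are set in the global section in HAProxy 1.5. The following is only a sketch: the values actually tried are not recorded here, so the numbers are assumptions taken from the commented-out lines in the config further down.

```haproxy
global
    # Assumed values, copied from the commented-out lines in the config below.
    maxconnrate 100   # max new connections accepted per second, per process
    maxsessrate 100   # max new sessions created per second, per process
    maxsslrate  100   # max new SSL sessions (handshakes) per second, per process
```

With nbproc 4, each of these limits applies to every process separately, so the effective total would be roughly four times the configured value.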
I tried to play around with timeouts
timeout connect as:
Can anyone suggest anything here? I have no idea how to debug this.
Here is the config file I use:
global
    log /dev/log local0
    log /dev/log local1 notice
    stats socket /var/run/haproxy.p1.sock mode 660 group nagios level admin process 1
    stats socket /var/run/haproxy.p2.sock mode 600 level admin process 2
    stats socket /var/run/haproxy.p3.sock mode 600 level admin process 3
    stats socket /var/run/haproxy.p4.sock mode 600 level admin process 4
    stats timeout 2m # Wait up to 2 minutes for input
    chroot /var/lib/haproxy
    user haproxy
    group haproxy
    daemon
    nbproc 4
    cpu-map 1 0 # first arg is process number (1-based); second arg is cpu number (0-based)
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
    # SSL/TLS settings
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    tune.ssl.default-dh-param 2048
    tune.ssl.cachesize 10000000
    tune.ssl.lifetime 86400
    #tune.ssl.maxrecord 2859
    tune.ssl.maxrecord 1400 # TCP window size
    ssl-default-bind-options no-sslv3 no-tls-tickets
    ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!3DES:!MD5:!PSK
    maxconn 60000
    maxsslconn 60000
    # maxsessrate 100
    # maxsslrate 100
    # maxconnrate 100

defaults
    log global
    option dontlognull
    option dontlog-normal
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    bind-process all # not needed, but worthwhile being explicit

listen stats
    bind :2100 process 1
    bind :2101 process 2
    bind :2102 process 3
    bind :2103 process 4
    mode http
    log global
    stats enable
    stats realm stats_process
    stats uri /
    stats refresh 15s
    stats show-legends
    stats show-node
    stats auth xxxxxxxxxxxxx

frontend tls
    mode tcp
    maxconn 60000
    option tcplog
    bind *:443 ssl crt-list /etc/ssl/private/certificates.txt npn http/1.1
    default_backend frontend_service

backend frontend_service
    mode tcp
    option tcplog
    option httpchk GET /status
    fullconn 60000
    # 2 second 'inter'val between health checks. 2 failures to remove a server. 2 successes to add it back
    default-server inter 8s fall 2 rise 2
    timeout check 8s
    server SRV1 SRV1:80 maxconn 2000 check port 3000
    ....
    server SRV60 SRV1:80 maxconn 2000 check port 3000