HAProxy 2.2.4 SSL Handshake Failure

I’m getting a number of these per day, one burst every 5-10 minutes. I’ve been reluctant to change the SSL settings from the defaults, so as not to hurt our SSL Labs rating and other security metrics.

Compared to most, this system is not very busy, but it has lots of connections that last many hours rather than millions of single transactions. We used to run haproxy with SSL pass-through. We converted to SSL termination in/out over the weekend and are now getting some reports that people can’t access the site, but we haven’t gathered enough information to find any commonalities, platforms, or anything else to debug with.

I have these settings in my global config for SSL:

    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3 no-tlsv10 no-tls-tickets

    tune.ssl.maxrecord 1460
    tune.ssl.lifetime 600
    tune.ssl.cachesize 1000000
    tune.ssl.default-dh-param 2048

I do have HTTP/2 enabled.

All of the errors come up with a “/2” after the site name:

Oct 15 22:24:14 firehawk haproxy[5229]: 203.188.238.29:13882 [15/Oct/2020:22:24:14.443] www.example.com/2: SSL handshake failure
Oct 15 22:24:14 firehawk haproxy[5229]: 203.188.238.29:13882 [15/Oct/2020:22:24:14.443] www.example.com/2: SSL handshake failure
Oct 15 22:24:22 firehawk haproxy[5229]: 203.188.238.29:14945 [15/Oct/2020:22:24:22.001] www.example.com/2: SSL handshake failure
Oct 15 22:24:22 firehawk haproxy[5229]: 203.188.238.29:14945 [15/Oct/2020:22:24:22.001] www.example.com/2: SSL handshake failure
Oct 15 22:24:22 firehawk haproxy[5229]: 203.188.238.29:15073 [15/Oct/2020:22:24:22.794] www.example.com/2: SSL handshake failure
Oct 15 22:24:22 firehawk haproxy[5229]: 203.188.238.29:15073 [15/Oct/2020:22:24:22.794] www.example.com/2: SSL handshake failure
Oct 15 22:24:34 firehawk haproxy[5227]: 203.188.238.29:17370 [15/Oct/2020:22:24:33.670] www.example.com/2: SSL handshake failure
Oct 15 22:24:34 firehawk haproxy[5227]: 203.188.238.29:17370 [15/Oct/2020:22:24:33.670] www.example.com/2: SSL handshake failure
Oct 15 22:24:34 firehawk haproxy[5229]: 203.188.238.29:17543 [15/Oct/2020:22:24:34.458] www.example.com/2: SSL handshake failure
Oct 15 22:24:34 firehawk haproxy[5229]: 203.188.238.29:17543 [15/Oct/2020:22:24:34.458] www.example.com/2: SSL handshake failure

Which leads me to believe this is an HTTP/2 issue, but I don’t see why they wouldn’t renegotiate as HTTP/1.1. We have OCSP stapling enabled, SSLLabs gives us an “A” – so all the usual SSL issues should be in good working order.

I don’t know how to turn on a log for cipher mismatch, but I am looking at the cipher used on successful connections. These connections are being shut so hard, I wish there was more info.
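As far as I know, the handshake-failure line in 2.2 carries no detail about why the handshake failed, but successful connections can log the negotiated protocol version next to the cipher. A minimal sketch, assuming an HTTP-mode `log-format`; the selection of fields here is illustrative and not the format actually in use:

```
defaults
    mode http
    # %sslv = negotiated TLS version, %sslc = negotiated cipher
    # (both are empty for connections that did not use SSL)
    log-format "%ci:%cp [%tr] %ft %b/%s %ST %B %{+Q}r %sslv %sslc"
```

Comparing the `%sslv` values seen on working connections against what `ssl-default-bind-options` allows gives a rough idea of which protocol versions real clients are using.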

Please, any advice on where to look or how to identify the kinds of clients having the issue would be greatly appreciated!

thanks in advance!

haproxy -vv
HA-Proxy version 2.2.4-1ppa1~bionic 2020/10/02 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2025.
Known bugs: http://www.haproxy.org/bugs/bugs-2.2.4.html
Running on: Linux 4.15.0-118-generic #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = gcc
CFLAGS = -O2 -g -O2 -fdebug-prefix-map=/build/haproxy-couRLx/haproxy-2.2.4=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-stringop-overflow -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 USE_ZLIB=1 USE_SYSTEMD=1

Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT +PCRE2 +PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS

Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=4).
Built with OpenSSL version : OpenSSL 1.1.1 11 Sep 2018
Running on OpenSSL version : OpenSSL 1.1.1 11 Sep 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with network namespace support.
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.31 2018-02-12
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 7.5.0
Built with the Prometheus exporter as a service

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
fcgi : mode=HTTP side=BE mux=FCGI
<default> : mode=HTTP side=FE|BE mux=H1
h2 : mode=HTTP side=FE|BE mux=H2
<default> : mode=TCP side=FE|BE mux=PASS

Available services :
prometheus-exporter

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace
[CACHE] cache
[FCGI] fcgi-app

I don’t think /2 has anything to do with HTTP/2. If the SSL handshake fails, there is no HTTP involved at all.
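The part after the slash in that log line identifies the listener, not a protocol version. When a `bind` line has no name, haproxy refers to it by its number within the frontend, so `/2` would normally be the second `bind` line. A minimal sketch of naming the listeners so the error log is easier to read; the frontend name, the names `http` and `https`, and the certificate path are placeholders:

```
frontend www.example.com
    # With names set, a failure should be logged as
    # "www.example.com/https: SSL handshake failure" instead of ".../2"
    bind *:80  name http
    bind *:443 name https ssl crt /path/to/cert.pem
```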

First of all, share the entire configuration.

You probably have to remove no-tlsv10, since disabling TLSv1.0 will probably cause a few issues with older clients.

Here is the whole config; I would be glad to hear about anything else that needs adjusting besides the tickets issue. We packet-sniffed one of the persistent SSL handshake failures and it looks like a bot: it was a properly formatted HTTP/1.1 request, but sent in plaintext rather than over SSL on port 443.

thanks in advance!

global
    maxconn         100000
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 777 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    nbproc 4
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3

    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private



    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets


    tune.ssl.maxrecord 1460
    tune.ssl.lifetime 600
    tune.ssl.cachesize 1000000
    tune.ssl.default-dh-param 2048



defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    option forwardfor
    option http-server-close
    option redispatch
    option log-separate-errors
     timeout client     1h
     timeout server     1h
     timeout connect 1000
    log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r %sslc"

    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

    stats enable
    stats hide-version
    stats scope .
    stats realm Haproxy\ Statistics
    stats uri /haproxy-stats?stats



frontend dashboard.example.com
    bind *:80
    bind *:443 ssl crt /etc/apache2/sites-available/example.com/ssl/ssl-certs.pem ssl-min-ver TLSv1.2    alpn h2,h2c,http/1.1
    maxconn 10000
    compression algo gzip
    compression type text/html text/plain text/javascript application/javascript application/xml text/css
    option forwardfor
    option http-keep-alive
    timeout http-keep-alive 60000
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }

    capture request header Referrer len 64
    capture request header Content-Length len 10
    capture request header User-Agent len 64
    http-request add-header  Strict-Transport-Security  max-age=15768000
    http-request redirect scheme https unless { ssl_fc }
    default_backend nodes

backend nodes
  mode http
  hash-type consistent

  option ssl-hello-chk




  balance roundrobin
  cookie BSERVERID insert indirect nocache
  server www3.example.com 10.0.0.195:443 ssl verify none alpn h2 cookie s3 maxconn 2000
  server www4.example.com 10.0.0.196:443 ssl verify none alpn h2 cookie s4 maxconn 2000




listen stats
        bind 127.0.0.1:8000

What ticket issue? I told you to not disable TLSv1.0, now you made it even worse by also disabling TLSv1.1 …
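To make that concrete, a minimal sketch of settings that accept TLSv1.0 and TLSv1.1 again, assuming the posted config is otherwise unchanged. Note that the `bind` line carries its own `ssl-min-ver TLSv1.2`, which would also need relaxing, and that the posted cipher list contains only GCM and CHACHA20 suites, which require TLSv1.2. The two CBC suites appended below are an assumption about what older clients would need, not a vetted list:

```
global
    # no-tlsv10 and no-tlsv11 removed; SSLv3 and session tickets stay disabled
    ssl-default-bind-options no-sslv3 no-tls-tickets
    # Posted suites first, then two CBC suites (assumed) usable by TLSv1.0/1.1 clients
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-RSA-AES256-SHA

frontend dashboard.example.com
    # ssl-min-ver lowered from TLSv1.2; h2c left out of the ALPN list here
    # (assumption: it denotes cleartext HTTP/2 and is not needed on a TLS listener)
    bind *:443 ssl crt /etc/apache2/sites-available/example.com/ssl/ssl-certs.pem ssl-min-ver TLSv1.0 alpn h2,http/1.1
```

Accepting the older protocol versions may lower the SSL Labs grade, so this is a trade-off between reach and rating rather than a pure fix.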

Yes, you will not reach zero errors on the Internet, that is for certain, because some handshakes will always fail.

You need to look at actual client issues, not the presence of handshake errors.

Certainly. If we had access to the remote clients that were having these issues, we would do that.

We believe the complaints we were getting were from ISP issues far away from us. The SSL handshake issues are most likely bots or some other kind of miscreant.

We had an issue with connections being held in CLOSE_WAIT for 5+ days… we’ve added client request timeouts to hopefully close those idle connections faster in the process.
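For reference, a minimal sketch of the client-side timeouts haproxy offers for this kind of cleanup. The `1h` and `60s` values mirror the posted config; the other two values are assumptions for illustration and would need tuning for long-lived connections:

```
defaults
    timeout client          1h    # max inactivity on the client side (posted value)
    timeout http-request    10s   # assumed: max time to receive a complete request
    timeout http-keep-alive 60s   # idle time allowed between requests (posted value)
    timeout client-fin      30s   # assumed: inactivity limit for half-closed client connections
```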

We are having to reset (hard restart) our haproxy software every 2-3 days; otherwise we start seeing slowdowns and pauses in performance. We are only running about 160 continuous connections (even with the laggards) and have allocated thousands… still tracking that issue down.

Thanks for the advice!

If you have a lot of sockets in CLOSE_WAIT state, that is an indication of a bug in haproxy; this should never happen.

Can you clarify whether the connections in CLOSE_WAIT are connections between clients and the frontend port, between haproxy and the backend server, or maybe just health check connections?

If that is the case, I would ask you to file a bug report at github:

Yes. We can confirm the connections were between haproxy and the remote frontend clients, not the backend servers.

We had set haproxy to automatically gracefully reload every 2-4 days… because we’d find it freezing unexpectedly. Counter to expectations, the processes never died. So we’d have processes from Oct 15, then from Oct 17, and Oct 19… and so on.

Now, we did NOT have a client timeout set at this point. So some lingering I guess could be expected. But we had Linux set to enforce keepalives to no effect.

In the process of analyzing this, we started using “ss -K” to kill first the established connections on the oldest processes… and then the CLOSE_WAIT ones… and expected the haproxy processes to terminate since they had no connections. But that didn’t occur either.
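For old processes that linger after a reload, haproxy has a `hard-stop-after` setting in the `global` section that caps how long a process may keep running once it has been told to stop softly. A minimal sketch; the 30 minute value is an assumption, and it would cut off any connection still open at that point, which matters with hours-long sessions:

```
global
    # Assumed value: old processes exit at most 30 minutes after a reload,
    # even if some connections are still open
    hard-stop-after 30m
```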

Eventually haproxy started hanging on new connections, and we did a hard restart with the client timeout.

Since that reload about 38 hours ago, we have zero connections stuck… but oddly enough we don’t even see the normal flow of connections in TIME_WAIT status. Everything on the server is either ESTABLISHED… or briefly in LAST_ACK and then disappears from netstat -n. I found one in TIME_WAIT between haproxy and a backend.

haproxy 31639 haproxy 25u IPv4 33468439 0t0 TCP haproxy.example.com:https->105.104.8.0:56003 (ESTABLISHED)
haproxy 31639 haproxy 26u IPv4 33465312 0t0 TCP haproxy.example.com:https->182-160-101-250.aamranetworks.com:44688 (ESTABLISHED)
haproxy 31640 haproxy 9u IPv4 33466117 0t0 TCP haproxy.example.com:https->156-155-164-178.ip.internet.co.za:60599 (ESTABLISHED)
haproxy 31770 haproxy 9u IPv4 36567343 0t0 TCP haproxy.example.com:https->49.33.167.53:51142 (ESTABLISHED)
haproxy 31771 haproxy 9u IPv4 34362254 0t0 TCP haproxy.example.com:https->47.9.216.123:37992 (ESTABLISHED)
haproxy 31771 haproxy 10u IPv4 35694875 0t0 TCP haproxy.example.com:https->40.84.132.122:61731 (ESTABLISHED)
haproxy 31771 haproxy 28u IPv4 36616880 0t0 TCP haproxy.example.com:https->1-169-155-103.dynamic-ip.hinet.net:58471 (ESTABLISHED)
haproxy 31771 haproxy 34u IPv4 36565858 0t0 TCP haproxy.example.com:https->49.33.167.53:51127 (ESTABLISHED)
haproxy 31771 haproxy 37u IPv4 36585952 0t0 TCP haproxy.example.com:https->49.33.167.53:51277 (ESTABLISHED)
haproxy 31772 haproxy 27u IPv4 33469196 0t0 TCP haproxy.example.com:https->182-160-101-250.aamranetworks.com:44744 (ESTABLISHED)
haproxy 31772 haproxy 39u IPv4 35768560 0t0 TCP haproxy.example.com:https->172.58.99.172:25855 (ESTABLISHED)
haproxy 31773 haproxy 14u IPv4 36292355 0t0 TCP haproxy.example.com:https->45.137.116.197:49983 (ESTABLISHED)
haproxy 13126 haproxy 16u IPv4 43549054 0t0 TCP haproxy.example.com:https->112.201.169.199.pldt.net:26238 (CLOSE_WAIT)
haproxy 13126 haproxy 25u IPv4 43555272 0t0 TCP haproxy.example.com:https->google-proxy-66-249-81-18.google.com:38050 (CLOSE_WAIT)
haproxy 13126 haproxy 53u IPv4 43549240 0t0 TCP haproxy.example.com:https->134.130.113.158:61415 (CLOSE_WAIT)
haproxy 13126 haproxy 57u IPv4 43549520 0t0 TCP haproxy.example.com:https->69-109-247-161.lightspeed.dybhfl.sbcglobal.net:53802 (CLOSE_WAIT)
haproxy 13126 haproxy 65u IPv4 43555424 0t0 TCP haproxy.example.com:https->google-proxy-66-249-93-45.google.com:38662 (CLOSE_WAIT)
haproxy 19175 haproxy 9u IPv4 37734157 0t0 TCP haproxy.example.com:https->151.66.159.224:49343 (CLOSE_WAIT)
haproxy 19175 haproxy 18u IPv4 38826592 0t0 TCP haproxy.example.com:https->pool-100-8-53-253.nwrknj.fios.verizon.net:56646 (CLOSE_WAIT)
haproxy 19175 haproxy 20u IPv4 38325640 0t0 TCP haproxy.example.com:https->37.231.218.139.sta.wbroadband.net.au:49382 (CLOSE_WAIT)
haproxy 19175 haproxy 39u IPv4 38740486 0t0 TCP haproxy.example.com:https->default-rdns.vocus.co.nz:61596 (CLOSE_WAIT)
haproxy 19175 haproxy 44u IPv4 38976930 0t0 TCP haproxy.example.com:https->ppp14-2-73-169.adl-apt-pir-bras31.tpg.internode.on.net:59431 (CLOSE_WAIT)
haproxy 19175 haproxy 45u IPv4 38976768 0t0 TCP haproxy.example.com:https->129.205.124.135:15988 (CLOSE_WAIT)
haproxy 19175 haproxy 51u IPv4 38976935 0t0 TCP haproxy.example.com:https->102-39-197-92.vox.co.za:51682 (CLOSE_WAIT)
haproxy 19175 haproxy 65u IPv4 38778778 0t0 TCP haproxy.example.com:https->200.198.60.112:21568 (CLOSE_WAIT)
haproxy 19175 haproxy 68u IPv4 38943524 0t0 TCP haproxy.example.com:https->96-91-8-137-static.hfc.comcastbusiness.net:55324 (CLOSE_WAIT)
haproxy 19175 haproxy 82u IPv4 38809553 0t0 TCP haproxy.example.com:https->118.100.222.200:56331 (CLOSE_WAIT)
haproxy 19175 haproxy 87u IPv4 38983809 0t0 TCP haproxy.example.com:https->dsl.49.145.163.117.pldt.net:54605 (CLOSE_WAIT)
haproxy 19175 haproxy 94u IPv4 38937366 0t0 TCP haproxy.example.com:https->ip-24-143-142-108.user.start.ca:57148 (CLOSE_WAIT)
haproxy 19178 haproxy 9u IPv4 38887874 0t0 TCP haproxy.example.com:https->90.254.157.187:57405 (CLOSE_WAIT)
haproxy 19178 haproxy 15u IPv4 38954122 0t0 TCP haproxy.example.com:https->c-24-19-90-120.hsd1.wa.comcast.net:52634 (CLOSE_WAIT)
haproxy 19179 haproxy 9u IPv4 38818712 0t0 TCP haproxy.example.com:https->221.124.210.21:51898 (CLOSE_WAIT)
haproxy 19179 haproxy 11u IPv4 36630708 0t0 TCP haproxy.example.com:https->103.251.142.26:53471 (CLOSE_WAIT)
haproxy 19179 haproxy 16u IPv4 37579731 0t0 TCP haproxy.example.com:https->node-103-77-45-128.alliancebroadband.in:38741 (CLOSE_WAIT)
haproxy 26282 haproxy 24u IPv4 43203374 0t0 TCP haproxy.example.com:https->116.74.170.170:62585 (CLOSE_WAIT)
haproxy 26282 haproxy 68u IPv4 43422848 0t0 TCP haproxy.example.com:https->c80-217-114-39.bredband.comhem.se:56134 (CLOSE_WAIT)
haproxy 26283 haproxy 9u IPv4 43481226 0t0 TCP haproxy.example.com:https->103.230.182.238:49806 (CLOSE_WAIT)
haproxy 30092 haproxy 22u IPv4 33447936 0t0 TCP haproxy.example.com:https->bras-base-barion1871w-grc-14-70-29-132-174.dsl.bell.ca:53690 (CLOSE_WAIT)
haproxy 30092 haproxy 23u IPv4 33451010 0t0 TCP haproxy.example.com:https->213.205.198.157:60816 (CLOSE_WAIT)
haproxy 30092 haproxy 26u IPv4 33451240 0t0 TCP haproxy.example.com:https->182-160-101-250.aamranetworks.com:44444 (CLOSE_WAIT)
haproxy 30092 haproxy 27u IPv4 33451269 0t0 TCP haproxy.example.com:https->p5dc59e98.dip0.t-ipconnect.de:54888 (CLOSE_WAIT)
haproxy 30092 haproxy 32u IPv4 33451022 0t0 TCP haproxy.example.com:https->bras-base-barion1871w-grc-14-70-29-132-174.dsl.bell.ca:53692 (CLOSE_WAIT)
haproxy 30092 haproxy 36u IPv4 33451030 0t0 TCP haproxy.example.com:https->105.104.8.0:55708 (CLOSE_WAIT)
haproxy 30092 haproxy 37u IPv4 33451032 0t0 TCP haproxy.example.com:https->105.104.8.0:55709 (CLOSE_WAIT)
haproxy 30092 haproxy 43u IPv4 33451057 0t0 TCP haproxy.example.com:https->95.211.230.211:36818 (CLOSE_WAIT)
haproxy 30094 haproxy 21u IPv4 33456734 0t0 TCP haproxy.example.com:https->2.50.64.10:54194 (CLOSE_WAIT)
haproxy 30094 haproxy 22u IPv4 33450100 0t0 TCP haproxy.example.com:https->213.205.198.201:53649 (CLOSE_WAIT)
haproxy 30094 haproxy 27u IPv4 33450114 0t0 TCP haproxy.example.com:https->42.4.33.25:7751 (CLOSE_WAIT)
haproxy 30094 haproxy 31u IPv4 33450457 0t0 TCP haproxy.example.com:https->39.35.69.15:27809 (CLOSE_WAIT)
haproxy 30094 haproxy 32u IPv4 33450548 0t0 TCP haproxy.example.com:https->182-160-101-250.aamranetworks.com:44442 (CLOSE_WAIT)
haproxy 30098 haproxy 21u IPv4 33452034 0t0 TCP haproxy.example.com:https->42.4.33.25:7750 (CLOSE_WAIT)
haproxy 30506 haproxy 10u IPv4 33458971 0t0 TCP haproxy.example.com:https->bras-base-barion1871w-grc-14-70-29-132-174.dsl.bell.ca:53697 (CLOSE_WAIT)
haproxy 30506 haproxy 11u IPv4 33459167 0t0 TCP haproxy.example.com:https->42.4.33.25:8078 (CLOSE_WAIT)
haproxy 30509 haproxy 11u IPv4 33459741 0t0 TCP haproxy.example.com:https->39.35.69.15:27933 (CLOSE_WAIT)
haproxy 30509 haproxy 13u IPv4 33460117 0t0 TCP haproxy.example.com:https->151.66.159.224:63315 (CLOSE_WAIT)
haproxy 31639 haproxy 15u IPv4 33465274 0t0 TCP haproxy.example.com:https->39.35.69.15:27965 (CLOSE_WAIT)
haproxy 31639 haproxy 24u IPv4 33465281 0t0 TCP haproxy.example.com:https->95.211.230.211:37784 (CLOSE_WAIT)
haproxy 31639 haproxy 34u IPv4 33468441 0t0 TCP haproxy.example.com:https->151.66.159.224:61401 (CLOSE_WAIT)
haproxy 31640 haproxy 10u IPv4 33466141 0t0 TCP haproxy.example.com:https->bras-base-barion1871w-grc-14-70-29-132-174.dsl.bell.ca:53699 (CLOSE_WAIT)

Here is an excerpt of the lsof output from before the restart (it’s been sorted by status). The oldest processes were from October 15, the next batch from October 17, and so on.