Tons of "ssl_termination/1: SSL handshake failure"

I am using HAProxy 1.8.20 with a 2048-bit certificate from Let’s Encrypt. SSL Labs has confirmed that the certificate is OK (full certificate chain). However, I still get tons of “SSL handshake failure” entries in my log. I have tried the suggestions from other threads in this forum, but they did not solve my issue.

Any ideas what I can do to resolve the issue? Below are a short excerpt from the system log, the output of haproxy -vv and my haproxy.conf. Any help would really be appreciated!

Fri Sep 20 06:12:43 2019 local0.info haproxy[13893]: 2401:7400:c802:925e:190:75bb:1f07:55f2:58170 [20/Sep/2019:06:12:43.877] ssl_termination/1: Connection closed during SSL handshake
Fri Sep 20 06:12:43 2019 local0.info haproxy[13893]: ::ffff:192.168.1.233:50590 [20/Sep/2019:06:12:43.889] ssl_termination/1: SSL handshake failure
Fri Sep 20 06:12:43 2019 local0.info haproxy[13893]: ::ffff:192.168.1.233:50592 [20/Sep/2019:06:12:43.955] ssl_termination/1: SSL handshake failure
Fri Sep 20 06:12:44 2019 local0.info haproxy[13893]: ::ffff:192.168.1.233:50594 [20/Sep/2019:06:12:44.014] ssl_termination/1: SSL handshake failure
Fri Sep 20 06:12:44 2019 local0.info haproxy[13893]: ::ffff:192.168.1.233:50596 [20/Sep/2019:06:12:44.044] ssl_termination/1: SSL handshake failure
Fri Sep 20 06:12:44 2019 local0.info haproxy[13893]: ::ffff:192.168.1.233:50598 [20/Sep/2019:06:12:44.150] ssl_termination/1: SSL handshake failure
Fri Sep 20 06:12:44 2019 local0.info haproxy[13893]: ::ffff:192.168.1.233:50600 [20/Sep/2019:06:12:44.175] ssl_termination/1: SSL handshake failure



haproxy -vv
HA-Proxy version 1.8.20-1 2019/06/27
Copyright 2000-2019 Willy Tarreau <willy@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = x86_64-openwrt-linux-musl-gcc
  CFLAGS  = -Os -pipe -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -iremap/mnt/data/share/data/software/openwrt/build_dir/target-x86_64_musl/haproxy-ssl/haproxy-1.8.20:haproxy-1.8.20 -Wformat -Werror=format-security -fpic -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -DBUFSIZE=16384 -DMAXREWRITE=1030 -DSYSTEM_MAXCONN=165530
  OPTIONS = USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_GETADDRINFO=1 USE_ZLIB=yes USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_TFO=1

Default settings :
  maxconn = 165530, bufsize = 16384, maxrewrite = 1030, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2s  28 May 2019
Running on OpenSSL version : OpenSSL 1.0.2s  28 May 2019
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.5
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.41 2017-07-05
Running on PCRE version : 8.41 2017-07-05
PCRE library supports JIT : no (libpcre build without JIT?)
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
        [SPOE] spoe
        [COMP] compression
        [TRACE] trace

haproxy.conf
global
	# Log events to a remote syslog server at given address using the
	# specified facility and verbosity level. Multiple log options 
	# are allowed.
	#log 10.0.0.1 daemon info
	log /dev/log local0 debug

	# Specify the maximum number of allowed connections.
	maxconn 20480

	# Raise the ulimit for the maximum allowed number of open socket
	# descriptors per process. This is usually at least twice the
	# number of allowed connections (maxconn * 2 + nb_servers + 1) .
	ulimit-n 65535

	# Drop privileges (setuid, setgid), default is "root" on OpenWrt.
	uid 0
	gid 0

	# Perform chroot into the specified directory.
	#chroot /var/run/haproxy/

	# Daemonize on startup
	daemon

	nosplice
	# Enable debugging
	#debug

	# Spawn given number of processes and distribute load among them,
	# used for multi-core environments or to circumvent per-process
	# limits like number of open file descriptors. Default is 1.
	nbproc 2
	nbproc 1
	nbthread 4
	cpu-map auto:1/1-4 0-3

	ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
	ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
	tune.ssl.default-dh-param 2048

    # Default parameters
    defaults
    	# Default timeouts
    	timeout connect 3m
    	timeout client 60m # timeout in ms
    	timeout server 60m # timeout in s?
    	timeout http-keep-alive 60m # timeout in s

    	log global
    	mode http
    	option httplog
    	maxconn 3000

    frontend ssl_termination
    	bind :::443 v4v6 ssl crt /mnt/container/datadir/certificate/haproxy/ alpn h2,http/1.1
    	mode http
    	option http-server-close

    	option forwardfor
    	http-request set-header X-Forwarded-Proto https if { ssl_fc }
    	
    	# Serve an internal statistics page on /stats:
    	#stats enable
    	#stats uri /stats

    	# Enable HTTP basic auth for the statistics:
    	#stats realm HA_Stats
    	#stats auth username:password

    	default_backend bk_kopano

    	acl host_app1 hdr(host) -i one.xxx.xxx 
    	acl host_app2 hdr(host) -i two.xxx.xxx

    	use_backend bk_app1 if host_app1
    	use_backend bk_app2 if host_app2

    	default_backend bk_app2

    backend bk_app1
    	#redirect scheme https if !{ ssl_fc }
    	http-response del-header X-Varnish
    	http-response del-header X-Varnish-Cache
    	http-response del-header X-Varnish-Server
    	http-response del-header X-Cache

    	server app1 192.168.1.200:6081 check

    backend bk_app2
    	#redirect scheme https if !{ ssl_fc }
    	server app2 192.168.1.220:80 check

It’s a non-problem.

Especially if you run SSL Labs tests, you will see tons of handshake failures, because that’s the entire point of the SSL Labs test: simulating old, bogus and obsolete clients, testing invalid combinations and triggering error conditions.

If you have an actual problem, then explain what that problem is; handshake failures in the logs are not a problem in themselves.

Hi @lukastribus,

My problem is that I use ActiveSync (z-push) behind the reverse proxy. I wanted to investigate whether the handshake failures affect the heartbeat for ActiveSync:

  1. When I connect directly to the Apache 2.4 web server that serves the ActiveSync protocol (bypassing HAProxy), I see variable heartbeats (starting at 4 minutes and then increasing to longer intervals).

  2. When I connect through HAProxy in front of the web server (which should be the production setup), the heartbeat is stuck at 4 min, leading to higher battery and data consumption. This is actually how I found the handshake errors: by investigating the heartbeat interval.

Both tests were done using the same mobile, the same connection and the same hardware, with the identical server in the backend. My only explanation so far is that the SSL handshake failures cause ActiveSync to stick to the shortest heartbeat. Therefore I would like to fix the errors so that I can get longer heartbeats.

Thank you,
Alex

This almost certainly has nothing to do with the random handshake failures you see in the log.

I suggest fixing some configuration issues first, then actually troubleshooting the heartbeat timeout:

  • drop ulimit-n configuration, this is not needed and will only cause misconfiguration
  • drop the double nbproc statement. nbproc 1 nbthread 4 is enough
  • remove option http-server-close, this is probably causing issues especially with long running sessions like activesync
  • review both haproxy and backend logs
  • clarify with the backend application vendor what the reason could be for the heartbeat to stick to 4 minutes
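
The first three points could look roughly like this in haproxy.conf. This is only a sketch based on the configuration posted above: it shows just the affected directives, and everything not shown is assumed to stay as it was.

```
global
	log /dev/log local0 debug
	maxconn 20480

	# "ulimit-n 65535" removed: HAProxy computes the file descriptor
	# limit automatically from maxconn.

	# Only one nbproc statement is kept (the earlier "nbproc 2" is dropped):
	# a single process with four threads.
	nbproc 1
	nbthread 4
	cpu-map auto:1/1-4 0-3

frontend ssl_termination
	bind :::443 v4v6 ssl crt /mnt/container/datadir/certificate/haproxy/ alpn h2,http/1.1
	mode http

	# "option http-server-close" removed, so HAProxy falls back to its
	# default connection handling for this frontend.

	option forwardfor
	http-request set-header X-Forwarded-Proto https if { ssl_fc }
```

After editing, the file can be checked with haproxy -c -f followed by the path to the configuration file before restarting the service.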

If, on the other hand, you want to troubleshoot the random SSL handshake failures, then capture the entire SSL session/handshake.

@lukastribus

Thank you so much. Just to clarify: I first talked to the z-push team, and they suggested that the handshake failures might be an issue because the log reported “Connection closed during SSL handshake”:

Fri Sep 20 14:23:09 2019 local0.info haproxy[6563]: ::ffff:192.168.1.233:57054 [20/Sep/2019:14:19:09.094] ssl_termination~ bk_kopano/kopano 0/0/1/240257/240258 200 253 - - ---- 15/15/0/1/0 0/0 "POST /Microsoft-Server-ActiveSync?Cmd=Ping&User=xxx&DeviceId=Ninexxx&DeviceType=Android HTTP/1.1"
Fri Sep 20 14:23:09 2019 local0.info haproxy[6563]: 2401:7400:c802:925e:1c1e:edf0:7343:3f04:58396 [20/Sep/2019:14:23:09.903] ssl_termination/1: Connection closed during SSL handshake
Fri Sep 20 14:23:09 2019 local0.info haproxy[6563]: ::ffff:192.168.1.233:57082 [20/Sep/2019:14:23:09.911] ssl_termination/1: SSL handshake failure
Fri Sep 20 14:23:09 2019 local0.info haproxy[6564]: ::ffff:192.168.1.233:57084 [20/Sep/2019:14:23:09.956] ssl_termination/1: SSL handshake failure
Fri Sep 20 14:23:10 2019 local0.info haproxy[6563]: ::ffff:192.168.1.233:57086 [20/Sep/2019:14:23:09.993] ssl_termination/1: SSL handshake failure
Fri Sep 20 14:23:10 2019 local0.info haproxy[6563]: ::ffff:192.168.1.233:57088 [20/Sep/2019:14:23:10.028] ssl_termination/1: SSL handshake failure
Fri Sep 20 14:23:10 2019 local0.info haproxy[6563]: ::ffff:192.168.1.233:57090 [20/Sep/2019:14:23:10.069] ssl_termination/1: SSL handshake failure
Fri Sep 20 14:23:10 2019 local0.info haproxy[6563]: ::ffff:192.168.1.233:57092 [20/Sep/2019:14:23:10.113] ssl_termination/1: SSL handshake failure

I will try the suggested settings and report back. Thank you for your tips, really appreciated!
Alex

Hi @lukastribus,

I have made the suggested changes to haproxy.conf. However, the heartbeat is still stuck at 4 min. Any other suggestions?

Thank you,
Alex

My suggestion is to provide the full backend logs, the full haproxy logs and a capture of a session that closes after 4 minutes, and then analyze them.