HAProxy 2.1.4: too many SSL handshake failures

Hi, we are using HAProxy 2.1.4 as an SSL terminator between our own client and server machines (high-load machines, always busy). Requests are a mix of HTTP/1.1 and HTTP/2. We are seeing a lot of SSL handshake failures on the frontend. I have enabled proxy logs using rsyslog and get the following errors:

Aug  5 18:55:35 localhost haproxy[40308]: 127.0.0.1:55442 [05/Aug/2020:18:55:35.364] frontend/1: SSL handshake failure
Aug  5 18:56:20 localhost haproxy[40308]: 204.xx.xx.xx:45474 [05/Aug/2020:18:56:16.761] frontend/1: Connection closed during SSL handshake
Aug  5 18:56:22 localhost haproxy[40308]: 204.xx.xx.xx:52088 [05/Aug/2020:18:56:19.403] frontend/1: Connection closed during SSL handshake
Aug  5 18:56:33 localhost haproxy[40308]: 127.0.0.1:42470 [05/Aug/2020:18:56:33.933] frontend/1: SSL handshake failure
Aug  5 18:56:33 localhost haproxy[40308]: 127.0.0.1:42472 [05/Aug/2020:18:56:33.944] frontend/1: SSL handshake failure
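
To see how these failures split by source address and message, the lines above can be tallied with a short script. This is only a sketch: it assumes the rsyslog line shape shown in the excerpt (IPv4 or masked sources only), and the regular expression, `SAMPLE_LINES`, and `tally_failures` are names made up for illustration, not anything HAProxy ships.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   <syslog prefix> haproxy[pid]: <ip>:<port> [<timestamp>] <frontend>/<n>: <message>
LINE_RE = re.compile(
    r"haproxy\[\d+\]: (?P<ip>[^\s:]+):(?P<port>\d+) \[[^\]]+\] \S+: (?P<msg>.+)$"
)

# The five log lines quoted in the post, used here as sample input.
SAMPLE_LINES = [
    "Aug  5 18:55:35 localhost haproxy[40308]: 127.0.0.1:55442 [05/Aug/2020:18:55:35.364] frontend/1: SSL handshake failure",
    "Aug  5 18:56:20 localhost haproxy[40308]: 204.xx.xx.xx:45474 [05/Aug/2020:18:56:16.761] frontend/1: Connection closed during SSL handshake",
    "Aug  5 18:56:22 localhost haproxy[40308]: 204.xx.xx.xx:52088 [05/Aug/2020:18:56:19.403] frontend/1: Connection closed during SSL handshake",
    "Aug  5 18:56:33 localhost haproxy[40308]: 127.0.0.1:42470 [05/Aug/2020:18:56:33.933] frontend/1: SSL handshake failure",
    "Aug  5 18:56:33 localhost haproxy[40308]: 127.0.0.1:42472 [05/Aug/2020:18:56:33.944] frontend/1: SSL handshake failure",
]


def tally_failures(lines):
    """Count log lines per (source IP, message); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("ip"), match.group("msg").strip())] += 1
    return counts


if __name__ == "__main__":
    for (ip, msg), n in sorted(tally_failures(SAMPLE_LINES).items()):
        print(f"{n:3d}  {ip:15s}  {msg}")
```

Fed the real log file instead of the sample, the same function would show whether the failures cluster on 127.0.0.1 or on the remote clients.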

A few of the failed connections have 127.0.0.1 as the source IP, even though the connection between the proxy and the backend is plain text, since the proxy is the SSL terminator here. I could not get more detailed logs out of HAProxy. My configuration is as follows:

global
   log         127.0.0.1 local2
   chroot /var/lib/haproxy
   maxconn 200000
   user test
   group testsending
   daemon

tune.ssl.cachesize 200000
#tune.h2.max-concurrent-streams 10
ssl-dh-param-file /etc/haproxy/dhparam.pem

#Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private

#Obtained from https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy

ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11

ssl-default-server-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
ssl-default-server-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-server-options no-sslv3 no-tlsv10 no-tlsv11

defaults
    log     global
    maxconn 20000
    mode    http
    option httplog
    option dontlog-normal
    option logasap
    retries 3
    retry-on all-retryable-errors
    option log-separate-errors
    timeout connect     5s
    timeout client     60s
    timeout server    450s

frontend    frontend_haproxy
     option forwardfor
     capture request header MONITORID len 64
     capture response header MONITORID len 64
     log-format "%ci:%cp\ [%t]\ %f\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ST\ %B\ %CC\ %CS\ %tsc\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ Reqid:%hr\ Resid:%hs\ %{+Q}r\ %sslv\ %sslc"
     bind    *:8088  ssl crt /etc/haproxy/haproxy.pem alpn h2,http/1.1
     default_backend backend_eumagent

 backend     backend_eumagent
     timeout server  420000
     fullconn 2000
     server tomcat localhost:9099 check

and the output of haproxy -vv is as follows:

 HA-Proxy version 2.1.4 2020/04/02 - https://haproxy.org/
 Status: stable branch - will stop receiving fixes around Q1 2021.
 Known bugs: http://www.haproxy.org/bugs/bugs-2.1.4.html
 Build options :
   TARGET  = linux-glibc
   CPU     = generic
   CC      = gcc
   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits
   OPTIONS = USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1 USE_SYSTEMD=1
 
 Feature list : +EPOLL -KQUEUE -MY_EPOLL -MY_SPLICE +NETFILTER +PCRE -PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED -REGPARM -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H -VSYSCALL +GETADDRINFO +OPENSSL -LUA +FUTEX +ACCEPT4 -MY_ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS
 
 Default settings :
   bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
 
 Built with multi-threading support (MAX_THREADS=64, default=6).
 Built with OpenSSL version : OpenSSL 1.1.1c  28 May 2019
 Running on OpenSSL version : OpenSSL 1.1.1c  28 May 2019
 OpenSSL library supports TLS extensions : yes
 OpenSSL library supports SNI : yes
 OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
 Built with network namespace support.
 Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
 Built with PCRE version : 8.32 2012-11-30
 Running on PCRE version : 8.32 2012-11-30
 PCRE library supports JIT : no (USE_PCRE_JIT not set)
 Encrypted password support via crypt(3): yes
 Built with zlib version : 1.2.7
 Running on zlib version : 1.2.7
 Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
 
 Available polling systems :
       epoll : pref=300,  test result OK
        poll : pref=200,  test result OK
      select : pref=150,  test result OK
 Total: 3 (3 usable), will use epoll.
 
 Available multiplexer protocols :
 (protocols marked as <default>      cannot be specified using 'proto' keyword)
               h2 : mode=HTTP       side=FE|BE     mux=H2
             fcgi : mode=HTTP       side=BE        mux=FCGI
        <default>      : mode=HTTP       side=FE|BE     mux=H1
        <default>      : mode=TCP        side=FE|BE     mux=PASS
 
 Available services : none
 
 Available filters :
 	[SPOE] spoe
 	[CACHE] cache
 	[FCGI] fcgi-app
 	[TRACE] trace
 	[COMP] compression

We send requests to HAProxy from Apache (HTTP/1.1) and Jetty (HTTP/2) HTTP clients running on Java 8, and our backend is Apache Tomcat 9.0.30+ running on Java 11. All our machines run CentOS 7.x.

Kindly help me debug this issue. Thanks in advance.

Do you have real problems or are you just concerned about those log messages?

If you have real problems, please share the information you have about those failure scenarios. If you are concerned about the log messages only and don’t know of any real impact, this is quite normal for an Internet-facing service. People hitting ESC in their browser or mobile connections dropping out can all cause such failures, as can SSL testing and scanning, as well as your own valid health checks.

Hey Lukas,

We have a performance issue when making HTTP/2 requests to these machines. There are no external users for these machines; all requests are made by our own schedulers using Java HTTP clients. We could not isolate exactly where the problem is, so we started by debugging HAProxy, which sits between our client and server as the SSL terminator. We see a lot of SSL failures in HAProxy, and on the client side a lot of timeouts, asynchronous closed exceptions, channel closed errors, and other HTTP/2-related exceptions. HTTP/2 is only spoken to HAProxy, which terminates SSL and sends plain text to the backend server. I could not get debug logs in daemon mode; all I could get are the error logs I shared. Please guide me on how to get detailed, verbose logs from HAProxy. My assumption is that, because of the many errors, our client could not keep its HTTP/2 channel open, which in turn affected the HTTP client. Correct me if my assumption is wrong, or let me know what details I can share with you and the right way to get detailed logs.

Thanks

I suggest you try without H2 (removing h2 from alpn) and then check the impact of that change on a) your actual performance problem and b) the SSL failures.

I’d also suggest upgrading to a recent bugfix release.

You mean remove the h2 config from the cfg file and check whether these problems also arise with HTTP/1.1? I think they are also happening on machines that don’t receive any HTTP/2 requests.

Yes, that’s what I mean.

Yes, I will give it a try, but right now h2 requests are already disabled and only HTTP/1.1 requests are made to HAProxy, and I still see those SSL failures.

Is there a way to get verbose logs for those SSL failures?

Good, now continue to monitor and see if you still have those performance problems.

No, there is no way to get more verbose logs for those SSL failures.
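
A failed handshake can still be observed from the client end, which normally reports some reason for the failure. Below is a minimal sketch using Python's standard `ssl` module. The function name `probe_handshake`, the `localhost` target, and the decision to skip certificate verification are assumptions for illustration; port 8088 is the one from the `bind` line in the posted config.

```python
import socket
import ssl


def probe_handshake(host, port, alpn=("h2", "http/1.1"), timeout=5.0):
    """Attempt one TLS handshake and report what was negotiated, or why it failed."""
    ctx = ssl.create_default_context()
    # Assumption: this probes an internal endpoint whose certificate may not
    # match the name used here, so verification is disabled. Not for real traffic.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.set_alpn_protocols(list(alpn))
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return {
                    "ok": True,
                    "version": tls.version(),
                    "cipher": tls.cipher()[0],
                    "alpn": tls.selected_alpn_protocol(),
                }
    except (ssl.SSLError, OSError) as exc:
        # ssl.SSLError covers protocol-level failures; OSError covers refused
        # connections, resets, and timeouts.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


if __name__ == "__main__":
    # Hypothetical target: the frontend from the posted config, probed locally.
    print(probe_handshake("localhost", 8088))
```

Run in a loop from one of the client machines, a probe like this would at least show whether failures line up with a particular protocol version, cipher, or ALPN result.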

I removed the h2 ALPN entry in haproxy.cfg and restarted, and still saw SSL failures for normal HTTP/1.1 requests. With HTTP/1.1 there is no performance issue, because each request is a new TCP connection, so an SSL failure only affects that single request. With h2, however, it does have an impact and we see bulk failures; I don’t know the exact reason. It may be because of multiplexing, although the SSL handshake should happen before any streams are opened.

Could having different versions of OpenSSL on the client and on HAProxy cause a problem?

Then H1 should have MORE problems, not LESS, because in HTTP1 you will have a lot more SSL handshakes.

In H2 you will have more transactions on a single connection, so you are LESS susceptible to SSL issues.
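
The arithmetic behind this can be sketched with a toy model. It assumes every handshake fails independently at the same rate and that every request destined for a failed connection is lost; the request count, streams per connection, and failure rate are made-up numbers, and `handshake_exposure` is a hypothetical helper, not a measurement of this setup.

```python
def handshake_exposure(requests, streams_per_connection, failure_rate):
    """Return (handshakes, expected_failed_handshakes, expected_affected_requests).

    Toy model: each connection needs exactly one handshake and each handshake
    fails independently with probability `failure_rate`.
    """
    # Ceiling division: a partially filled connection still needs a handshake.
    handshakes = -(-requests // streams_per_connection)
    expected_failed = handshakes * failure_rate
    # Simplification: all requests meant for a failed connection are lost.
    expected_affected = min(requests, expected_failed * streams_per_connection)
    return handshakes, expected_failed, expected_affected


if __name__ == "__main__":
    # Hypothetical load: 10,000 requests, 0.5% of handshakes failing.
    h1 = handshake_exposure(10_000, streams_per_connection=1, failure_rate=0.005)
    h2 = handshake_exposure(10_000, streams_per_connection=100, failure_rate=0.005)
    print("HTTP/1.1:", h1)
    print("HTTP/2:  ", h2)
```

In this model the multiplexed case performs a hundredth of the handshakes, and the expected number of affected requests comes out the same in both cases, so a constant handshake failure rate alone would not make H2 worse than H1.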

This really confirms that one thing has nothing to do with the other, and you need to approach your real problem (H2 performance issues) from that perspective, not going down the rabbit hole of SSL issues.

First of all, I’d suggest upgrading HAProxy to the latest release, 2.1.8.

Yes, I agree H1 might have more problems, but it is 4 or 5 SSL issues per minute, which out of thousands of H1 requests is negligible for us. In the case of H2, though, even those 4 or 5 failures could each mean “n” requests streamed over a single connection, all of which might fail. Also note that many of the SSL handshake failures come from 127.0.0.1 (we don’t have an HTTPS backend, so they might come from the proxy-to-client side at some point in the transaction).

I feel it has nothing to do with the HAProxy version, because we upgraded a few weeks back.

Thanks, Lukastribus, for your help and assistance. I will try upgrading HAProxy and report back with the status.