HAProxy sometimes logs SSL handshake failures

I see SSL handshake failures in my HAProxy logs. They sometimes occur when the web application is saving data or receiving a larger POST, and the web application then throws an error as well.

Environment notes: we are running RHEL 7 and using HAProxy in front of two active web backend servers and one backup web backend server. The web application runs on CherryPy.

Here is an example error from the CherryPy logs:

server02 [4.xxx] [INF] (web/services/web/site-packages/cherrypy/_cplogging.py) - - "GET /~api/entity/tMatch%22%3Atrue%7D%5D HTTP/1.1" 200 41 "https://www.example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"
  File "/web/services/web/site-packages/OpenSSL/SSL.py", line 1545, in _raise_ssl_error
    raise ZeroReturnError()
  File "/web/services/web/site-packages/OpenSSL/SSL.py", line 1734, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "/web/services/web/site-packages/OpenSSL/SSL.py", line 1545, in _raise_ssl_error
    raise ZeroReturnError()

And here is the corresponding entry in the HAProxy log:

Apr 26 13:27:38 localhost haproxy[2487]: [26/Apr/2021:13:27:38.136] www.example.com:80/2: SSL handshake failure

Here is the HAProxy build info:
HA-Proxy version 2.0.14 2020/04/02 - https://haproxy.org/
Build options :
TARGET = linux-glibc
CPU = generic
CC = gcc
CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits


Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=1).
Built with OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.5
Built with network namespace support.
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Encrypted password support via crypt(3): yes

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTX        side=FE|BE     mux=H2
              h2 : mode=HTTP       side=FE        mux=H2
       <default> : mode=HTX        side=FE|BE     mux=H1
       <default> : mode=TCP|HTTP   side=FE|BE     mux=PASS

Available services : none

Available filters :
	[SPOE] spoe
	[COMP] compression
	[CACHE] cache
	[TRACE] trace

Here is my HAProxy configuration file:

# Global settings
 # to have these messages end up in /var/log/haproxy.log you will
 # need to:
 # 1) configure syslog to accept network log events.  This is done
#    by adding the '-r' option to the SYSLOGD_OPTIONS in
#    /etc/sysconfig/syslog
# 2) configure local2 events to go to the /var/log/haproxy.log
#   file. A line like the following can be added to
#   /etc/sysconfig/syslog
#    local2.*                       /var/log/haproxy.log
log local2 info
chroot      /var/lib/haproxy
pidfile     /var/run/haproxy.pid
maxconn     4000
user        haproxy
group       haproxy
tune.ssl.default-dh-param 2048
ssl-server-verify required
ssl-default-bind-options  no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
# turn on stats unix socket
stats socket /var/run/haproxy/info.sock mode 660 level user user report group zabbix
stats timeout 2m # Wait up to 2 minutes for input

# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          5m
    timeout server          5m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 300
# main frontend which proxys to the backends
frontend  example.com:80
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/web.pem
    log  global
    option httplog
    http-request set-header Forwarded for=%[src]
    mode http
    option forwardfor
    default_backend             web_servers
# round robin balancing between the various backends
backend web_servers
    balance roundrobin
    mode http
    option httpchk GET /#
    cookie SERVERUSED insert indirect nocache
    redirect scheme https code 301 if !{ ssl_fc }
    redirect prefix https://www.example.com code 301 if { hdr(host) -i old.example.com }
    #List all the web servers below
    timeout queue 10s
    server  backendweb1 check ssl verify required ca-file /etc/ssl/certs/web.pem cookie web1
    server  backendweb2 check ssl verify required ca-file /etc/ssl/certs/web.pem cookie web2
    server  backendweb3 check ssl verify required ca-file /etc/ssl/certs/web.pem cookie web3 backup

listen stats # Define a listen section called "stats"
 bind :::8404 v4v6 ssl crt /etc/ssl/certs/web.pem # Listen on localhost 8404
 mode http
 stats enable  # Enable stats page
 stats realm Haproxy\ Statistics  # Title text for popup window
 stats uri /stats  # Stats URI
 stats refresh 10s
 stats auth xxxxx:xxxxxxxxxxx  #Authentication credential
 stats admin if LOCALHOST

Just a quick note: this only happens when passing through HAProxy, and I can reproduce it with a larger POST. If I point at the application directly and do a large POST, it goes through fine.

More information: it appears to be related to the SSL check.

If I remove the SSL re-encryption (SSL termination at HAProxy only), it works. It also works in TCP mode, but it has issues with the configuration above. We would like to re-encrypt this traffic between HAProxy and the web backends.
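
A minimal sketch for reproducing the large POST in a repeatable way, once through HAProxy and once directly against a backend. The function names (`make_payload`, `post`), the size steps in `SIZES` and the `CAFILE` setting are assumptions made for illustration and are not part of the setup above; the URLs to test are passed on the command line, so no endpoint is guessed here.

```python
import ssl
import sys
import time
import urllib.error
import urllib.request

# Assumed body sizes to try, from small to large (bytes).
SIZES = (1024, 64 * 1024, 1024 * 1024, 8 * 1024 * 1024)

# Optional CA bundle for a privately signed certificate; None uses the
# system trust store. This value is an assumption, adjust as needed.
CAFILE = None


def make_payload(size_bytes):
    """Return a body of exactly size_bytes bytes."""
    return b"x" * size_bytes


def post(url, body, cafile=None, timeout=60):
    """POST body to url and return (outcome, elapsed_seconds).

    outcome is the HTTP status as a string, or the exception class
    name if the request failed before a response arrived.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=timeout) as resp:
            resp.read()
            outcome = str(resp.status)
    except urllib.error.HTTPError as exc:
        outcome = str(exc.code)  # the server answered with an error status
    except Exception as exc:  # TLS failure, connection reset, timeout, ...
        outcome = type(exc).__name__
    return outcome, time.monotonic() - start


# Only runs when at least one URL is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for size in SIZES:
        body = make_payload(size)
        for url in sys.argv[1:]:
            outcome, elapsed = post(url, body, cafile=CAFILE)
            print(f"{size:>9} bytes  {url}  ->  {outcome}  ({elapsed:.2f}s)")
```

Run it with the HAProxy URL and the direct backend URL as arguments. If only the HAProxy URL starts failing above a certain size, that size and the elapsed time are useful numbers to add to the thread.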

Can you post the config of your web server? Which server are you using? Does it work if you disable SSL in the backend? How long does your large request take to finish when you reproduce it? ZeroReturnError might also indicate a timeout …

  • We are using the CherryPy web server (CherryPy 18.6.0).

  • It does work if I disable SSL in the backend, and it finishes quickly. I’ve been trying to troubleshoot SSL between the server and HAProxy, but so far I do not know how to go about it.

  • It may be a timeout. However, I don’t know why it only happens when SSL is re-encrypted to the backends.

I guess I have to apologize: I was a bit sloppy in reviewing the information you provided. After some thought, I’m afraid the problem is not necessarily caused by HAProxy. From my perspective, HAProxy might just be the bearer of bad news here, especially since the request works through HAProxy without SSL on the backend.

You say you can reproduce the error; that will be the key to solving the riddle.
Does the error also occur natively against the backend, that is, when you send the request directly without HAProxy? (It is important that this request also uses SSL.)

Otherwise, in my opinion, the next step would be to run tcpdump on your backend and compare both handshakes: one made by HAProxy and one made natively against the app.
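
As a complement to the packet capture, a small probe run from the HAProxy host can show whether a plain TLS client on that machine can complete a handshake with the backend, and what gets negotiated. This is only a sketch: `build_context`, `probe` and `describe` are hypothetical helper names, host and port come from the command line, and Python may be linked against a different OpenSSL than the 1.0.2k build HAProxy uses, so the result is a hint rather than a replay of HAProxy's own handshake.

```python
import socket
import ssl
import sys


def build_context(cafile=None, verify=True):
    """Create a client-side TLS context.

    cafile: optional CA bundle to trust (for a privately signed backend).
    verify: set to False to skip certificate checks while debugging.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    if not verify:
        # check_hostname must be disabled before verify_mode is relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe(host, port, ctx, server_name=None, timeout=10):
    """Do one TLS handshake and return the negotiated parameters."""
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=server_name or host) as tls:
            cipher_name, _protocol, bits = tls.cipher()
            return {"protocol": tls.version(), "cipher": cipher_name, "bits": bits}


def describe(info):
    """Format the result of probe() as a single line."""
    return "{protocol} {cipher} ({bits} bits)".format(**info)


# Usage: python probe.py HOST PORT [CAFILE]
if __name__ == "__main__" and len(sys.argv) >= 3:
    host, port = sys.argv[1], int(sys.argv[2])
    cafile = sys.argv[3] if len(sys.argv) > 3 else None
    try:
        print(describe(probe(host, port, build_context(cafile))))
    except (ssl.SSLError, OSError) as exc:
        print("handshake failed:", type(exc).__name__, exc)
```

If this handshake succeeds with the same CA file that the `server` lines reference while HAProxy still logs failures, the difference between the two captures is the place to look.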

Thanks, IhrName.

No, when I target the application directly and bypass HAProxy, I do not get the error. It is only when we pass through the proxy…

Just to clarify my comments above: we are decrypting at the frontend and then re-encrypting to the backends. We are not using SSL termination at HAProxy only; the data is also re-encrypted between the proxy and the backends.

Then I think your next step should be to run tcpdump on your backend and compare both handshakes: one made by HAProxy and one made natively against the app. I know this sucks, sorry.

No problem. I’ll get that test done shortly.