Where am I going wrong with TLS passthrough and SNI filtering?

I’m trying to run a configuration where haproxy runs on a VPS and routes connections to different backend servers by hostname, passing the TLS through so that it can be terminated at the destination server. Here’s a simplified view of the “signal flow”.

dns → VPS → haproxy sni filtering → rathole → localserver → caddy (for ssl certificates) → paperless-ngx

(The application I’m working with at the moment is paperless-ngx.)

I’ve found that a regular browser allows me to reach and log into paperless-ngx, but mobile apps on Android can’t log into the server. If I remove haproxy from the chain, those apps work. The failures give cryptic responses and differ considerably between the two apps. I’ll have to spend more time gathering logs, but I was hoping to get some immediate direction as to whether my setup is just wrong from the start or whether it should work.

Here is an example of the haproxy config file.

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
 daemon
 #user                haproxy
 #group               haproxy
 log                 /dev/log local6 debug
 maxconn             5000
 #chroot              /var/lib/haproxy
 #pidfile             /var/run/haproxy.pid

#---------------------------------------------------------------------
# common defaults 
#---------------------------------------------------------------------
defaults
 mode                 tcp
 log                  global
 option               dontlognull
 timeout connect      5s
 timeout client       10s
 timeout server       10s

#---------------------------------------------------------------------
# dedicated stats page
#---------------------------------------------------------------------
#listen stats
 #mode http
 #bind :22222
 #stats enable
 #stats uri            /haproxy?stats
 #stats realm          Haproxy\ Statistics
 #stats refresh        30s

listen stats
 bind :9000
 mode http
 stats enable
 stats hide-version
 stats realm Haproxy\ Statistics
 stats uri /haproxy_stats


#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------

frontend main_https_listen
 bind *:443
 mode tcp
 acl paper_filter req_ssl_sni -i paper.server.com
 use_backend paper_backend if paper_filter


backend paper_backend
 server paper rathole:443

Yes, as per:

http://docs.haproxy.org/2.6/configuration.html#7.3.5-req_ssl_sni

If content switching is needed, it is recommended to first wait for a complete client hello (type 1), like in the example below.

# Wait for a client hello for at most 5 seconds
tcp-request inspect-delay 5s
tcp-request content accept if { req.ssl_hello_type 1 }

Put that into your frontend main_https_listen
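
Merged into the frontend from the original post, that would look roughly like the sketch below. It is untested and only rearranges lines already shown in this thread; the hostname, backend and server names are the ones from the posted config.

```haproxy
frontend main_https_listen
 bind *:443
 mode tcp

 # Hold the connection for at most 5s while waiting for a TLS
 # client hello (handshake type 1) to show up in the buffer.
 tcp-request inspect-delay 5s
 tcp-request content accept if { req.ssl_hello_type 1 }

 # SNI-based routing, unchanged from the original post
 acl paper_filter req_ssl_sni -i paper.server.com
 use_backend paper_backend if paper_filter

backend paper_backend
 server paper rathole:443
```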

Thank you. I think this may have solved the problem.

Can you explain a bit about what this delay does? I’m not quite grasping what is going wrong without it.

By the way, I found that running haproxy on a different VPS (perhaps one with more latency?) changed the behavior a little: I was able to log in from the mobile apps, but it would take several tries. And once I was logged in, I would occasionally get an error or some other glitch.
Now that I’ve added the tcp-request inspect-delay and tcp-request content accept lines, everything seems to be smooth.

With this configuration, haproxy waits up to 5 seconds for the SSL client hello. Only once the client hello has been received (or the 5 second timeout has expired) is the SNI value compared.

Without it, haproxy will try to match the SNI value as soon as the connection is accepted, without waiting for the data to be there. Sometimes it works; most of the time it will not.

Isn’t the client hello the very first step in the communication? Is the delay so that the haproxy server receives the entire message before doing anything? If it can’t reliably identify the SNI value without receiving the full message, how does it know where to start passing it to? Is it guessing?

With some more reading I realize that there are fresh TLS handshakes as various resources are accessed by the client. And the android clients are probably more inclined to start new sessions than a web client?

Just because we accept() a connection on port 443 does not mean the buffer already contains the entire first ssl message (which is the client hello).

You configure a rule. The rule either matches, or it does not.

Whether it does not match because there is no haystack at all, or because the needles in the haystack are all different from the needle you are looking for, does not change the outcome.

On the other VPS, the network, latency, OS and TCP stack configuration are probably very different.

But waiting for the SSL client hello is always required when using SNI matching in an SSL passthrough configuration.
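
Since the wait is configured once in the frontend, the same two lines cover any number of SNI rules. Here is a sketch with a second service added; other.server.com, other_backend and otherhost are made-up names for illustration only, and the config is untested.

```haproxy
frontend main_https_listen
 bind *:443
 mode tcp

 # Wait for the client hello once, before any SNI rule is evaluated
 tcp-request inspect-delay 5s
 tcp-request content accept if { req.ssl_hello_type 1 }

 # One use_backend rule per hostname (anonymous ACLs)
 use_backend paper_backend if { req_ssl_sni -i paper.server.com }
 # Hypothetical second service, not part of the original setup
 use_backend other_backend if { req_ssl_sni -i other.server.com }

backend paper_backend
 server paper rathole:443

backend other_backend
 # Hypothetical server name and address
 server other otherhost:443
```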

Well color me surprised. I figured there was some kind of ‘ack’ bit to watch for before applying a rule, especially for a handshake. But I suppose that when you’re trying to push data so quickly, waiting for an ack bit at the end of every payload would add significant latency to everything. This “haystack” situation sounds a bit like the “switch bounce” problem. The solution (for many years) has been what is effectively a delay to let the needles settle.

Remember that you are using TCP mode here. Haproxy doesn’t even know that you are running SSL through it.

This is a very low-level configuration, which requires low-level tweaks.

A delay to let things settle is exactly what this configuration provides.