We have installed an HAProxy server in front of our web service and API. It decrypts the requests (TLS), parses the headers, then forwards each request to either our internal servers or our AWS cloud service depending on the request parameters (both also over TLS).
The service handles about 30 million requests a day. Almost all of them work fine, but we have a problem with a small number of requests from (ancient) Windows Mobile handhelds. Users of these devices frequently have trouble connecting to our API, and the device reports this error: “Unable to read data from the transport connection”.
HAProxy doesn’t log any errors when this happens. Because the handhelds are on client premises, we also can’t tell whether haproxy logs the bad requests as successful, since they are surrounded by many genuinely successful requests.
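One thing I could try, to pick the handheld traffic out of the logs, is to capture the User-Agent header and stop suppressing "empty" connections. This is only a sketch: it assumes the devices send a recognisable User-Agent (I have not confirmed that), and the lines would go inside the existing `wms` frontend rather than a second one.

```
frontend wms
    # Existing bind/acl/use_backend lines stay as they are.

    # Add the User-Agent to each HTTP log line so handheld requests can be
    # picked out (assumption: the devices send a recognisable User-Agent).
    capture request header User-Agent len 128

    # Also log connections that closed without sending a request;
    # "option dontlognull" in defaults currently suppresses these.
    no option dontlognull
```

With `option httplog` already enabled, the captured header should appear in braces in each log line, and the termination-state flags on those lines may show which side closed the connection.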
There are no network-level errors reported by the network interface.
Removing haproxy from the equation stops the problem. Putting it back starts the problem again. But I’m at a complete loss as to what might be happening. If anybody has seen anything like this then I’d be very glad to hear from you.
Some customers tell me that this problem happens if their device is idle for more than a couple of minutes between uses. I wonder if there’s some keepalive timeout not respected by haproxy that impacts Windows Mobile (but is handled ok by more modern systems).
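To make the keep-alive theory concrete: our config does not set `timeout http-keep-alive`, so as far as I understand the documentation an idle client connection falls back to `timeout client` (50 seconds in our case) before haproxy closes it. Below is a sketch of two changes I could try in the `wms` frontend. It assumes the idle close is what the handhelds trip over, which is unverified, and the 300s value is a guess.

```
frontend wms
    # Existing bind/acl/use_backend lines stay as they are.

    # Option A: keep idle client connections open longer between requests.
    # 300s is an arbitrary guess to cover "a couple of minutes" of idle time.
    timeout http-keep-alive 300s

    # Option B (instead of A): ask both sides to close the connection after
    # each response, so the device opens a fresh connection per request.
    # option httpclose
```

If either change makes the device errors disappear, that would point at the handhelds reusing a connection that haproxy has already closed.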
We compiled this haproxy ourselves because we needed to support SSLv3 (yes, I know about the security issues, but we have no choice but to support clients who won’t countenance changing their 500 in-factory devices just because we tell them they should).
Output from haproxy -vv
HA-Proxy version 1.8.19 2019/02/11
Copyright 2000-2019 Willy Tarreau <email@example.com>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-format-truncation -Wno-null-dereference -Wno-unused-label
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_SYSTEMD=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_NS=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2r 26 Feb 2019
Running on OpenSSL version : OpenSSL 1.0.2r 26 Feb 2019
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.39 2016-06-14
Running on PCRE version : 8.39 2016-06-14
PCRE library supports JIT : yes
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
      epoll : pref=300, test result OK
       poll : pref=200, test result OK
     select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
    [SPOE] spoe
    [COMP] compression
    [TRACE] trace
Our haproxy.conf (redacted)
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    maxconn 4000
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
    ssl-default-bind-options ssl-min-ver SSLv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option http-buffer-request # Needed to inspect the POST request parameters (CA/WMS)
    timeout connect 5000
    timeout client 50000
    timeout server 600000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

resolvers mydns
    nameserver dns1 184.108.40.206:53
    nameserver dns2 10.200.0.101:53
    nameserver dns3 10.101.1.11:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry 1s
    hold other 30s
    hold refused 30s
    hold nx 30s
    hold timeout 30s
    hold valid 10s
    hold obsolete 30s

listen stats
    bind 0.0.0.0:8000
    mode http
    log global
    maxconn 10
    clitimeout 100s
    srvtimeout 100s
    contimeout 100s
    timeout queue 100s
    stats enable
    stats hide-version
    stats refresh 30s
    stats show-node
    stats auth xxx:xxx
    stats uri /haproxy?stats

#
# We listen on a single front-end that is bound to both ports 80 and 443. The 443 bind applies the secure certificate.
#
# When a request is received, these parts of the request are scanned:
#  - The URL path
#  - The REFERER header (needed for some anonymous assets that are loaded at login time)
#  - The request body (for POST requests)
#
# If any one of these contains any of the client id strings present in the file "/etc/haproxy/aws_clients"
# then a match is found and the AWS back-end will be used.
# Otherwise the in-house back-end will be used - either secure or insecure depending on the request.
#
frontend wms
    maxconn 4000

    # Looks for the client id in the parameters
    acl aws_path urlp_sub -i -f /etc/haproxy/aws_clients
    # Looks for the client id in the URL path
    acl aws_path path_sub -i -f /etc/haproxy/aws_clients
    # Looks for the client id in the Referer header
    acl aws_referer req.hdr(Referer) -i -m sub -f /etc/haproxy/aws_clients
    # Looks for the client id in the Cookie header
    acl aws_cookie req.hdr(Cookie) -i -m sub -f /etc/haproxy/aws_clients
    # Looks for the client id in the POST body.
    acl aws_param req.body -i -m sub -f /etc/haproxy/aws_clients
    # Determines whether this is a secure or insecure request
    acl is_ssl dst_port eq 443

    bind *:80
    bind *:443 ssl crt /etc/ssl/haproxy_pvx.pem
    mode http

    # wms_cloud back-end is used if any one of the match criteria is met
    use_backend wms_cloud if aws_path or aws_referer or aws_cookie or aws_param
    # Otherwise use ovh either secure or insecure
    use_backend wms_ovh_ssl if is_ssl
    default_backend wms_ovh

# Back-end definitions - cloud, ovh insecure, ovh secure
backend wms_cloud
    mode http
    option httpchk
    server wmscloud1 cloud.ourservice.net:443 check resolvers mydns ssl verify none

backend wms_ovh
    mode http
    server wms1 10.200.0.110:80

backend wms_ovh_ssl
    fullconn 4000
    mode http
    server wms1 10.200.0.110:443 ssl verify none maxconn 4000