Hello,
I have a strange errors occurring only on one specific HTTP endpoint and only with some backends.
I have a Haproxy in front of 4 nginx in front of one python application. Two nginx backends are using https (and are in docker containers, but that should not be relevent ^^) and other two are using http (and not in docker containers).
Configuration :
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon# Default SSL material locations ca-base /etc/ssl/certs crt-base /etc/ssl/private ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS ssl-default-bind-options no-sslv3
defaults
log global
mode http
option httplog
option dontlognull
option dontlog-normal
option http-ignore-probes
timeout connect 5000
timeout client 50000
timeout server 50000
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.httpbackend pc-backend
balance roundrobinserver -1 185.:5000 check ssl verify none weight 25 server -2 34.:5000 check ssl verify none weight 25 server -3 34.:80 check weight 25 server -4 54.:80 check weight 25
frontend http
bind *:80
mode http
option forwardfor
default_backend pc-backendfrontend https
bind *:443 ssl crt /etc/apache2/ssl/.key.pem
mode http
option forwardfordefault_backend pc-backend
listen stats
bind *:8999 ssl crt /etc/apache2/ssl/.key.pem
mode http
stats enable
stats realm Haproxy\ Statistics
stats uri /haproxy_stats
Some specific queries (on /v2/batch) are sometime (not always) failing with the following logs, only on server -1 and -2 (with https). (Image #1)
On nginx side, requests are fine and there are no errors. (Image #2)
I didn’t manage to reproduce those queries (it’s in production and there is too many queries / seconds), but I was able to capture a working and failing transaction on the haproxy machine and on the backend:
I put all images here since I’m limited as a new user: https://imgur.com/a/JyQsi
The only difference seems to be the [FIN, ACK] packet from the backend arriving after (working) or before (error) the [RST, ACK] sent from HAProxy, on HAProxy side.
I don’t really understand why queries are failling, this is the only difference I see. Do someone have any idea ?
Thanks!