504 response and sR--

Hi,

We are seeing intermittent 504 responses and have identified a scenario that triggers them.

Each time, a slow client (mobile) made a >100 kB request. The backend responds rather quickly (~200 ms), but the total time is ~30 s.

Example:

Nov 20 16:51:32 cm-prod-haproxy-1-dc2 haproxy[14226]: [the ip]:56118 [20/Nov/2019:16:51:01.864] https-in~ www-xxx/cz-prod-web-3-dc2.xxx 0/0/1/200/30327 200 398636 - - ---- 464/464/40/13/0 0/0 "GET /api/1/android/es/xxx HTTP/1.1"
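
For reference, here is a minimal sketch of how we read the timers field in that line. It assumes the default httplog timer order TR/Tw/Tc/Tr/Ta documented for HAProxy 1.7 and later. The helper names `parse_timers` and `transfer_time_ms` are made up for this post and are not part of any HAProxy tooling.

```python
from typing import Dict, Optional

# Assumed order of the timers in the default httplog format (HAProxy 1.7+).
TIMER_NAMES = ("TR", "Tw", "Tc", "Tr", "Ta")


def parse_timers(field: str) -> Dict[str, int]:
    """Split a field such as '0/0/1/200/30327' into named timers (ms)."""
    values = [int(v) for v in field.split("/")]
    if len(values) != len(TIMER_NAMES):
        raise ValueError(f"expected {len(TIMER_NAMES)} timers, got {len(values)}")
    return dict(zip(TIMER_NAMES, values))


def transfer_time_ms(timers: Dict[str, int]) -> Optional[int]:
    """Approximate the response transfer time as Ta - (TR + Tw + Tc + Tr).

    Returns None when a timer is -1, i.e. that phase was never completed.
    """
    parts = [timers[name] for name in ("TR", "Tw", "Tc", "Tr")]
    if timers["Ta"] < 0 or any(p < 0 for p in parts):
        return None
    return timers["Ta"] - sum(parts)


first = parse_timers("0/0/1/200/30327")
print(first["Tr"], transfer_time_ms(first))  # prints: 200 30126
```

Read this way, the server answered in 200 ms and the remaining ~30 s were spent sending the response to the client.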

Then the next request, when made within a few seconds, returns a 504 with this odd sR-- termination state:

Example:

Nov 20 16:51:37 cm-prod-haproxy-1-dc2 haproxy[14226]: [the ip]:56118 [20/Nov/2019:16:51:37.957] https-in~ www-xxx/cz-prod-web-2-dc2.xxx 0/0/0/-1/0 504 214 - - sR-- 366/366/43/12/0 0/0 "GET /api/1/android/es/yyy HTTP/1.1"
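
To show why this state looks odd to us, here is a small sketch that decodes the first two characters of the termination state. The descriptions are paraphrased from our reading of the "Session state at disconnection" section of the HAProxy documentation and cover only a subset of the codes. The function name `describe_state` is hypothetical.

```python
# First character: what ended the session (subset, paraphrased).
CAUSE = {
    "C": "TCP session unexpectedly aborted by the client",
    "S": "TCP session unexpectedly aborted or refused by the server",
    "c": "client-side timeout expired",
    "s": "server-side timeout expired",
    "-": "normal session completion",
}

# Second character: what the proxy was doing at that moment (subset, paraphrased).
PHASE = {
    "R": "waiting for a complete, valid request from the client",
    "Q": "waiting in the queue for a connection slot",
    "C": "waiting for the connection to the server to establish",
    "H": "waiting for complete response headers from the server",
    "D": "in the data phase",
    "L": "still sending the last data to the client",
    "-": "normal completion after end of data transfer",
}


def describe_state(state: str) -> str:
    """Describe the first two characters of a 4-character termination state."""
    if len(state) != 4:
        raise ValueError("expected a 4-character state such as 'sR--'")
    cause = CAUSE.get(state[0], f"unknown cause {state[0]!r}")
    phase = PHASE.get(state[1], f"unknown phase {state[1]!r}")
    return f"{cause}; {phase}"


print(describe_state("sR--"))
```

If this reading is right, sR-- means a server-side timeout fired while HAProxy was still waiting for the request, which matches the Tr of -1 and the total time of 0 ms in the log line above.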

On the backend side (nginx), the connection is closed by HAProxy (status 499):

[the ip] - - [20/Nov/2019:16:51:37 +0100] "GET /api/1/android/es/yyy HTTP/1.1" 499 0 "-" "Xxx/5.24.0 (Android; 28 9)" "[the ip]" 

Still on the backend side (Rails server, unicorn), the request is fully processed.

But if the next request comes more than 40 s after the first one, everything is fine.

This is really weird. It is as if something about the first request affected the second one a moment later.

I found this thread, but the issue it describes is reportedly fixed (1.7.x, in March): "Intermittent 504 errors and sR-- after upgrade to 1.7.10"

Our conf is:

defaults
  log global
  maxconn 8000
  mode    http
  retries 3
  timeout client 10s
  timeout connect 5s
  timeout server 30s
  option httplog
  option redispatch
  option http-buffer-request
  balance roundrobin
  no option http-use-htx

And for nginx:

    keepalive_timeout 650;
    keepalive_requests 10000;