Server timeout past 60 seconds? (FIXED - php-fpm)

I have a backend with long running php scripts. Anything above 60s results in a 504 error.

HAProxy v1.8.15
defaults
mode http
log global
option httplog
option dontlognull
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 15m
timeout queue 1m
timeout connect 10s
timeout client 65s
timeout server 65s
timeout tunnel 65s
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000

I’ve tried tweaking the timeout client/server/tunnel, starting at 30 seconds and slowly incrementing up to 60 seconds. That works great. Anything above 60 seconds seems to be ignored, and I see a 504 error on the client.

Oct 28 01:48:33 localhost haproxy[34486]: [IP REDACTED]:17279 [28/Oct/2019:01:48:03.929] main~ hc/hc 0/0/2/-1/30005 504 194 - - sH-- 1/1/0/0/0 0/0 “GET /tools/z_scripts/timeout_test.php HTTP/1.1”
Oct 28 01:50:13 localhost haproxy[34610]: [IP REDACTED]:7454 [28/Oct/2019:01:49:28.560] main~ hc/hc 0/0/3/-1/45004 504 194 - - sH-- 3/3/0/0/0 0/0 “GET /tools/z_scripts/timeout_test.php HTTP/1.1”
Oct 28 01:52:22 localhost haproxy[34717]: [IP REDACTED]:3619 [28/Oct/2019:01:51:27.248] main~ hc/hc 0/0/2/-1/55004 504 194 - - sH-- 1/1/0/0/0 0/0 “GET /tools/z_scripts/timeout_test.php HTTP/1.1”
Oct 28 01:54:10 localhost haproxy[34805]: [IP REDACTED]:12123 [28/Oct/2019:01:53:10.755] main~ hc/hc 0/0/3/60059/60062 504 428 - - ---- 1/1/0/1/0 0/0 “GET /tools/z_scripts/timeout_test.php HTTP/1.1”

the first three are from sub-60 second timeout settings. the last one is using 65 seconds.
I’m pulling my hair out here… any ideas?

If you look at the first three lines you’ll notice the sH-- string, it is the termination state at disconnection. These are explained here https://cbonte.github.io/haproxy-dconv/2.0/configuration.html#8.5
s means server timeout hit
H means haproxy was waiting for a response.
For each of these lines you can see more or less the timeout you set for timeout server in ms (30005, 45004, 55004). The last one does not have sH, which means haproxy did not hit server timeout while waiting for a response. The most likely reason is that the 504 was sent by the backend server, when it hit its own timeout.

1 Like

Thanks Jerome. I suspected it was something on the server side, and you were right – it was!

We are running php-fpm, serving various scripts. The php scripts that start outputting data within a few seconds run just fine, even if they ran longer than 60 seconds. Some of them run for over 10 minutes!

The issue only affected scripts that ran longer for a minute before giving any output, for example long-running scripts that spit out JSON.

So, the solution:
There is a config directive called request_terminate_timeout: “The timeout for serving a single request after which the worker process will be killed”. By default, the directive is commented out (unset). The documentation states that the default setting is ‘0’, for unlimited time. So I assumed that since it was commented out, it would default to unlimited – right?

In reality, if unset (e.g. commented out), the default becomes 60 seconds. So while I assumed that php-fpm was allowing unlimited time for the script to complete, it was actually killing it after 60 seconds.

Un-commenting this directive fixed the issue.
request_terminate_timeout = 0