Hello everyone, I'm having a hard time figuring out what is causing this behavior, and so far I've got nothing. I'll describe the scenario below:
All the services we control have HAProxy acting as a reverse proxy. The problem is: one service is trying to communicate with another and gets a 503 response from HAProxy, even though they can ping, traceroute, etc. The error happens when my Lua script sends a request to the other service and receives a 503 even with all servers up.
Server1 → the server with the Lua script, the one sending the request (origin)
Server2 → the server receiving the request (destination)
These servers are EC2 instances; we control the ACL rules, and both can be reached over ICMP or HTTP, so they can communicate. As I explained, they also have HAProxy in front of their containers. On Server1 I have a Lua script checking backend health: if a backend is going down, the Lua script sends a GET request to an endpoint on Server2. The problem happens in that request between Server1 and Server2: when the Lua script sends it, I get a 503, and that is the part I can't figure out.
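For context, here is a simplified sketch of what the script does (this is not the real script; the backend/server names and the URL are placeholders matching the anonymized config below). It runs as a background task and calls Server2's endpoint through HAProxy's built-in Lua httpclient, which is what produces the `<HTTPCLIENT>` entries in the log:

```lua
-- Simplified sketch, not the actual script: watch a backend's health and
-- notify Server2 via HAProxy's Lua httpclient when it is not UP.
core.register_task(function()
    while true do
        local srv = core.backends["main"].servers["main-service"]
        if srv and srv:get_stats()["status"] ~= "UP" then
            local httpclient = core.httpclient()
            local res = httpclient:get{
                url = "https://url/api/v3/haproxy/",  -- Server2 endpoint (anonymized)
            }
            -- this is where the 503 shows up, even though Server2 is reachable
            core.Info("notify status: " .. tostring(res and res.status))
        end
        core.msleep(5000)  -- check every 5 seconds
    end
end)
```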
Here is the error I get on Server1:
-:- [07/Oct/2025:18:17:17.487] <HTTPCLIENT> -/- 3/0/-1/-1/1 503 217 - - SC-- 1/0/0/0/3 0/0 {64:ff9b::36cf:a2c7} "GET https://url/api/v3/haproxy/ HTTP/1.1"
Looking at the HAProxy logs on Server2, the request never arrives, so there is no log entry on Server2. All of Server2's backends are up, and I was able to make curl requests from Server1 to Server2 both outside and inside the container:
Server1, outside of the container / inside the container:
curl -I --location -H "Content-Type: text/html" 'https://url/api/v3/haproxy/'
Server2 logs:
ip:5950 [04/Dec/2025:13:09:36.286] http-https-in~ devops/srv-devops 0/0/1/4/5 401 269 - - ---- 14/14/1/1/0 0/0 "HEAD /api/v3/haproxy/ HTTP/1.1"
And this error only occurs sometimes; in general everything works fine except in these cases. Server1 is running HAProxy 2.8.15, Server2 HAProxy 2.6.
I'm aware of what the documentation says about this kind of error:
SC The server or an equipment between it and haproxy explicitly refused
the TCP connection (the proxy received a TCP RST or an ICMP message
in return). Under some circumstances, it can also be the network
stack telling the proxy that the server is unreachable (e.g. no route,
or no ARP response on local network). When this happens in HTTP mode,
the status code is likely a 502 or 503 here.
But our EC2 instances don't have any kind of firewall between or inside the machines (even ufw is disabled); we only use ACLs. Here are the config files of both servers:
Server1:
global
log stdout format raw local0 info
maxconn 200000
user root
group root
lua-load /usr/local/etc/haproxy/script-lua.lua
http-errors custom_errors
errorfile 503 /usr/local/etc/haproxy/errors/503.http
defaults
mode http
log global
log-format '{"host":"%H","time":"%Tl","totalTime":"%Tt","serverTime":"%Tr","client_ip":"%ci","backend":"%b","frontend":"%ft","server":"%s","upload":"%U","download":"%B","statusCode":"%ST","method":"%HM","uri":"%[capture.req.uri,json(utf8s)]","body":"%[capture.req.hdr(0),json(utf8s)]"}'
timeout tunnel 12h
option dontlognull
retries 999
option redispatch
timeout connect 100000
timeout client 200000
timeout server 200000
resolvers docker_resolver
nameserver dns 127.0.0.11:53
parse-resolv-conf
hold valid 15s
hold other 30s
hold refused 30s
hold nx 30s
hold timeout 30s
hold obsolete 30s
resolve_retries 3
timeout retry 1s
timeout resolve 1s
listen stats
bind *:1900
http-request use-service prometheus-exporter if { path /metrics }
stats enable
stats refresh 10s
stats show-node
stats uri /stats
frontend main_ssl
bind *:443 ssl crt /ssl/certkey.pem alpn h2,http/1.1 #https portal
bind *:9994 ssl crt /ssl/certkey.pem alpn h2,http/1.1 #https main
errorfiles custom_errors
http-response return status 503 default-errorfiles if { status 503 }
option http-buffer-request
declare capture request len 10000
http-request capture req.body id 0
acl isPortal dst_port 443
acl isMain dst_port 9994
use_backend main if isMain
use_backend portal if isPortal
default_backend portal
frontend main-front
bind *:80 #http portal
bind *:9090 #http main
errorfiles custom_errors
http-response return status 503 default-errorfiles if { status 503 }
option http-buffer-request
declare capture request len 10000
http-request capture req.body id 0
acl isPortal dst_port 80
acl isMain dst_port 9090
use_backend main if isMain
use_backend portal if isPortal
default_backend portal
backend portal
mode http
compression algo gzip
compression type text/html text/plain text/css application/json
server portal portal:80 check resolvers docker_resolver init-addr last,127.0.0.1
backend main
mode http
compression algo gzip
compression type text/html text/plain text/css application/json
server main-service main-service:80 check resolvers docker_resolver init-addr last,127.0.0.1
Server2:
global
log 127.0.0.1 local0 debug
user root
group root
# Default settings
defaults
log stdout format raw local0 debug
mode http
option httplog
timeout tunnel 12h
option dontlognull
retries 3
option forwardfor
option redispatch
timeout connect 100000
timeout client 300000
timeout server 300000
maxconn 15000
option forwardfor
resolvers docker_resolver
nameserver dns 127.0.0.11:53
parse-resolv-conf
hold valid 15s
hold other 30s
hold refused 30s
hold nx 30s
hold timeout 30s
hold obsolete 30s
resolve_retries 3
timeout retry 1s
timeout resolve 1s
# Front-end setup
frontend http-https-in
bind *:443 ssl crt /certs/cert.pem
bind *:80
use_backend devops if { hdr_dom(host) -i url }
use_backend nodered if { hdr_dom(host) -i url }
default_backend nodered
backend devops
server srv-devops nr-devops:1880 check resolvers docker_resolver init-addr last,127.0.0.1
backend nodered
server srv-nodered node-red:1880 check resolvers docker_resolver init-addr last,127.0.0.1
Note: if you find any errors in the config files, it's because I had to adapt some settings to avoid exposing the real ones, but essentially they are the same.
Does anyone have any clue why this happens?