HAProxy / iptables / VPS: works for a few minutes, then --> Error 503

Hi,

I am trying to use HAProxy on a VPS (Debian 8). iptables is used as the firewall.
I can’t modify /etc/sysctl.conf.
My application uses websockets and Redis to share data between the two backend servers.

When I connect to the VPS, the web site works for a few minutes, then: Error 503.
There is no error reported in my application logs.

Am I doing something wrong? Do I have to give up HAProxy, or give up the VPS so that I can modify sysctl?

Extracts of the HAProxy log file and config file follow.

Many thanks for your help.

Matthieu

(IP addresses masked with xxx)

Extract of the log file:

Jul 10 18:42:35 matthieu haproxy[558]: 149.91.89.xxx:46024 [10/Jul/2017:18:42:30.036] https_app~ http_app/server_app_2 0/4991/2/7/5007 400 228 - - --NI 690/690/687/100/0 0/466 "GET /socket.io/?EIO=3&tr$
Jul 10 18:42:35 matthieu haproxy[558]: 149.91.89.xxx:46008 [10/Jul/2017:18:42:30.036] https_app~ http_app/<NOSRV> 0/5007/-1/-1/5009 503 213 - - sQNN 689/689/687/0/0 0/472 "GET /socket.io/?EIO=3&transpo$
Jul 10 18:42:35 matthieu haproxy[558]: 149.91.89.xxx:46034 [10/Jul/2017:18:42:30.036] https_app~ http_app/<NOSRV> 0/5007/-1/-1/5009 503 213 - - sQNN 688/688/686/0/0 0/471 "POST /socket.io/?EIO=3&transp$

Config file:

global
        log /dev/log    local0
        log /dev/log    local1 notice
        # log 127.0.0.1:8008 local0
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin
        stats timeout 30s
        user haproxy
        group haproxy
        daemon
        maxconn 10000
        debug

        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL). This list is from:
        #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
        ssl-default-bind-options no-sslv3

        ssl-default-server-options no-sslv3
        ssl-default-server-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS

defaults http
        log             global
        mode            http
        option          httplog
        option          dontlognull
        retries         3
        option          redispatch
        option          http-server-close
#       option          forceclose
        option          forwardfor except 127.0.0.1
        timeout         connect 5s
        timeout         client 30s
        timeout         client-fin 30s
        timeout         tunnel 1h
        timeout         server 30s

#       default-server inter 1s rise 2 fall 1 on-marked-down shutdown-sessions
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http

#front-end
frontend https_app
        bind 0.0.0.0:443 ssl no-sslv3 crt /etc/ssl/letsencrypt
        default_backend  http_app

#back-end
backend http_app
        option httpchk HEAD /health
        http-check expect status 200
        http-request add-header X-Forwarded-Proto https if { ssl_fc }
        http-request set-header X-Forwarded-Port %[dst_port]
        balance roundrobin
        cookie SERVERID insert indirect nocache
        server server_app_1  127.0.0.1:3001 maxconn 100 check cookie server_app_1
        server server_app_2  127.0.0.1:3002 maxconn 100 check cookie server_app_2

What exact haproxy release is this?
Please provide the output of “haproxy -vv”

The “sQ” in the log line means a queue timeout, so this is about saturating the “maxconn 100” on your server configuration lines.
Raise the maxconn value (or the queue timeout), or troubleshoot why your queues are so full.
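
For illustration, a minimal sketch of those two knobs (the values below are placeholders, not recommendations):

    # defaults or backend section: how long a request may wait in the
    # queue before haproxy gives up and returns a 503
    timeout queue 30s

    # server lines: allow more concurrent connections per server
    server server_app_1 127.0.0.1:3001 maxconn 1000 check cookie server_app_1
    server server_app_2 127.0.0.1:3002 maxconn 1000 check cookie server_app_2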

This is about queuing in HAProxy, not the system, therefore sysctls are irrelevant to this problem.

I am using HAProxy 1.7.6.
The tests were carried out with only 2 connected sessions… so no load at all.
I will troubleshoot why the queues are so full.
Many thanks for your help.

Hi,

The queue problem is due to many “automatic connections” to my application once HAProxy is started (via sudo service haproxy restart). These connections happen even when no browser is connected to https://www.my-domain.com

I see a new connection on the server console approximately every second:

    a user connected
    a user connected
    a user connected

and as many “user disconnected” messages when HAProxy is stopped.

I ran several tests, such as restricting the socket.io origins and changing HAProxy parameters, without success.
I do not see this behaviour with the pure Node-based application.

Here are the code extracts:

    // client side
    const socket = io('https://www.my-domain.com')

    // server side @ https://www.my-domain.com
    socketServer.on('connection', function(socket){
        console.log('a user connected')
        socket.on('disconnect', function(){ console.log('user disconnected') })
    })

Matthieu

Can you confirm that you are not using compression in haproxy (not traffic passing through haproxy, I mean haproxy actively compressing content - the “compression” keyword needs to appear in the config for this)? If yes, then a bug could be the issue, which is fixed in 1.7.8.

If you don’t have “compression” in your config, then that’s not it.

Can you provide the log output from when haproxy closes? I would like to see the timings of the requests.

Well, you do have health checks configured, so that is the reason haproxy continually makes requests to your servers, but that should not affect maxconn at all. You can try disabling the health checks.

Many thanks for your reply.

I am not using compression.
My hosting provider contacted me this morning because there were 50000 conntrack entries on my VPS.
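
For reference, a quick way to check the conntrack usage on such a box (assuming the nf_conntrack module is loaded; these proc paths are the usual ones on a Debian 8 kernel):

    cat /proc/sys/net/netfilter/nf_conntrack_count
    cat /proc/sys/net/netfilter/nf_conntrack_max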

I have commented out the health checks and the load balancing; only the first server is left uncommented:

# option httpchk HEAD /health
# http-check expect status 200
server server_app_1  127.0.0.1:3001 maxconn 100   # check cookie server_app_1

The multiple connections are still present.

Here are the HAProxy logs up to the moment it was stopped:

    Jul 11 18:11:37 matthieu haproxy[455]: 149.91.89.xxx:42526 [11/Jul/2017:18:11:37.164] https_app~ http_app/server_app_1 0/0/0/93/93 200 225 - - --NI 12/12/12/12/0 0/0 "GET /socket.io/?$
    Jul 11 18:11:39 matthieu haproxy[455]: 149.91.89.xxx:42572 [11/Jul/2017:18:11:39.190] https_app~ http_app/server_app_1 0/0/1/1/2 200 324 - - --NI 12/12/12/12/0 0/0 "GET /socket.io/?EI$
    Jul 11 18:11:39 matthieu haproxy[455]: 149.91.89.xxx:42578 [11/Jul/2017:18:11:39.211] https_app~ http_app/server_app_1 0/0/0/6/6 200 226 - - --NI 13/13/13/13/0 0/0 "GET /socket.io/?EI$
    Jul 11 18:11:39 matthieu haproxy[455]: 149.91.89.xxx:42584 [11/Jul/2017:18:11:39.228] https_app~ http_app/server_app_1 0/0/0/90/90 200 225 - - --NI 13/13/13/13/0 0/0 "GET /socket.io/?$
    Jul 11 18:11:41 matthieu haproxy[455]: 149.91.89.xxx:42666 [11/Jul/2017:18:11:41.252] https_app~ http_app/server_app_1 0/0/1/1/2 200 324 - - --NI 13/13/13/13/0 0/0 "GET /socket.io/?EI$
    Jul 11 18:11:41 matthieu haproxy[455]: 149.91.89.xxx:42672 [11/Jul/2017:18:11:41.269] https_app~ http_app/server_app_1 0/0/0/6/6 200 226 - - --NI 14/14/14/14/0 0/0 "GET /socket.io/?EI$
    Jul 11 18:11:41 matthieu haproxy-systemd-wrapper[451]: haproxy-systemd-wrapper: SIGTERM -> 455.
    Jul 11 18:11:41 matthieu haproxy-systemd-wrapper[451]: haproxy-systemd-wrapper: exit, haproxy RC=0

Do you have all the required information?

No idea, we will have to take a look at the actual traffic.

Capture the backend traffic on haproxy with tcpdump:

    tcpdump -i lo -p -s0 -w backend-traffic.cap 'tcp port 3001 or tcp port 3002'
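
The capture file can then be inspected offline, for example:

    tcpdump -r backend-traffic.cap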

Are all these socket.io requests ("GET /socket.io/?$) normal?
It seems they are identified as client connections to the server.

Do you see any point in disabling conntrack in iptables?

I will try to capture the backend traffic with tcpdump.

Many thanks for your help.

Those are requests from actual clients; you can see the source IP in the log file. It is not haproxy that generates those requests.

I apologize, I am not sure I understand your comment:

  • no browser was open while the log file above was being acquired, so there should have been no clients
  • the IP address 149.91.89.xxx is the server's
  • I thought the use of a server port such as :42672 was required by the reverse proxy to relay between the frontend and the backend

Am I wrong?

If that IP is the server's, then it is the server that keeps connecting to port 443. Maybe you have some local health check of port 443?

I don’t understand the question. 42672 is the source port of the request, 149.91.89.xxx is the source IP.

Many thanks Lukas.

I don't understand the question. 42672 is the source port of the request, 149.91.89.xxx is the source IP.

I just wanted to say that, to my mind, requests coming from high port numbers of the server, such as :42672, could have been made via HAProxy.

I did tests without health checks, or at least I tried to, by commenting out the following lines of the HAProxy config file:

# option httpchk HEAD /health
# http-check expect status 200
server server_app_1  127.0.0.1:3001 maxconn 100   # check cookie server_app_1

The websocket connection event was still being fired (the frequency of new connections is around 1 Hz).
I will run these tests once again to double-check.

Apart from HAProxy, do you see any service, perhaps related to iptables or the VPS firewall, that could be responsible for these health checks?

Could be haproxy, if you have configured haproxy to connect to itself (but it doesn’t seem like it from the configuration you provided here).
Are there additional sections in your haproxy configuration that you did not share here?

This is a user-space application on your VPS that connects to port 443. While haproxy is running and connections are queued up, use the following command to print all connections along with the owning application:

    sudo netstat -tnp
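
For example, to narrow the output down to connections involving port 443 (an illustrative filter, not from the original exchange):

    sudo netstat -tnp | grep ':443'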

I did not hide any configuration parameters 🙂

Thanks to your help, Lukas, the origin of the problem has been identified:

It comes from a plugin attached to the JavaScript framework I am using, which is called every second as soon as HAProxy and my application are both started.
The plugin is in charge of the socket.io handshakes, so it creates a new user every second and saturates the server after a few minutes.
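
For illustration, the shape of the fix (a minimal sketch with a hypothetical helper, not the actual plugin code): create the socket once and reuse it instead of opening a new one on every tick.

    // hypothetical helper - create one socket and reuse it,
    // so only one socket.io handshake happens per client
    let socket = null
    function getSocket () {
        if (!socket) {
            socket = io('https://www.my-domain.com')
        }
        return socket
    }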

I will give some more information once the problem is solved.

I have modified my application so as to avoid using the plugin. The problem is solved!
Many thanks.