We are having some troubles to debug a problem with our haproxy and our backend server.
Scenario: haproxy server are a Ubuntu 14.04 LTS. Haproxy was installed using apt-get and are currently in version 1.5.14. Right now, we have only one haproxy server and only one backend server with a php application whose can server 120 connections max. The problem happened when we received a huge traffic and the backend could not server all connections.
In this scenario HAproxy started queue process and timeout the packet in the queue and that’s expected but, after the timeout a empty message was send to the client and we get a lot of erros in the stats page. They showed in the nodes session, backend line. error column, conn column. No other erros are showed to request or response in stats page, in other words, Req and Resp Erros column are 0.
I researched about and did not find anything that explain what is this Conn Erros in stats page, however I think that is related to packets timeout in the queue, is that right?
Our haproxy are in TCP Mode and below are the config file.
global log /dev/log local0 log /dev/log local1 notice chroot /var/lib/haproxy stats socket /run/haproxy/admin.sock mode 660 level admin stats timeout 30s user haproxy group haproxy daemon maxconn 12288 ca-base /etc/ssl/certs crt-base /etc/ssl/private ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS ssl-default-bind-options no-sslv3 defaults log global mode http option httplog option dontlognull option abortonclose option httpclose timeout connect 5000 timeout client 50000 timeout server 50000 errorfile 400 /etc/haproxy/errors/400.http errorfile 403 /etc/haproxy/errors/403.http errorfile 408 /etc/haproxy/errors/408.http errorfile 500 /etc/haproxy/errors/500.http errorfile 502 /etc/haproxy/errors/502.http errorfile 503 /etc/haproxy/errors/503.http errorfile 504 /etc/haproxy/errors/504.http listen stats :9090 balance mode http stats enable stats auth edited:edited frontend localnodes bind *:80 option tcplog mode tcp default_backend nodes backend nodes mode tcp balance roundrobin server php-app 10.0.0.1:80 maxconn 120 check
Below are the kernel configuration we did to haproxy:
fs.file-max = 999999 net.ipv4.ip_local_port_range = 1024 65000 net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_fin_timeout = 10 net.core.somaxconn = 65535 net.ipv4.tcp_max_syn_backlog = 65535 net.ipv4.tcp_max_tw_buckets = 409600
Trying to reproduce this problem and identify if the issue are in haproxy or in the backend server, I create a staging scenario with one haproxy and one server (with a simple html file) and change the haproxy backend configuration to:
backend nodes mode tcp balance roundrobin fullconn 4 server php-app 10.0.0.2:80 maxconn check minconn 1 maxconn 1
I stressed the connection on this server but, even with packet timing out of the queue and a lot of conn errors, I received HTTP 200 ok and the html message from the server. To do this I used looped curl requisitions in a script in 4 servers. Every server made 100 requisitions.
But, when we use a browser (Chrome) to create concurrent connections in this scenario, only one computer received the html message the others received “connection timeout”. Even reloading the page, this behavior remained.
So, we did two tests with different conclusions and we could not realize why.
Then, anyone knows another way or more detailed way to debug what could happened? Everything I saw are about http mode and not apply to my case.
If someone knows what could be will help a lot too.
Thanks and regards,