Backend Server Timeouts

phinel · May 16, 2017, 1:37pm

Hello

We use haproxy together with keepalived as an high available loadbalancer

The current versions are:
Linux: Ubuntu 16.04 LTS
haproxy: 1.6.9
keepalived: 1.2.19

We are using haproxy since summer of last year to deploy a http-site to customers. At the end of 2016, the problems connecting to the backend application began and the users are experiencing timeouts.

There are 4 backend servers and haproxy is deploying the users on one of the 4 servers depending on a cookie set by the application.

We’re not sure if the problems are coming from haproxy or the backend application servers. Our problem is, that we can’t reproduce the problem the users are reporting. And there is actually no possibility to let some users connect directly to a backend server without haproxy (vpn limitation).

So i need to be sure that the problem is coming from backend application server.

I’ve now activated deeper logging in haproxy config:

option httplog
option log-separate-errors
option dontlog-normal

now, i have a lot of entries like the following in the logs:
May 16 15:31:29 xxxx haproxy[932]: xx.xx.xx.xx:xxxx [16/May/2017:15:29:21.324] server_80 server_80/server4 7/0/128231/1/128239 200 499 - - --VN 1670/906/485/198/1 0/0 “GET /di… HTTP/1.1”

i am not sure, but the time shown is very strange. It seems, that the backend web server is not giving a response in an acceptable amount of time.

Is this true? Can the problem be isolated at the backend webserver with those log entries?
Or is there an other possibility for our problem?

lukastribus · May 16, 2017, 1:51pm

According to this log, haproxy takes 128 seconds to actually connect to the backend. I suggest you check iptables/conntrack/networking between haproxy and the backend server.

phinel · May 16, 2017, 2:04pm

Thanks for your reply.

there is a lot of communication between haproxy and the backend servers and only some of them are experiencing those time values. Those entries are under 5% of the whole communication, but still more than 1000 a day.

i’m not sure if this is the problem the users are experiencing or if there is another problem…

But i think, the network between haproxy and the backend servers isn’t the problem, because it is very flat (just one 10GBit-switch, no HW-FW,…)

lukastribus · May 16, 2017, 3:26pm

I understand that, but there are still a lot of things to check. Most importantly dmesg output of all boxes, for any conntrack related issues, etc.

phinel · May 17, 2017, 6:14am

The backend webservers are windows machines and the web-application running on that boxes are “blackboxes” for me. So i have to prove that the loadbalancer isn’t the problem.

With dmesg on the loadbalancer i don’t see much special

conntrack is new for me, i don’t know much about this tool and how to run it to get the infos i need. Any suggestion?

Thanks

lukastribus · May 17, 2017, 7:25am

So mention of conntrack in the dmesg output?

If you need to proove that this is the application, I guess there is not much left to do other than permanently sniffing the backend traffic on the haproxy box (maybe use dumpcap with circular buffers), and analyze what happens when you have a reoccurrence of the problem.

Topic		Replies	Views
Strange timeout behavior during load testing Help!	7	1677	October 5, 2017
Debugging connection and session issues - Haproxy TCP Mode Help!	2	11069	March 24, 2016
Haproxy Layer 4 time out Help!	0	547	July 7, 2020
Latency added when backend emits response every few seconds Help!	3	1638	December 15, 2016
HAProxy Statics Help!	1	954	May 8, 2017

Backend Server Timeouts

Related topics