Overview:
I managed to set up an HAproxy installation for use as a reverse proxy and, later, a load balancer. Technically everything works, but pages loaded through the proxy are extremely slow (multiple minutes for a simple WordPress site).
My setup:
HAproxy on FreeBSD 11 64-bit. It’s a root server with a 4-core Xeon 3.3 GHz, 32 GB memory and 1G/1G internet connection
Different webservers running FreeBSD 11 64-bit. Those are usually machines with two to four cores and 8 to 16 GB of memory and 1G/1G internet connection.
The servers are not physically at the same location. I use OpenVPN to tie them into a private network. The ping between the HAproxy and the web servers is a stable 20 ms or so.
OpenVPN runs in UDP mode. Everything is pretty much default config.
All involved servers have tons of free resources left and are not busy at all. The HAproxy server isn’t doing anything other than running HAproxy and acting as the OpenVPN server.
My problem:
I tried to reverse-proxy three different existing websites through the new HAproxy machine. When I access the websites through the web server’s public IP they load in less than a second. When I load them through the HAproxy machine they take up to 11 minutes to finish loading.
Here’s an example of a WordPress site being loaded through HAproxy:
I tried to wget the style.css?ver=4.2 as per your suggestion. It starts off at 20kB/s and then drops down to 100B/s within three seconds. The ETA increases to almost 60 minutes.
I couldn’t start HAproxy with the mss 1200 setting in the backend. Apparently it’s an unknown keyword. I assume that’s the “not supported by FreeBSD” thing you were talking about.
After that I used the --tun-mtu 1500 --fragment 1300 --mssfix settings for OpenVPN as you suggested but nothing changed. (I did ensure that the settings were applied and restarted all OpenVPN instances).
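One way to check whether MTU is really involved is to probe the path with the Don't Fragment bit set. This is only a minimal sketch, assuming IPv4 and FreeBSD's ping (where -D sets the DF bit); the function names and the list of candidate MTUs are made up for illustration:

```shell
# ICMP payload size that produces an IPv4 packet of the given MTU
# (20 bytes IP header + 8 bytes ICMP header = 28 bytes overhead).
icmp_payload() {
    echo $(( $1 - 28 ))
}

# $1 = host; remaining arguments = candidate MTUs, largest first.
# Prints the first MTU for which a DF-flagged ping gets a reply.
largest_working_mtu() {
    host=$1
    shift
    for mtu in "$@"; do
        if ping -D -c 1 -s "$(icmp_payload "$mtu")" "$host" >/dev/null 2>&1; then
            echo "$mtu"
            return 0
        fi
    done
    return 1
}

# Example call (not executed here):
#   largest_working_mtu 10.8.0.18 1500 1400 1300 1200
```

If the largest MTU that gets through is well below what OpenVPN is configured for, that would point at fragmentation; if 1500 works on the public path, MTU is probably not the culprit.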
One more thing I did was trying everything outside of OpenVPN. What I mean by that is that I use the web server’s public IP address in the HAproxy backend config instead of the VPN IP. The result is exactly the same -> It takes forever to load.
The root cause is your Switzerland server embedded.simulton.com, which is serving those CSS and JavaScript files terribly slowly, from multiple locations:
Well, embedded.simulton.com points to the HAproxy. The reason you got a timeout is because I was working on it. It’s a Wordpress site so it’s not easy to give you access to both the direct line and through the HAproxy as the Wordpress site relies on the base URL.
The HAproxy is in Switzerland. The webserver is in Holland. The same webserver also serves other websites such as fwtools.embedded.pro (yet another Wordpress).
No, I don’t have the problem at all without HAproxy. If I remove the HAproxy and change the DNS record to the web server’s public IP everything loads in less than a second. It’s the same with the Jenkins dashboard and any other web site I tried to route through the HAproxy.
Then there was some kind of misunderstanding here.
When I requested the wget output I meant: do it on the haproxy server, but NOT THROUGH haproxy itself. Instead, query the backend directly from the same system. This way we are testing from one datacenter to the other, through OpenVPN but without haproxy.
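The same comparison can also be done with curl, which prints timing figures directly. A rough sketch, assuming curl is installed on the haproxy box; the helper name time_fetch is hypothetical, and the URLs in the comments are the ones from this thread with the proxy path assumed to match:

```shell
# $1 = label, $2 = URL.
# Prints total time in seconds and average download speed in bytes/s.
time_fetch() {
    result=$(curl -s -o /dev/null --max-time 60 \
        -w '%{time_total} %{speed_download}' "$2") || result="failed"
    printf '%s: %s\n' "$1" "$result"
}

# Example calls, all run ON the haproxy machine (not executed here):
#   time_fetch public "http://84.22.111.55/wp-content/themes/rtpanel/style.css?ver=4.2"
#   time_fetch vpn    "http://10.8.0.18/wp-content/themes/rtpanel/style.css?ver=4.2"
#   time_fetch proxy  "http://embedded.simulton.com/wp-content/themes/rtpanel/style.css?ver=4.2"
```

If the first two are fast and only the third is slow, the backend and the tunnel are fine and the problem sits on the client-facing side of the haproxy machine.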
Also please provide:
the output of haproxy -vv
confirm the system and userspace CPU load
logs: the few lines you provided show request with expected values (below 100 ms); do you see any logs with spikes in those numbers?
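Scanning for spikes by eye is tedious, so here is a small filter that could do it. It is a sketch that assumes the standard HTTP log format (option httplog), where the timers appear as the first field made of five slash-separated numbers and the last one is the total time in ms; the function name slow_requests is made up:

```shell
# $1 = threshold in ms; haproxy log lines are read from stdin.
# Prints every line whose total time exceeds the threshold.
slow_requests() {
    awk -v limit="$1" '
    {
        for (i = 1; i <= NF; i++) {
            # The timers field is the first one with exactly 5 numeric parts.
            if (split($i, t, "/") != 5) continue
            ok = 1
            for (j = 1; j <= 5; j++)
                if (t[j] !~ /^[+-]?[0-9]+$/) ok = 0
            if (!ok) continue
            total = t[5]
            sub(/^\+/, "", total)   # a leading "+" appears with option logasap
            if (total + 0 > limit + 0) print total " ms: " $0
            break
        }
    }'
}

# Example call (log path is an assumption, not executed here):
#   slow_requests 1000 < /var/log/haproxy.log
```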
Here’s the result when I use wget to download said file. First over the web server’s public IP and then over the VPN. They both download instantly (this is being executed on the HAproxy machine):
root@hydrogen:~ # wget "84.22.111.55/wp-content/themes/rtpanel/style.css?ver=4.2"
--2018-02-28 21:07:53-- http://84.22.111.55/wp-content/themes/rtpanel/style.css?ver=4.2
Connecting to 84.22.111.55:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/css]
Saving to: 'style.css?ver=4.2.3'
style.css?ver=4.2.3 [ <=> ] 30.89K --.-KB/s in 0.02s
2018-02-28 21:07:54 (1.32 MB/s) - 'style.css?ver=4.2.3' saved [199201]
root@hydrogen:~ # wget "10.8.0.18/wp-content/themes/rtpanel/style.css?ver=4.2"
--2018-02-28 21:08:19-- http://10.8.0.18/wp-content/themes/rtpanel/style.css?ver=4.2
Connecting to 10.8.0.18:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/css]
Saving to: 'style.css?ver=4.2.4'
style.css?ver=4.2.4 [ <=> ] 30.89K --.-KB/s in 0.04s
2018-02-28 21:08:19 (692 KB/s) - 'style.css?ver=4.2.4' saved [199201]
Everything seems in order there.
Here’s haproxy -vv:
root@hydrogen:~ # haproxy -vv
HA-Proxy version 1.7.9 2017/08/18
Copyright 2000-2017 Willy Tarreau <willy@haproxy.org>
Build options :
TARGET = freebsd
CPU = generic
CC = cc
CFLAGS = -O2 -pipe -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -DFREEBSD_PORTS
OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_CPU_AFFINITY=1 USE_ACCEPT4=1 USE_REGPARM=1 USE_OPENSSL=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1
Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.40 2017-01-11
Running on PCRE version : 8.40 2017-01-11
PCRE library supports JIT : yes
Built without Lua support
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY
Available polling systems :
kqueue : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use kqueue.
Available filters :
[SPOE] spoe
[TRACE] trace
[COMP] compression
I can confirm that the CPU usage on the HAproxy machine is never above 0.5% (even when there are running requests) and there are 30 GB of memory free.
I scrolled through the logs (through a lot) and I never found any values higher than what we’ve seen so far. No spikes.
Indeed, everything seems alright with those outputs.
Let’s try some simple configuration changes to see if we can get the behavior to change (for the better or the worse):
put nokqueue into the global section
put both nokqueue and nopoll into the global section
change timeouts, use timeout client 15s and timeout server 30s
Make sure haproxy is properly restarted (no old haproxy process still running in the background with an old configuration).
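For reference, the changes above would look roughly like the fragment below. It is wrapped in a hypothetical shell helper (print_debug_settings) only so it can be printed and merged by hand; the frontend and backend sections of the real configuration are not shown:

```shell
# Prints the temporary debugging settings described above.
# The lines belong in the existing global and defaults sections.
print_debug_settings() {
    cat <<'EOF'
global
    nokqueue                # first test: disable the kqueue poller
    nopoll                  # second test only: disable poll() as well

defaults
    timeout client 15s
    timeout server 30s
EOF
}

# After editing, the full configuration can be checked before restarting.
# The path is an assumption (default of the FreeBSD port), not executed here:
#   haproxy -c -f /usr/local/etc/haproxy.conf
#   pgrep -lf haproxy      # look for leftover old processes
```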
If there is no effect with those changes, revert them. We are gonna need the big guns at this point:
since you are on FreeBSD 64-bit, strace is not an option. Install truss, attach it to the running haproxy process (truss -dfp <PID>), and try to convert the truss output to actual timestamps (unfortunately truss lacks this very obvious feature)
capture the frontend haproxy connection, something like tcpdump -pns0 -w frontend.cap host 127.0.0.1 and tcp port 80
capture the backend haproxy connection, something like tcpdump -pns0 -w backend-10.8.0.18.cap host 10.8.0.18 and tcp port 80
capture the haproxy log output
capture the http log of your nginx backend instance
Then make a single wget request through haproxy, and make sure you got all 5 outputs (truss output, frontend capture, backend capture, haproxy logs, backend server logs).
Both systems should be NTP synched, so that we can compare the logs on the haproxy box with those from the backend.
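For the truss timestamp conversion, something like the following might do. It is a sketch that assumes each truss -df line starts with an optional "PID:" column followed by the elapsed time in seconds, and that the attach time was noted with date +%s just before starting truss, so the result is only approximate. The function name truss_abs_time is made up:

```shell
# $1 = epoch seconds at which truss was attached; truss output on stdin.
# Prefixes each line with an approximate absolute epoch timestamp.
truss_abs_time() {
    awk -v start="$1" '
    {
        # The relative timestamp is expected in field 1, or field 2 with -f.
        for (i = 1; i <= 2 && i <= NF; i++) {
            if ($i ~ /^[0-9]+\.[0-9]+$/) {
                printf "%.3f %s\n", start + $i, $0
                next
            }
        }
        print   # lines without a timestamp are passed through unchanged
    }'
}

# Example session (not executed here):
#   start=$(date +%s); truss -dfp "$PID" -o truss.out
#   truss_abs_time "$start" < truss.out > truss.abs
```

The epoch values can then be lined up against the tcpdump captures, which record absolute times as well.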
In the meantime I set up FreeBSD 11 on a VPS with a 1G/1G internet connection. I copied the exact same HAproxy configuration file and everything works extremely well (even over OpenVPN).
I currently suspect that there’s a network problem with the HAproxy machine. However, I’m not yet sure how to determine the problem. Even if I SCP files from it to another machine, the speed starts dropping with larger files (over about 1MB), and with very large files the speed drops to < 4kb/s and SCP reports - stalled -.
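Transfers that start fast and then stall often show up as TCP retransmissions, so comparing the counters before and after an SCP test may help narrow it down. A rough sketch, assuming the counter wording of FreeBSD's netstat -s -p tcp ("N data packets (M bytes)" and the same line ending in "retransmitted"); the function name retx_ratio is hypothetical:

```shell
# Reads `netstat -s -p tcp` output on stdin and prints the share of
# sent data packets that had to be retransmitted.
retx_ratio() {
    awk '
    /data packets? \([0-9]+ bytes?\)$/               { sent = $1 }
    /data packets? \([0-9]+ bytes?\) retransmitted$/ { retx = $1 }
    END {
        if (sent > 0)
            printf "%d sent, %d retransmitted (%.2f%%)\n", sent, retx, 100 * retx / sent
        else
            print "no data packets counted"
    }'
}

# Example call on the HAproxy machine (not executed here):
#   netstat -s -p tcp | retx_ratio
```

The counters are cumulative since boot, so the difference between two runs is what matters. A sharply rising retransmit share during the transfer would point at packet loss or filtering somewhere on the path rather than at HAproxy.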
I’d appreciate any help on how to debug this.
The next thing I’ll test is not having the HAproxy as the OpenVPN server. With the VPS that I set up for testing, the HAproxy instance was an OpenVPN client, not the server.
joel, you have no idea how helpful your thread is for me! I was going crazy over this EXTREMELY slow page load problem. I have not yet tried your advice but I will. Can I ask you questions in case something goes wrong? Thanks again.
Everything is working well now. There was a networking issue as suggested by @lukastribus. The problem was the firewall/NAT configuration. This wasn’t HAproxy related at all.
Thank you very much for your help - much appreciated!