HAProxy 1.5.18 CentOS 6 CPU Issue

Since installing kernel 2.6.32-696.18.7.el6 (Spectre/Meltdown mitigation) on our CentOS 6 HAProxy box, CPU usage has gone from around 20% to 100%.

The box is dedicated to running HAProxy (along with keepalived) and handles a relatively low 300-400 conns/s at around 40 Mbps, load balancing a handful of websites with SSL termination at HAProxy. CPU usage starts off fairly normal, but within a couple of hours it is just stuck at 100%.

Although not the latest and greatest, the box is a reasonably specced Dell PE1950 III, with 2 x Xeon X5460 quad-core processors at 3.16 GHz, 16 GB of RAM and Intel Gb NICs.

This had been running flawlessly for a year or so prior to installing 2.6.32-696.18.7.

Any experiences / advice would be greatly appreciated; for now I’ll roll back the kernel.

Are you absolutely sure that the increased CPU load is not just the initial SSL handshakes that need to be negotiated? When you downgraded the kernel, did the CPU not spike to 100% initially?

Does the CPU load come from user space or kernel?

I’m not really sure how I would tell. Pre-upgrade there were occasional spikes in CPU, but generally it was fairly low (below 20%). Now the HAProxy process is just stuck at 100%.

The majority of the cpu load is coming from user space.
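
For anyone else wondering how to tell: the us/sy figures in top’s Cpu(s) summary line, or the per-process %usr/%system columns from pidstat (sysstat package), show whether the time is being spent in user or kernel space. Something along these lines, with the PID just as an example:

top -p 4124
# or, if sysstat is installed:
pidstat -u -p 4124 1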

Thanks
James

If it stays at 100% and does not decrease to normal values after some time (tens of minutes), then it’s not the initial SSL negotiation.

Are you able to upgrade haproxy to 1.7.10, to make sure this new situation isn’t just triggering a dormant bug in haproxy?

Can you provide the output of “strace -tt” on haproxy, while in the 100% cpu load situation?

No, it’s more like it starts low and then ramps up to 100% after a couple of hours.

I’ll try upgrading to 1.7.10 on our QA box first and then see about upgrading to 1.7.10 in production. I was sticking to 1.5 as that’s what was in the repository for CentOS.
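
Since 1.7.10 isn’t in the stock CentOS 6 repository, my rough plan is to build it from source along these lines (the exact build options are just my assumption at this point, e.g. whether PCRE/zlib support is needed):

wget http://www.haproxy.org/download/1.7/src/haproxy-1.7.10.tar.gz
tar xzf haproxy-1.7.10.tar.gz
cd haproxy-1.7.10
make TARGET=linux2628 USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1
sudo make install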

I can attach to the process, but obviously the output will be huge and will possibly contain sensitive data. Is there something I can do with it?

I tried strace -cp:

strace -cp 4124
Process 4124 attached
^CProcess 4124 detached

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 41.27    0.006073           0    253351           epoll_wait
 14.82    0.002181           0     11840       206 write
 11.29    0.001662           0      9411      1944 sendto
  8.31    0.001223           0     29402      9043 read
  5.74    0.000845           0      3505      3452 connect
  5.72    0.000842           0     11992      3710 recvfrom
  3.59    0.000529           0      5780           close
  2.33    0.000343           0     10299           epoll_ctl
  1.86    0.000273           0      3424           socket
  1.30    0.000192           0      6682           setsockopt
  1.28    0.000189           0      4385      2020 accept4
  0.99    0.000145           0      2299         6 shutdown
  0.60    0.000089           0      1849           getsockname
  0.58    0.000085           0      2426           getsockopt
  0.31    0.000045           0      3424           fcntl
------ ----------- ----------- --------- --------- ----------------
100.00    0.014716                360069     20381 total

If this helps?

This could simply be a consequence of the patch you installed. Consider yourself lucky your system still boots! Didn’t you read about the Meltdown/Spectre patches before installing them?

I did, and I installed it on our QA and Dev environments first (which are nearly identical) to make sure the system was still operational. This does seem to be load related and, as you say, more than likely a consequence of the upgraded kernel.

Thanks

Meltdown/Spectre patches don’t make CPU load increase from 20% to 100%. To 30% maybe, but not to 100%. Either the Red Hat 2.6.32 Meltdown backport is buggy, we are triggering a haproxy bug, or the load simply changed.

Thanks Lukas, I can confirm the load (from an external point of view) hasn’t changed.

I have seen other issues relating to HAProxy mentioning high epoll_wait usage, so I wonder if this is relevant?

I don’t think so; high epoll usage is expected. Try the “noepoll” configuration directive to use a different code path, but epoll is the most efficient one on Linux.

If you see lower CPU usage with noepoll, that would be indicative of a haproxy bug.
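
For reference, noepoll goes in the global section of haproxy.cfg; a minimal sketch, assuming the rest of your global settings stay as they are:

global
    noepoll
    # existing global settings (daemon, maxconn, SSL tuning, ...) unchanged

haproxy needs a reload/restart after the change for the alternative polling mechanism to take effect.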

Thanks, I will try this and report back

I don’t know how long your strace ran, but you can clearly see that the 100% CPU could not be coming from the system side. You did, however, make 360K x 2 switches between user and kernel space, and it’s those switches that are penalized by the patches. YMMV, as they say.

I ran the strace for about 10 seconds.

Re-ran it for a minute:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 36.56    0.028022           0   1062162           epoll_wait
 16.75    0.012841           0     65988       931 write
 11.10    0.008504           0     50267      9032 sendto
  7.34    0.005624           0    159514     48444 read
  6.27    0.004806           0     31518           close
  5.89    0.004511           0     66079     20134 recvfrom
  4.31    0.003301           0     19345     19050 connect
  2.96    0.002271           0     52184           epoll_ctl
  2.29    0.001755           0     18928           socket
  2.01    0.001544           0     22775     10241 accept4
  1.66    0.001271           0     12503        25 shutdown
  1.17    0.000894           0     36565           setsockopt
  0.68    0.000519           0     18928           fcntl
  0.67    0.000511           0     13011           getsockopt
  0.35    0.000270           0     10026           getsockname
------ ----------- ----------- --------- --------- ----------------
100.00    0.076644               1639793    107857 total

I turned the noepoll option on. CPU usage seems worse than usual from the start, but I’ll see what it’s like in a couple of hours.

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.39    3.293753          89     36864           poll
  0.13    0.004175           0     20488      3083 sendto
  0.12    0.004077           0     19606       446 write
  0.09    0.003060           0     64347     18843 read
  0.07    0.002219           0     13458           close
  0.05    0.001588           0      8183      8025 connect
  0.04    0.001286           0     20659      8632 recvfrom
  0.03    0.000929           0     21920           setsockopt
  0.02    0.000796           0      7960           socket
  0.02    0.000736           0     10053      4486 accept4
  0.02    0.000646           0      5509        11 shutdown
  0.01    0.000244           0      7960           fcntl
  0.01    0.000205           0      5692           getsockopt
  0.00    0.000113           0      4228           getsockname
  0.00    0.000000           0        37           brk
  0.00    0.000000           0         1           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    3.313827                246965     43526 total

After a couple of hours, while this has reduced the CPU load a little (now around 98%), it hasn’t had much real impact.

I’ve switched the load over to the failover unit, which is sitting at around 20% CPU, like the primary was pre-update.

Below is the strace for 1 minute on the failover unit (identical spec and config).

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 84.82    0.163654           2     70863           epoll_wait
  3.35    0.006461           0     37519       761 write
  3.23    0.006225           0     31749      7790 sendto
  1.72    0.003314           0     17878           close
  1.69    0.003256           0     84927     25444 read
  1.68    0.003242           0     40658     12424 recvfrom
  0.95    0.001833           0     11529     11238 connect
  0.66    0.001271           0     35572           epoll_ctl
  0.56    0.001090           0     13065      6127 accept4
  0.41    0.000783           0     11032           socket
  0.34    0.000647           0     21296           setsockopt
  0.20    0.000387           0      6586         3 shutdown
  0.18    0.000347           0     11032           fcntl
  0.14    0.000261           0      8442           getsockopt
  0.09    0.000174           0      5277           getsockname
  0.00    0.000000           0        21           brk
------ ----------- ----------- --------- --------- ----------------
100.00    0.192945                407446     63787 total

It doesn’t seem like an epoll issue within haproxy then (noepoll performance is worse). It’s possible that the Meltdown mitigation causes some issue in the kernel with polling overall, but really I don’t have any idea at this point.

If you could make that jump to 1.7.10 that would be really helpful to exclude bugs in that old 1.5.18 release.

If 1.7.10 has the same issue, then we will have to take a deeper look at that strace, not a summary but the actual output of “strace -tt -p”. If you are concerned about posting it publicly (which is understandable), you can send it directly to @willy (I’m probably not able to spot the issue anyway).
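
Something along these lines will capture a raw trace to a file that you can review and scrub before sharing (the PID is the one from your earlier runs, the output path is just a placeholder):

strace -tt -p 4124 -o /tmp/haproxy-trace.txt
# let it run for a minute or so while the CPU is pegged, then Ctrl-C and inspect the file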

I’m seeing a huge number of poll/epoll calls compared to recv/send, so this makes me think of a stuck event somewhere. We’ve had two recent fixes in 1.7 for half-open connections which could trigger such an issue; I don’t know whether older versions are affected as well. One of them affects splicing and the other one client-fin. We backported the fix to 1.6, however.

Regardless of this, it seems strange that a kernel update revealed an existing bug, and that reverting it removes the bug. So it might well be that a kernel change has some effect (maybe just some timing effect triggering the issue).

As Lukas suggested, trying 1.7.10 would help a lot. If the problem disappears, we can try to see what fix could be missing from 1.5.

Thanks, is there anything I should be aware of before jumping from 1.5.18 to 1.7.10?

I’ve got it installed on our dev and QA environments now and, with the exception of a few minor config changes, all seems well. Just keen to get production up and running smoothly as well.