HAProxy community

HAProxy worker processes segfault when using h2 and under load

Hi,

We are using HAProxy 2.0.9 (the 11/15 release, but the issue happened with earlier versions as well) and are observing the worker processes segfault under high load:

Our setup: two Java processes (JDK 8, Tomcat 8.5, OkHttp as the HTTP client) communicating via HAProxy over cleartext HTTP/2 (h2c). To verify whether this issue had anything to do with Tomcat, we removed the HAProxy layer and, with the client hitting the server endpoint directly, observed no connection/stream issues.

Test flow:

JMeter -> Test Client (this leg is HTTP/1.1 and HAProxy is not involved)
Test Client -> HAProxy (the test client uses the OkHttp library for h2c calls; Spring Boot process running embedded Tomcat 8.5)
HAProxy -> Test Server (the server is also a Spring Boot process running embedded Tomcat 8.5 with the upgraded protocol and supports h2c)

Below is the error:

dockerd-current[2637011]: [ALERT] 287/160651 (96) : Current worker #1 (101) exited with code 139 (Segmentation fault)
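As an aside, exit code 139 encodes the fatal signal: codes above 128 mean "killed by signal (code - 128)", so 139 corresponds to signal 11, i.e. SIGSEGV, which matches the "Segmentation fault" in the log. A quick shell check (nothing HAProxy-specific here):

```shell
# Exit codes above 128 mean the process was killed by signal (code - 128).
echo $((139 - 128))   # prints 11
kill -l 11            # resolves signal 11 to its name (SEGV)
```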

This is what we are seeing in the backtrace:

#0 0x000055ba0910ad94 in b_alloc_margin (margin=0, buf=0x55ba09256380 <__compound_literal.7+128>) at include/common/buffer.h:165
160 cached = 0;
161 idx = pool_get_index(pool_head_buffer);
162 if (idx >= 0)
163 cached = pool_cache[tid][idx].count;
164
165 *buf = BUF_WANTED;
166
167 #ifndef CONFIG_HAP_LOCKLESS_POOLS
168 HA_SPIN_LOCK(POOL_LOCK, &pool_head_buffer->lock);
169 #endif
#1 h2_get_buf (h2c=h2c@entry=0x7f8aa403bdb0, bptr=bptr@entry=0x55ba09256380 <__compound_literal.7+128>) at src/mux_h2.c:436
436 unlikely((buf = b_alloc_margin(bptr, 0)) == NULL)) {
431 static inline struct buffer *h2_get_buf(struct h2c *h2c, struct buffer *bptr)
432 {
433 struct buffer *buf = NULL;
434
435 if (likely(!LIST_ADDED(&h2c->buf_wait.list)) &&
436 unlikely((buf = b_alloc_margin(bptr, 0)) == NULL)) {
437 h2c->buf_wait.target = h2c;
438 h2c->buf_wait.wakeup_cb = h2_buf_available;
439 HA_SPIN_LOCK(BUF_WQ_LOCK, &buffer_wq_lock);
440 LIST_ADDQ(&buffer_wq, &h2c->buf_wait.list);
#2 0x000055ba0910d0c2 in h2c_decode_headers (h2c=h2c@entry=0x7f8aa403bdb0, rxbuf=rxbuf@entry=0x55ba09256380 <__compound_literal.7+128>, flags=flags@entry=0x55ba0925636c <__compound_literal.7+108>, body_len=body_len@entry=0x55ba09256378 <__compound_literal.7+120>) at src/mux_h2.c:3689
3689 if (!h2_get_buf(h2c, rxbuf)) {
3684
3685 hdrs += 5; // stream dep = 4, weight = 1
3686 flen -= 5;
3687 }
3688
3689 if (!h2_get_buf(h2c, rxbuf)) {
3690 h2c->flags |= H2_CF_DEM_SALLOC;
3691 goto leave;
3692 }
3693
#3 0x000055ba09111d41 in h2c_bck_handle_headers (h2s=&lt;optimized out&gt;, h2c=0x7f8aa403bdb0) at src/mux_h2.c:2145
2145 error = h2c_decode_headers(h2c, &h2s->rxbuf, &h2s->flags, &h2s->body_len);
2140
2141 if (b_data(&h2c->dbuf) < h2c->dfl && !b_full(&h2c->dbuf))
2142 return NULL; // incomplete frame
2143
2144 if (h2s->st != H2_SS_CLOSED) {
2145 error = h2c_decode_headers(h2c, &h2s->rxbuf, &h2s->flags, &h2s->body_len);
2146 }
2147 else {
2148 /
the connection was already killed by an RST, let’s consume
2149 * the data and send another RST.

This is the HAProxy config in front of the server process:

frontend server_process_api_18604
bind *:18604 proto h2
mode http
option http-use-htx
use_backend server_process_api_18604

backend server_process_api_18604
balance roundrobin
mode http
option forwardfor
option http-use-htx
http-request set-header X-Forwarded-Port %[dst_port]
http-request add-header X-Forwarded-Proto http
server xxx_xxx_xx_xx_xxxxx xxx_xxx_xx_xx:xxxxx proto h2
server xxx_xxx_xx_xx_xxxxx xxx_xxx_xx_xx:xxxxx proto h2
server xxx_xxx_xx_xx_xxxxx xxx_xxx_xx_xx:xxxxx proto h2
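One detail worth noting for anyone reproducing this: `proto h2` on a cleartext `bind` or `server` line pins the connection to HTTP/2 with prior knowledge (no HTTP/1.1 Upgrade handshake), which matches OkHttp's H2_PRIOR_KNOWLEDGE mode on the client side. A minimal standalone sketch of such a config (section names, addresses, and timeouts below are placeholders, not taken from our actual setup):

```
defaults
    mode http
    option http-use-htx
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend h2c_front
    bind *:18604 proto h2
    use_backend h2c_back

backend h2c_back
    balance roundrobin
    server app1 127.0.0.1:8080 proto h2
```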

Please file a bug on GitHub:

Thanks, filed a bug report:
