H2 frontend, HTTP/1.1 backend and corrupted content

Hey,

Today I tried HAProxy 1.8-rc1 with H2 and an HTTP/1.1 backend (Varnish 4.x). After first tests with various browsers, which complained about the content encoding of JS/CSS files and showed half-loaded images, I realized that with H2 the content is corrupted.

curl --http1.1 -s -H "Accept-encoding: gzip" https://assets-staging.dieblaue24.com/js/app-a2efa168d5e20b3ddec0dfdd7bdf7754.js | gunzip - && echo ok || echo error
ok

curl --http2 -s -H "Accept-encoding: gzip" https://assets-staging.dieblaue24.com/js/app-a2efa168d5e20b3ddec0dfdd7bdf7754.js | gunzip - && echo ok || echo error
...
gzip: stdin: unexpected end of file
error
haproxy -vv
HA-Proxy version 1.8-rc1-82913e4 2017/11/01
Copyright 2000-2017 Willy Tarreau <willy@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-null-dereference -Wno-unused-label
  OPTIONS = USE_LINUX_SPLICE=1 USE_LIBCRYPT=1 USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.1.0f  25 May 2017
Running on OpenSSL version : OpenSSL 1.1.0f  25 May 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with network namespace support.
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Encrypted password support via crypt(3): yes
Built with PCRE version : 8.39 2016-06-14
Running on PCRE version : 8.39 2016-06-14
PCRE library supports JIT : no (USE_PCRE_JIT not set)

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
  [SPOE] spoe
  [COMP] compression
  [TRACE] trace

And the corresponding Dockerfile:

FROM bitnami/minideb:stretch as build

RUN install_packages build-essential libssl-dev wget curl ca-certificates zlib1g-dev libpcre3-dev

ENV HAPROXY_VERSION haproxy-ss-20171101

RUN cd /var/tmp \
  && wget http://www.haproxy.org/download/1.8/src/snapshot/${HAPROXY_VERSION}.tar.gz  \
  && tar xvzf ${HAPROXY_VERSION}.tar.gz  \
  && cd $HAPROXY_VERSION \
  && make TARGET=linux2628 USE_LIBCRYPT=1 USE_LINUX_SPLICE=1 USE_OPENSSL=1 USE_PCRE=1 USE_ZLIB=1 \
  && make install \
  && rm -fr /var/tmp/haproxy*

FROM bitnami/minideb:stretch

RUN install_packages openssl busybox-syslogd gettext zlib1g libpcre3

COPY --from=build /usr/local/sbin/haproxy /usr/local/sbin/haproxy

Yeah, I noticed H2 corrupting the payload as well; I reported it last night:
https://www.mail-archive.com/haproxy@formilux.org/msg27665.html

I don't think it depends on gzip encoding; it's probably just that the corruption is easier to spot via the gzip decoder.
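
One gzip-independent way to check this (a small sketch reusing the asset URL from the report above; any static file would do) is to compare checksums of the raw bytes fetched over each protocol version:

curl --http1.1 -s https://assets-staging.dieblaue24.com/js/app-a2efa168d5e20b3ddec0dfdd7bdf7754.js | sha256sum
curl --http2 -s https://assets-staging.dieblaue24.com/js/app-a2efa168d5e20b3ddec0dfdd7bdf7754.js | sha256sum

If the two digests differ, the payload is being corrupted regardless of any content encoding.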

This was fixed in 4b75fffa2 (“BUG/MAJOR: buffers: fix get_buffer_nc() for data at end of buffer”).

You can pull the latest git tree or apply the patch:
http://git.haproxy.org/?p=haproxy.git;a=patch;h=4b75fffa2bb0c60f26affe3e784956a0b8087442
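
For reference, applying it on top of the snapshot tree from the Dockerfile above could look roughly like this (a sketch, assuming the same source directory and build flags as in that Dockerfile):

cd /var/tmp/haproxy-ss-20171101 \
  && curl -s 'http://git.haproxy.org/?p=haproxy.git;a=patch;h=4b75fffa2bb0c60f26affe3e784956a0b8087442' | patch -p1 \
  && make TARGET=linux2628 USE_LIBCRYPT=1 USE_LINUX_SPLICE=1 USE_OPENSSL=1 USE_PCRE=1 USE_ZLIB=1 \
  && make install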

Yes, can confirm :clap: :100:

Not sure if it's the same problem here.
Same config: H2 frontend => HTTP/1.1 backend.

My haproxy -vv (with the h2 patch):

And when I curl my h2 frontend:

The HTML content does not seem corrupted, but I can't load the page in my browsers:
NS_ERROR_NET_INADEQUATE_SECURITY in firefox
ERR_SPDY_INADEQUATE_TRANSPORT_SECURITY in chromium

Here is my HAProxy frontend configuration:

Do you think it's the same problem? Do I need to create a new topic?

Thanks a lot!

No, this is just a configuration problem.

The inadequate-security messages from the browsers imply that something is wrong with your cipher configuration. HTTP/2 does not allow certain ciphers; see RFC 7540 Appendix A [1].

You are probably using an RSA certificate, and your cipher configuration only allows the following ciphers for RSA certificates, all of them blacklisted in HTTP/2:

OpenSSL name               IANA name
ECDHE-RSA-AES256-SHA384    TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384
ECDHE-RSA-AES256-SHA       TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
DHE-RSA-AES256-SHA         TLS_DHE_RSA_WITH_AES_256_CBC_SHA

Pick cipher suites from the Mozilla recommendations [2], from either the modern or the intermediate compatibility level. This will permit the ECDHE-RSA-AES*-GCM-* and ECDHE-RSA-CHACHA20-POLY1305 ciphers, which the browsers will then use.
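
For example, a bind line permitting those suites could look like this (a minimal sketch, not your actual configuration: the frontend name and certificate path are placeholders, and the cipher string is a shortened subset of the Mozilla intermediate list):

frontend https-in
    # advertise h2 via ALPN; these GCM/CHACHA20 suites are not on the RFC 7540 blacklist
    bind :443 ssl crt /etc/haproxy/site.pem alpn h2,http/1.1 ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305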

If browsers then work but curl still shows this error, please report it in a new thread.

[1] https://tools.ietf.org/html/rfc7540#appendix-A
[2] https://wiki.mozilla.org/Security/Server_Side_TLS

Thank you for your response. I took the “Intermediate compatibility” cipher suite and now it works with Firefox.
But I still have the same problem with curl, and all requests from Chromium are stuck in “pending”.

OK, please report your issue, along with a detailed report, to the mailing list. You can just send your message to (no need to subscribe):
haproxy@formilux.org

Can you try with noreuseport in your global section? That is just to make sure there are not multiple haproxy instances running. If haproxy fails to start when using noreuseport, it means that you have an old haproxy instance running, which may be causing this issue.
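
Something like this (a sketch showing only the relevant keyword; the rest of your global section stays as it is):

global
    # disable SO_REUSEPORT so startup fails if another instance still holds the ports
    noreuseport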

Please include in your report:

  • the output of haproxy -vv
  • the entire configuration
  • the debug output of the failing curl call in verbose mode (like curl -vv https://yoursite.com/ >content - we don’t care about the content, just the debug output)
  • some additional information about the content

OT:

@lukastribus

So what's the usual way to report issues? Filing them here in the forum, where you filter out what's worth posting to the mailing list? :wink: It's a bit confusing and annoying to search two sources for possibly known issues. I'd prefer using GitHub for that (it's 2017).

We have been talking about setting up a GitHub repository to use its issue tracker for bugs, but for now that has not happened.

Development happens on the mailing list only, which is therefore the proper way to report bugs and feature requests. The mailing list is also where most advanced users are and where you are most likely to get a good, educated answer to any question.

And really, if you are using -dev or -rc releases, you should be on the mailing list, and on the mailing list only.

If I understand correctly, this Discourse forum was mostly intended for users to share sample configurations and Lua scripts, as the mailing list does not really excel at that [1]. Now, after two years, we have one configuration sample and a handful of Lua scripts; everything else is basically one-time posters asking for configuration advice (often without reading the documentation or making any kind of effort). The Discourse ecosystem does not work properly for HAProxy, in my opinion, especially when used in parallel with the mailing list.

Flagging @willy

[1] https://www.mail-archive.com/haproxy@formilux.org/msg20943.html

Well, I can't tell whether it works well or not; however, what I'm absolutely certain about is that if it works, it's 100% thanks to you, given that you respond to every single post, so thanks a lot for doing that.

I just think that there are different populations. Just like there are people following kernel-related stuff on the mailing lists, others on LWN and others seeking help on serverfault, I think there are different expectations that need to be addressed.

At the moment we have two discussion channels. That's few compared to some projects, but it's a lot for a moderately small project like ours. So in the end some users don't find exactly the format they're looking for, and we can hardly ask more people to follow and run more channels in various formats.

Despite this, I've noticed that we no longer see people complain about how they are served, so by adding Discourse we've probably reached what we need to reasonably satisfy most users.

I totally agree that we need to complete the GitHub migration, especially for the issues. The thing is that I'm often identified as the obvious person to deal with such things, but there are three important points to consider in order to understand why it ends up poorly each time I have to be associated with such operations:

  • I'm terribly bad at anything even remotely administrative and at processes. To give you an idea, it took me a year to cancel my ISP subscription after I moved, just because figuring out how to do it was too complicated for me, so I gave up.
  • I have a horrible relationship with browsers (I would say they have a horrible relationship with me, because apparently they are designed specifically to annoy me). Even here I'm typing this response in a highly inefficient 3cm*9cm text area while the window is in full screen, being very careful not to accidentally click anywhere, etc., and I'll copy-paste it into a temp file before posting, just in case it gets lost, so that I don't have to retype it. Yes, it's a huge pain for me.
  • It takes me a long time to learn and adapt to new tools and processes (strangely, see the first point above :-)).

Being on the critical path of the project, I can't conceive of wasting so much time dealing with things that are so complicated for me while other people have much less trouble with them.

I consider that we're a community: developers, users, bug reporters, people helping in their free (or even work) time, people writing blog articles to explain how to use certain features, those providing free services like Discourse, and people dealing with the minimally needed infrastructure and tools, often behind the curtain, as you, Cyril, Adrian and Benoit are doing, for example.

I'm all for encouraging the community to explore whatever different tools people see fit. However, my responsibility is also to warn people against the risk of fragmentation. At the very least we should avoid using different tools for the same task, especially when content is stored. An example is issue trackers: we don't want to force people to use multiple ones. And I consider that the opinion of the people who spend a lot of time dealing with issues is orders of magnitude more important than that of those reporting a bug and sometimes leaving without even saying thanks (and I'm among those dealing with bugs, so I claim my voice here).

For plain discussions it's a bit different. When you have to meet a friend, you can meet at many places, at your home or his. Here it's the same: we must not force people to go to a single place. But we must be clear with them about the type of service they can expect to find at each place.

What I think regarding our tools is that:

  • people reading Discourse regularly are those willing to poll for news, or who have spare time to read it. It's very convenient for first-time reporters, and it's convenient for indexing config excerpts. It's not suited to reporting bugs, but it is suited to deciding whether what they observe is a bug or not;
  • people on the mailing list are those who want to be notified asynchronously about the project's progress and news, and to be able to quickly sort out what they are and aren't interested in. Participating in design discussions is much easier there: you're not forced to use a slow, crippled browser, you can quote sentences in replies, etc., making the conversations more natural and much faster. Archives now exist (mail-archive.com) but are less convenient to search and link to than articles on the forum;
  • for issues, something like GitHub's issue tracker is really good. While I really hate their infamous web interface in general, for issues it's more or less acceptable, and it supports mail back-and-forth, which is critical for such tools yet not that common. However, we have to be aware that once we reopen it, it will become a discussion place, because users will report usage trouble there, considering they're hitting bugs all the time. So it will require more manpower to keep it clean. But there might be options we could discuss with the GitHub team regarding this. E.g., if only a few people could open an issue simply by forwarding an e-mail from the mailing list after a discussion confirming the bug, it would be really cool for limiting unexpected pollution while at the same time encouraging users to explain their problem (and possibly get a quick response). An issue tracker should first be a TODO list with some annotations and links to the original reports, not a discussion place.

So these are mainly my thoughts on the subject. We're still too few to share the high load of help and reports, and I really think this also calls for crediting people for their involvement or giving them more responsibility (which will also offload us). There are roughly 10 people on the list plus Discourse helping 1000 people all the time, and this is a thankless job. I don't know how we can make this situation better; at the very least, recognizing their efforts is the minimum we can do and the minimum they deserve.

But seen from a more positive perspective, it's what you get with a successful project handled by a small, competent community. Users mostly complain about the lack of clarity in our escalation process, but are often satisfied with the time we take to fix their issues. And when you think about it, some bugs are fixed within a few hours, between the report here, your analysis, your forward to the ML, another developer jumping on it, troubleshooting it, fixing it, and sending me the fix so that I merge it, and then you respond with the commit ID and the reporter confirms the fix. It's awesome! This proves that our model is quite efficient for end users. It's probably still exhausting for the people in the chain, like you and a few others, but thinking about it for a few seconds, with a different process we would probably not be estimating the number of deployments at ~1 million. And the ML, you and Discourse are a significant part of this successful chain, which is why I can hardly say that it doesn't work.

Oh, by the way, I think this will be archived in a totally unrelated topic here, so apparently off-topic also exists on Discourse :slight_smile:

What I mean when I say it doesn't work is that there is no community building up around this forum: in two years there have been 5 users with more than 20 replies and about 15 users with more than 10 replies.

There is no ecosystem, is what I'm saying. Also, there is a large number of low-quality questions, and it can get frustrating sometimes… especially when no one else answers. But I'm aware I'm preaching to the choir.

Yes, people get answers and bugs are getting fixed - support does work.

I know. I think what we need is to continue the conversation that started some time ago on the mailing list, and see who can do what. I would certainly volunteer for some of the GitHub-related tasks, as we already discussed previously.

I’m sure we can limit your exposure to browsers to a minimum, but I believe it is important that you do have full administrative access to those projects, even if their primary interface is in a browser.

Also, it would be good if all the code maintainers could sign up on GitHub, so that we can actually assign bugs to them.

Yes, it will require guidelines for users, and it will require moderation. But as opposed to an open mailing list, on GitHub we can show a template with guidelines to users who are about to open new issues (see below), and we can close issues where users clearly ignore them.

I'm not sure about this one. If we allow the issue tracker only for a few selected people, it is basically read-only for the public, and we still handle everything through the existing support channels.

I do think that with moderation and guidelines we can handle a public issue tracker (we can use GitHub templates [1], for example; they do need actual files in the git repository, though), as sketched below.
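
As an illustration (a hypothetical minimal template, just to show that it is only a file committed to the repository; the actual fields would need to be agreed upon):

mkdir -p .github && cat > .github/ISSUE_TEMPLATE.md <<'EOF'
## Output of haproxy -vv

## Configuration (anonymized)

## Steps to reproduce and observed behaviour
EOF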

Either way, this discussion needs to be continued in the proper thread on the mailing list :slight_smile:

Yes we are way off topic here :slight_smile:

[1] Issue and Pull Request templates - The GitHub Blog