Trouble using path_reg with ACLs in HAProxy (regex seems to break everything)

Hi everyone,

I’m currently running HAProxy in a Docker container using the official haproxy:lts-bookworm image. My setup is working fine for basic ACLs using path and hdr(host), but I’m running into serious trouble when trying to use regex-based ACLs with path_reg.

I’m exposing multiple subdomains (e.g. example.org, admin.example.org, backend.example.org) and using HAProxy to control access. The goal is to restrict access to certain routes on one subdomain (backend.example.org) based on path and source IP.

Some of the routes are dynamic (they include IDs or tokens in the URL), so I’m trying to use regex ACLs like:

acl path_acl1 method GET path_reg -i ^/app/resource/[a-zA-Z0-9-]+$

Then I route like this:

use_backend backend_server if is_backend_host path_acl1 is_internal_network
default_backend access_denied

The problem

Whenever I start using path_reg, even with a simple pattern, HAProxy seems to stop evaluating the other path-based ACLs properly. In fact, sometimes none of the ACLs match anymore, and requests that should be blocked end up allowed.

I’ve tested several variations:

  • Escaping vs not escaping slashes
  • Using -i vs not
  • Simplified patterns like ^/test/.*$
  • Reordering ACLs and use_backend statements

But nothing seems to work reliably, and sometimes a single regex seems to break everything.

I confirmed that the HAProxy build inside the container includes:
OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1
Built with PCRE2 version : 10.42 2022-12-11
PCRE2 library supports JIT : yes

My questions

  1. Is there anything special about how path_reg is parsed inside HAProxy?
  2. Are there known issues when using regex ACLs with method in the same line?
  3. Could a malformed regex silently break evaluation of all ACLs?
  4. Is there a good way to debug which ACLs matched (or didn’t) for a request?

If needed, I can share a simplified version of my HAProxy config. I’m just not sure where the problem lies, and I’ve hit a wall.

Any help or insight would be greatly appreciated.

Thanks!

Here is more of my configuration:

frontend https_in
    bind *:443 ssl crt ... ssl-min-ver TLSv1.2
    mode http

    acl is_internal_network src 10.0.0.0/8 192.168.0.0/16

    acl is_backend_host hdr(host) -i backend.example.org

    acl allow_static_path path -i /api/health /logout
    acl allow_dynamic_path1 method GET path_reg -i ^/api/items/[a-zA-Z0-9-]+$
    acl allow_dynamic_path2 method POST path_reg -i ^/api/items/[a-zA-Z0-9-]+/comments$
    
    use_backend backend_server if is_backend_host allow_static_path is_internal_network
    use_backend backend_server if is_backend_host allow_dynamic_path1
    use_backend backend_server if is_backend_host allow_dynamic_path2

    default_backend deny_backend

acl allow_static_path path -i /api/health /logout

Here you have an ACL that checks whether the path is either /api/health or /logout. This configuration matches your intention: you know that it is a list of strings.

However here:

acl allow_dynamic_path1 method GET path_reg -i ^/api/items/[a-zA-Z0-9-]+$

You are checking whether the method is one of 4 strings: GET, path_reg, -i and ^/api/items/[a-zA-Z0-9-]+$.

This is of course not what you want.

One ACL statement really is one ACL statement.
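
To make the parsing concrete, here is a rough sketch of how the original line is read compared with what was intended. The ACL names is_get and is_item_path are hypothetical and used only for illustration. Since the method fetch is compared against all four strings, any GET request satisfies the original ACL regardless of its path, which would explain why requests that should have been blocked were allowed.

```haproxy
# As written: ONE fetch (method) followed by FOUR patterns.
#   fetch:    method
#   patterns: GET | path_reg | -i | ^/api/items/[a-zA-Z0-9-]+$
# Any GET request matches, whatever the path is.
acl allow_dynamic_path1 method GET path_reg -i ^/api/items/[a-zA-Z0-9-]+$

# As intended: two separate tests, each in its own ACL statement
# (is_get and is_item_path are hypothetical names).
acl is_get       method GET
acl is_item_path path_reg -i ^/api/items/[a-zA-Z0-9-]+$

# ACLs listed after "if" are ANDed together.
use_backend backend_server if is_get is_item_path
```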

METH_GET and METH_POST are built-in ACLs that are equivalent to a method GET or method POST test, so you can use them directly.

You will have to clean up your dynamic ACLs though. For example:

acl is_internal_network src 10.0.0.0/8 192.168.0.0/16

acl is_backend_host hdr(host) -i backend.example.org

acl allow_static_path path -i /api/health /logout
acl allow_dynamic_path1 path_reg -i ^/api/items/[a-zA-Z0-9-]+$
acl allow_dynamic_path2 path_reg -i ^/api/items/[a-zA-Z0-9-]+/comments$

use_backend backend_server if is_backend_host allow_static_path is_internal_network
use_backend backend_server if is_backend_host METH_GET allow_dynamic_path1
use_backend backend_server if is_backend_host METH_POST allow_dynamic_path2
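
As a possible alternative, the same two dynamic rules can be sketched with anonymous ACLs, where each test is written inline between braces instead of being declared by name first. This is only a sketch that assumes is_backend_host is defined as above; it is not meant to replace the named version, which is easier to reuse across rules.

```haproxy
# Anonymous ACLs: each { ... } block holds exactly one fetch and its patterns.
# The braces must be surrounded by spaces.
use_backend backend_server if is_backend_host { method GET } { path_reg -i ^/api/items/[a-zA-Z0-9-]+$ }
use_backend backend_server if is_backend_host { method POST } { path_reg -i ^/api/items/[a-zA-Z0-9-]+/comments$ }
```

Written this way, it is harder to accidentally merge two tests into one statement, because every test lives in its own pair of braces.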

I would strongly suggest going through section 7 of the documentation, which covers how to use ACLs.
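
On the question of seeing which ACLs matched for a request, one approach that may help is to record the matching rule in a variable and add it to the log line. The following is a minimal, untested sketch: the variable name txn.matched_rule, the labels, and the log format are assumptions made for illustration, and it relies on the ACLs from the corrected example being declared in the same frontend.

```haproxy
frontend https_in
    # Send logs to the container's stdout (assumes no log target is set elsewhere).
    log stdout format raw local0

    # Store a label for whichever rule matches (hypothetical variable name).
    http-request set-var(txn.matched_rule) str(static)       if is_backend_host allow_static_path is_internal_network
    http-request set-var(txn.matched_rule) str(dynamic_get)  if is_backend_host METH_GET allow_dynamic_path1
    http-request set-var(txn.matched_rule) str(dynamic_post) if is_backend_host METH_POST allow_dynamic_path2

    # Append the label to each HTTP log line; it stays empty when nothing matched.
    log-format "%ci:%cp [%tr] %ft %b/%s %ST %{+Q}r rule=%[var(txn.matched_rule)]"
```

With something like this in place, a request that falls through to the default backend should show an empty rule= field, which makes it easier to tell which condition failed to match.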

Wow, thank you so much Lukas!

I can’t express how helpful your message was — I had completely misunderstood how HAProxy parses multiple keywords on the same ACL line. I naively thought I could combine method and path_reg like that, but obviously, I was just feeding it a list of strings to match — and none of it was doing what I thought it was :sweat_smile:

Thanks to your clear explanation and your example with METH_GET / METH_POST, I restructured all my ACLs, and it immediately started working the way I intended. I’ve been banging my head against this for days, and you solved it in one message — huge respect!

Thanks again, really :folded_hands:
