Filtering malformed cookie that contains a comma

We started using a new affiliate partner that is setting cookies that contain unencoded JSON, so what we receive looks like this:
Cookie: [other cookies]; _aw_j_29283={"id":"2f44a001-378b-436d-89a2-93672703c238-1","expiration":1693705859}; [other cookies]

Our backend rejects these (properly, imho) as a 400, but this has been causing substantial support workload of people complaining our site is broken.

I tried filtering these out in my haproxy config, but the comma in the JSON semantically breaks the one Cookie header into multiple headers.

The closest I got was this:

    http-request replace-value Cookie ([^{]*)[{][^;]+? \1OBJECT
    http-request replace-value Cookie ^[\"][^;]+}(.*) BUG=CONTINUES\1

but that yielded: [other cookies]; _aw_j_29283=OBJECT,BUG=CONTINUES; [other cookies]

i.e. the comma persists, and our backend still coughs up a 400. I can turn strict headers off, but that will involve touching dozens of services. I’m stuck on haproxy 1.8. Is there any way I can just selectively nuke a cookie containing a comma in haproxy?

Thanks!

1 Like
"replace-value" works like "replace-header" except that it matches the
regex against every comma-delimited value of the header field <name>
instead of the entire header.

It will never match a comma because it’s using commas as a delimiter.

You’d need to use replace header instead, something like:

http-request replace-header Cookie '(.+)\{[^\}]+\}(.+)' \1REPLACED\2

This would replace all JSON strings with REPLACED.

1 Like

ooh, awesome, missed that replace-header works on the whole thing!

Just out of curiosity, wouldn’t that regex only replace one single JSON cookie in the header? If it’s like sed /g then I’d think you’d need to isolate it down to the per-cookie level instead of allowing it to match the whole thing.

Well my first try was just '\{[^\}]+\}' REPLACED without anything else, which in my mind would have matched the JSON and replaced it, even multiple times, and not touch everything else.

However when actually testing, it ended up replacing the entire header value with REPLACED (so erasing all cookies). I don’t really understand why.

But yeah, you’re right, if you have multiple cookies like that, it wouldn’t work. In that case, we would have to troubleshoot while the header replacement is erasing it all.

I guess it passes the “does match” test even with a partial match, and unless you capture and emit the other parts, you lose them.

Here’s where I landed; instead of focusing on the curly brace, which seems to be legal

 cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
 cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                       ; US-ASCII characters excluding CTLs,
                       ; whitespace DQUOTE, comma, semicolon,
                       ; and backslash

I just look for any cookie value that contains excluded chars, flag them, and truncate up to the end of this cookie. As we discovered, it only applies to the first one, but you can repeat the rule however many times you feel like being tolerant of crap requests:

    http-request replace-header Cookie (.+?=[^;\\,\"]*)[\\,\"][^;]*(;.*)? \1_ILLEGAL_CHARS_1\2
    http-request replace-header Cookie (.+?=[^;\\,\"]*)[\\,\"][^;]*(;.*)? \1_ILLEGAL_CHARS_2\2
    http-request replace-header Cookie (.+?=[^;\\,\"]*)[\\,\"][^;]*(;.*)? \1_ILLEGAL_CHARS_3\2
    http-request replace-header Cookie (.+?=[^;\\,\"]*)[\\,\"][^;]*(;.*)? \1_ILLEGAL_CHARS_4\2

This handles both the “mid-cookie-list” case and the “last-cookie” case for up to 4 bad cookies.

1 Like

(From a perspective of “least surprises”, I think a /g style behavior would be more intuitive, but I can see how that could be tricky.)