Strange caching behavior

Hello,

We use HAProxy 3.0 in front of a custom Dolibarr-based ERP (written in PHP).
Recently I tried to enable caching at the HAProxy level. I used ACLs to cache only some content types (see my configuration below). The result was really strange. Cache hits did occur, but the application sometimes switched authenticated users: a session was started by ‘user1’, but some time later the username changed to ‘user2’ (who was also active at that moment). The sessions are managed by PHP (cookies). What did I do wrong? How can I improve the configuration to avoid such switching?

My HAProxy configuration is as follows:

cache maincache
    total-max-size 512
    max-object-size 1048576
    max-age 600
    process-vary off

frontend www-https
    ...
    http-response set-header X-Cache-Status %[var(txn.cache)]
    ...

backend dynamic
    ...
    http-response set-var(txn.ctype) res.hdr(content-type)
    http-response set-var(txn.cache) str("HIT") if !{ srv_id -m found }
    http-response set-var(txn.cache) str("MISS") if { srv_id -m found }
    acl is_javascript var(txn.ctype) -m sub -i javascript
    acl is_css var(txn.ctype) -m sub -i css
    acl is_image var(txn.ctype) -m sub -i image
    http-request cache-use maincache
    http-response cache-store maincache if is_javascript || is_css || is_image

You will have to find out what haproxy is caching that it is not supposed to cache, and why (by looking at the headers). That is not something we can do for you.

I would refrain from using substring matches for content-type, and use exact matches instead.
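
As a rough sketch of what stricter matching could look like (the media types listed are assumptions, so check what the backend really sends): the ACLs below reuse the txn.ctype variable from the posted configuration and match on the beginning of the value rather than on a substring. A fully exact match with -m str would also reject values such as "text/css; charset=utf-8", which is why -m beg is used here instead.

```haproxy
backend dynamic
    # Sketch only: the media type lists are assumptions, adjust them to the application.
    # -m beg matches the start of the Content-Type value, so a trailing
    # "; charset=..." parameter is tolerated while unrelated types are not.
    http-response set-var(txn.ctype) res.hdr(content-type)
    acl is_javascript var(txn.ctype) -m beg -i application/javascript text/javascript
    acl is_css        var(txn.ctype) -m beg -i text/css
    acl is_image      var(txn.ctype) -m beg -i image/png image/jpeg image/gif
```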

Likely your backend does not set correct Cache-Control/Pragma headers.
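
If the backend cannot be fixed right away, one possible safeguard is to refuse to store any response that carries a Set-Cookie header. This is only a sketch: it assumes the is_javascript, is_css and is_image ACLs from the posted configuration, and has_set_cookie is a name made up for this example.

```haproxy
backend dynamic
    # Hypothetical ACL name: true when the server response sets a cookie.
    acl has_set_cookie res.hdr(set-cookie) -m found

    http-request cache-use maincache
    # Store only the selected content types, and never a response that
    # would hand one user's session cookie to the next visitor.
    http-response cache-store maincache if is_javascript !has_set_cookie || is_css !has_set_cookie || is_image !has_set_cookie
```

This does not replace correct Cache-Control headers from the application; it only keeps cookie-bearing responses out of the cache.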

Thanks, lukastribus
I perfectly understand that I need to find out which content was cached although it should not have been.

The problem is that HAProxy does not give much information about the objects put into the cache. I can only see the hashes and expiration times. Is there any way to see what exactly was put into the cache (the value from which the hash was calculated)? Does HAProxy also cache headers such as Set-Cookie? Are the URL parameters part of the hash?

I suppose that Cache-Control is not set to permit caching at the browser level.

I would start by using the developer console (F12) in the browser to see headers that come from haproxy.

You can compare the headers after disabling haproxy caching, or maybe even by accessing the backend directly without haproxy.

Or you could tcpdump plaintext http traffic between haproxy and the backend application and analyze it then.

You cannot dump an object by using its hash, but you can take a look at your logs, find out what URIs you are accessing, and dump the content with curl by making the same request. You can see from the logs and response headers whether you are served from cache or not.
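
To make the logs more useful for this, the frontend could capture the relevant response headers. The following is a minimal sketch, assuming the default HTTP log format is in use (with a custom log-format the captured response headers have to be referenced explicitly) and that logging the start of a session cookie is acceptable in your environment. The lengths are arbitrary example values.

```haproxy
frontend www-https
    # Sketch only: captured response headers are appended to each
    # HTTP log line in braces when "option httplog" is active.
    option httplog
    capture response header Content-Type len 48
    # Kept short on purpose so that mostly the cookie name is logged,
    # not the full session identifier.
    capture response header Set-Cookie len 24
```

Log lines for requests that went to the server and show both a cacheable Content-Type and a Set-Cookie value point at the URIs worth repeating with curl.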

Haproxy needs to store all headers; it cannot simply omit some of them. I reckon that when things go wrong, set-cookie headers are also cached, yes.

The cache uses a hash of the host header and the URI as the key, so yes, the URL parameters are part of the hash.

OK, thanks, I’ll investigate.