Hello! I’m trying to use haproxy to cache an API response that includes a last-modified header.
I must be missing something in my configuration, because It appears that haproxy doesn’t query the backend to validate the cached response. Instead, it responds with the cached response until max-age expires.
Here are the bits from my haproxy config:
cache api-cache
total-max-size 200
max-object-size 2000000
max-age 1200
process-vary on
frontend www-https
acl cacheable_api path -m beg /api/type-info
http-request cache-use api-cache if cacheable_api
http-response cache-store api-cache
Here is a request that shows the haproxy response:
✸ http --print=Hh --verify no https://my-server/api/type-info
GET /api/type-info HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate, zstd
Connection: keep-alive
Host: my-server
User-Agent: HTTPie/3.2.4
HTTP/1.1 200 OK
age: 450
content-language: en-US
content-type: application/json;charset=UTF-8
date: Thu, 12 Jun 2025 11:18:20 GMT
last-modified: Thu, 12 Jun 2025 11:18:20 GMT
transfer-encoding: chunked
x-cache: Hit from haproxy
In the response above, haproxy is responding with data that should be invalidated. On the backend server, the last-modified time has changed by two seconds. I can see this if I set a bogus Authorization header to avoid the haproxy cache:
✸ http --print=Hh --verify no https://my-server/api/type-info Authorization:foo
GET /api/type-info HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate, zstd
Authorization: foo
Connection: keep-alive
Host: my-server
User-Agent: HTTPie/3.2.4
HTTP/1.1 200 OK
content-language: en-US
content-type: application/json;charset=UTF-8
date: Thu, 12 Jun 2025 11:21:12 GMT
last-modified: Thu, 12 Jun 2025 11:18:22 GMT
transfer-encoding: chunked
x-cache: Miss from haproxy
Am I missing a piece of configuration that would tell haproxy to call the backend with if-modified-since every time, to ensure the cache is valid?
Never tried to cache http traffic with a haproxy - cool that this feature is included.
My last caching experience is a while ago, but as I understood, haproxy is working as intended.
In the “cache api-cache” section you defined “max-age 1200”, so an age with 450 should be accepted to be valid and has not to be pruned.
Haproxy will prune this element, if the age of the cached element is greater than max-age and will then request through to the backend to fetch a new, fresh generated element, with a new “last-modified” header.
At least, that would be the way, I would it expect to work.
The behavior you describe only allows haproxy to blindly cache the server response. Haproxy can reply 200 or 304 to the browser, but that exchange should also work between haproxy and the backend.
As far as I can tell, haproxy only provides blind expiration-based caching of the backend response. It handles the last-modified validator for responding to the client, but doesn’t implement it for fetching from the server, which is what I want.
Consider that setting max-age 1200 means that haproxy won’t call the server in that span of time. When the time expires, it flushes its cache and calls the server without specifying if-modified-since.
This makes haproxy’s caching useless for scenarios that are expensive on the server but still need validation before replying to the browser client.
What you are looking for is what you would enable in a Cache-Control header with a proxy-revalidate or must-revalidate value. This would work in a fully fledged HTTP caching proxy, like - I’m assuming - varnish or squid.
However haproxy is not a fully HTTP compliant cache. It is a simplistic caching implementation that servers a limit amount of use-cases, like caching favicons or css files, as per the documentation.
I’m wondering how the server would validate a if-modified-since request in an non-expensive way though. Isn’t the validation of the request to come to the conclusion for a 304 Not Modified just as expensive as the normal request just without the data-transfer?