
Backend redirect to s3 url of single static file (trailing slash issue)

Hi there,
I’m really struggling to find an answer to this on the forums - there are a few answers that are close to what I’m looking for, but nothing has worked so far!

So, basically I want the server IP that HAProxy is on to forward port 80 traffic to a single backend file which is located in an s3 bucket.

I can get really close with just:

frontend example
bind *:80
default_backend example

backend example
server test 192.168.X.X:80 redir https://s3.bucket.url/path/thefile.js

But forwarded traffic arrives at “https://s3.bucket.url/path/thefile.js/” and I can’t see a way of removing the trailing slash on a redirected backend - even though it’s pointing directly at a file.

I am using a compiled HA-Proxy v1.8.8

Any help/advice would be greatly appreciated!

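The redir parameter treats the specified URL as a prefix and appends the requested URI to it, hence the / at the end. (The redir parameter is also misused here: it makes HAProxy answer GET and HEAD requests with a 302 whose Location is that prefix followed by the requested URI, so that clients fetch content directly from the server instead of going through HAProxy. It is not meant for pointing at a single fixed URL.)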
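To make the prefix behaviour concrete, here is the backend from the question again with comments describing what the client receives. This is only an annotated sketch of the configuration already posted; the request path / is assumed from the fact that the client hits the bare IP.

```haproxy
backend example
    # "redir" takes a prefix, not a full target URL.
    # For a GET or HEAD request HAProxy replies with a 302 whose
    # Location header is: <prefix> + <requested URI>
    #
    #   client requests:  GET /
    #   Location sent:    https://s3.bucket.url/path/thefile.js/
    #
    # which is where the unwanted trailing slash comes from.
    server test 192.168.X.X:80 redir https://s3.bucket.url/path/thefile.js
```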

Also, do you want to redirect the user, or just serve that file through HAProxy? (If you redirect, the URL in the user’s browser will change. If you serve the file through HAProxy, the URL in the user’s browser stays the same.)

If you want a redirect, you need to use a redirect location statement in the frontend. (You don’t need the backend.)

On the other hand why don’t you just point that “domain” to S3 through AWS’s CloudFront? (The cost is exactly the same for you as it is now with that redirect.)


Okay, so your suggestion for the redirect worked in the frontend:

frontend test
bind *:80
http-request redirect location https://s3.bucket.url/path/thefile.js

But I’ve since had more clarity, and it seems we do need HAProxy to serve the URL/file itself, as the client we’re supporting has no DNS and can’t handle any URL requests.

Well in that case the solution isn’t that simple and it doesn’t work with redirect.

I’ll draw out the big picture, and if you have issues in implementing them let me know.

So you need:

  • (A) a frontend just like you have now, with its default backend set to the backend discussed below; (in this frontend you can put other “client facing” settings like timeouts, maximum connections, etc.;)
  • (B) you also need a backend with a server (see below for details);
  • (B.1) however (and here it gets tricky), because your client doesn’t support DNS, it’s very likely that it doesn’t support HTTP/1.1 either, and even if it does, the Host header it sends will be wrong; therefore you need to override the Host header with the correct domain from the S3 URL; (there are two types of S3 bucket URLs, one with the bucket name in the domain, and the other with it as the first part of the path;) (i.e. use http-request set-header Host whatever-s3-domain);
  • (B.2) in the same backend you’ll also need to override the path with the proper value; (i.e. use http-request set-path whatever-s3-path);
  • (C) you need a server that targets the S3 endpoint resolved by DNS (you can’t hard-code the IP as it might change, and by default the name is resolved only at startup), and you most likely need to enable TLS;
  • (D) therefore you need an HAProxy version with resolvers section support;
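Put together, points (A) to (D) might look roughly like the sketch below. It is not a tested configuration: the bucket domain bucket-name.s3.amazonaws.com, the nameserver address 192.0.2.53, the CA bundle path and the timeout values are all placeholders I made up for illustration, and the section names are arbitrary.

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# (D) runtime DNS resolution; the nameserver address is a placeholder
resolvers mydns
    nameserver dns1 192.0.2.53:53
    hold valid 10s

# (A) client-facing side
frontend fe_static
    bind *:80
    default_backend be_s3

# (B) backend that rewrites the request before it goes to S3
backend be_s3
    # (B.1) S3 routes on the Host header, so force the bucket domain (placeholder)
    http-request set-header Host bucket-name.s3.amazonaws.com
    # (B.2) always ask for the one file, whatever path the client sent
    http-request set-path /path/thefile.js
    # (C) server resolved through DNS at runtime, with TLS towards S3;
    #     the ca-file path is an assumption and depends on the distribution
    server s3 bucket-name.s3.amazonaws.com:443 ssl verify required ca-file /etc/ssl/certs/ca-certificates.crt sni str(bucket-name.s3.amazonaws.com) resolvers mydns resolve-prefer ipv4 init-addr last,libc,none
```

The init-addr last,libc,none part is optional; it only lets HAProxy start even if the name cannot be resolved at boot, leaving the resolvers section to fill in the address later.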

Wonderful! I think that might well have worked - although I did notice that curl uses HTTP/1.1. I haven’t yet tested it on the client’s server, so it may still work?

* Rebuilt URL to: XXX.XXX.XXX.XXX/
* Connected to XXX.XXX.XXX.XXX (XXX.XXX.XXX.XXX) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.55.1
> Accept: */*

< HTTP/1.1 200 OK
< x-amz-id-2:
< x-amz-request-id:
< Date:
< Last-Modified:
< ETag: ""
< Accept-Ranges:
< Content-Type:
< Content-Length:
< Server: AmazonS3

Unless I’ve done something wrong - here’s my config now (I updated to HAProxy v1.9.8 for parse-resolv-conf nameserver setting):

frontend test
        bind *:80
        default_backend test

resolvers mydns
        hold valid 10s

backend test
        http-request set-header Host path.location.amazonaws.com
        http-request set-path /thefile.js
        server serv1 path.location.amazonaws.com:80 resolvers mydns check inter 1000

I’m not sure if that last server line is correct?

The configuration seems correct, however I would suggest a few minor changes:

  • I would not enable check on the server, as it is highly unlikely that S3 will go down, so the health check adds little; (also note that inter is expressed in milliseconds by default, so inter 1000 probes S3 every second, not every ~0.3 hours;)
  • depending on the traffic I would suggest enabling http-reuse always;
  • also depending on your traffic, and given that you use “simple” HTTP clients, I would suggest disabling HTTP keep-alive on the client side by using no option http-keep-alive (although you’ll have to see how it interacts with the http-reuse option above);
  • also depending on the traffic (and especially since it has a monetary impact due to the pay-as-you-go approach of S3), I would strongly advise enabling caching; (although you’ll have to manually set the Cache-Control header in S3 as metadata; I would suggest something like public, max-age=3600;) (the header is not for the client but for HAProxy, to enable caching; alternatively you could try to see if mangling the response header directly from HAProxy has a similar effect;)
  • also depending on the traffic I would look at performance tuning, especially related to the maximum number of connections (frontend and backend), maximum queue length, timeouts, etc.;
  • I would also use ACLs to restrict methods to GET only, and initial paths only to what you expect; (otherwise you might become a proxy for attackers targeting S3, which, although unlikely to succeed, would at least incur monetary consequences;)
  • again, based on the threat risks, I would implement rate limiting per source IP;
  • and lastly, double check that by using amazonaws.com:80 (i.e. plain HTTP), you don’t get back a redirect to HTTPS;
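As a rough illustration, several of the suggestions above (no health check, connection reuse, caching, method/path restrictions and per-IP rate limiting) could be combined along the lines below. Treat it as a sketch only: the cache name s3cache, the cache size, the rate limit of 50 requests per 10 seconds and the assumption that clients only ever request / are all mine, and the mydns resolvers section is assumed to be defined as in the config posted earlier.

```haproxy
# small in-memory cache; sizes are illustrative
cache s3cache
    total-max-size 16      # megabytes
    max-age 3600           # seconds

frontend test
    bind *:80
    mode http

    # per-source-IP rate limiting (thresholds are placeholders)
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny if { sc_http_req_rate(0) gt 50 }

    # only allow what the clients are expected to send;
    # note: the predefined METH_GET ACL matches both GET and HEAD
    http-request deny unless METH_GET
    http-request deny unless { path / }

    default_backend test

backend test
    mode http
    http-reuse always
    http-request set-header Host path.location.amazonaws.com
    http-request set-path /thefile.js

    # serve from the cache when possible, store cacheable responses
    http-request cache-use s3cache
    http-response cache-store s3cache

    # no "check": S3 availability is not probed
    server serv1 path.location.amazonaws.com:80 resolvers mydns
```

The cache only stores responses that carry suitable Cache-Control information, which is why the header has to be set on the S3 object (or added by HAProxy) for this to have any effect.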