I find it interesting that the analyze portion of the code is only a few tens of lines long. I’ve always read that ironbee was designed to be easily portable, and I think this is good proof of that claim.
It also shows that the internal API of haproxy 1.6/1.7 is much better suited to running this kind of analysis than what we had previously; it may open the way to other such analyzers in the future.
I suspect that once we merge Christopher’s filters, the code could easily be ported to use them without requiring any http-request rule, which would also be a good validation of the model (again, just for experimental purposes).
Regarding the header parser, you could simplify it by calling http_find_full_header2(NULL, 0, txn->req.chn->buf->p, idx, ctx): it will iteratively return each header, with the name at ctx->line, the name length in ctx->del, the start of the header’s value at ctx->line + ctx->val, and the value’s length in ctx->vlen. That’s apparently 22 lines which can be removed.
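Just to illustrate, the loop could look approximately like this (untested, written from memory of the hdr_ctx fields described above; it assumes <txn> is in scope as in your analyzer, and the hand-off to ironbee is left as a placeholder):

    struct hdr_ctx ctx;

    ctx.idx = 0;
    while (http_find_full_header2(NULL, 0, txn->req.chn->buf->p,
                                  &txn->hdr_idx, &ctx)) {
            const char *name = ctx.line;            /* header name        */
            int         nlen = ctx.del;             /* name length        */
            const char *val  = ctx.line + ctx.val;  /* start of the value */
            int         vlen = ctx.vlen;            /* value length       */

            /* feed <name>/<val> to the analyzer here */
    }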
Regarding the accesses to /dev/random, it’s possible that some initialization code needs to be called once before the chroot/fork. We’ve seen this in other libs as well: they needed some randomness and caused an open of /dev/random or /dev/urandom during the first call.
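If that’s what happens here, a common workaround is to force that lazy initialization from the module’s init code, which runs before the chroot. A generic sketch of the pattern only (OpenSSL’s RAND_bytes() is just a stand-in for whatever init call the library really provides):

    #include <openssl/rand.h>

    /* Illustrative only: trigger the lazy RNG setup while
     * /dev/urandom is still reachable, i.e. before chroot/fork.
     * The first call opens and seeds the pool; later calls no
     * longer need the device.
     */
    static int warm_up_rng(void)
    {
            unsigned char buf[16];

            return (RAND_bytes(buf, sizeof(buf)) == 1) ? 0 : -1;
    }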
Out of curiosity, have you checked memory usage and performance impact?
I think this code could serve as a nice example of how to contribute code adding support for various libs (device identification, waf, url filters, etc.). I’m not sure it would make sense to merge it into mainline considering there is little to no demand for a waf in haproxy (we’ve even heard a few people congratulate us for not wasting resources on such features), but it depends on what people think about it and how it performs.