Gather more info about TLS Client Hello messages being received on port 80

Hi all,

I’m having an issue with Facebook’s crawler sending me repeated TLS Client Hello messages on port 80. I logged an issue with their developer support but they closed the ticket and stated that they need more information about what is happening. They can’t find anything in their logs that would explain what is happening and no other developer seems to be having this problem (this could suggest that the problem is on my side).

I am receiving about 600 of these requests per hour from Facebook’s crawler.

I managed to capture some of these requests using tcpdump:

10622 15.837038 31.13.127.5 MYSERVERIP TCP 66 47658 → 80 [ACK] Seq=1 Ack=1 Win=61440 Len=0 TSval=1921577847 TSecr=59275252

10701 15.848790 31.13.127.5 MYSERVERIP TCP 583 47658 → 80 [PSH, ACK] Seq=1 Ack=1 Win=61440 Len=517 TSval=1921577859 TSecr=59275252

10702 15.848846 MYSERVERIP 31.13.127.5 HTTP 253 HTTP/1.0 400 Bad request (text/html)

10914 15.927603 31.13.127.5 MYSERVERIP TCP 66 47658 → 80 [FIN, ACK] Seq=518 Ack=189 Win=63488 Len=0 TSval=1921577937 TSecr=59275274

10915 15.927611 MYSERVERIP 31.13.127.5 TCP 66 80 → 47658 [ACK] Seq=189 Ack=519 Win=30080 Len=0 TSval=59275294 TSecr=1921577937

12044 17.419319 31.13.127.5 MYSERVERIP TCP 74 53712 → 80 [SYN] Seq=0 Win=61320 Len=0 MSS=1460 SACK_PERM=1 TSval=1921579431 TSecr=0 WS=2048

12045 17.419337 MYSERVERIP 31.13.127.5 TCP 74 80 → 53712 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=59275667 TSecr=1921579431 WS=128

12125 17.493182 31.13.127.5 MYSERVERIP TCP 66 53712 → 80 [ACK] Seq=1 Ack=1 Win=61440 Len=0 TSval=1921579505 TSecr=59275667

12126 17.501269 31.13.127.5 MYSERVERIP TCP 583 53712 → 80 [PSH, ACK] Seq=1 Ack=1 Win=61440 Len=517 TSval=1921579513 TSecr=59275667

12127 17.501387 MYSERVERIP 31.13.127.5 HTTP 253 HTTP/1.0 400 Bad request (text/html)

12179 17.576974 31.13.127.5 MYSERVERIP TCP 66 53712 → 80 [FIN, ACK] Seq=518 Ack=189 Win=63488 Len=0 TSval=1921579589 TSecr=59275687

What these errors look like in my haproxy log:

Oct 1 19:46:00 LB haproxy[19022]: 69.171.251.8:57356 [01/Oct/2018:19:46:00.903] sitename sitename/ -1/-1/-1/-1/5 400 187 - - PRNN 19/19/0/0/5 0/0 “”

An example of one of the errors:

invalid request
backend mysite (#2), server (#-1), event #127
src 69.171.251.1:61042, session #9717, session flags 0x00000080
HTTP msg state 26, msg flags 0x00000000, tx flags 0x00000000
HTTP chunk len 0 bytes, HTTP body len 0 bytes
buffer flags 0x00808002, out 0 bytes, total 517 bytes
pending 517 bytes, wrapping at 32776, error at position 0:

00000 \x16\x03\x01\x02\x00\x01\x00\x01\xFC\x03\x03 B\x9B\xF8\xAE\xFB=\xD7dN
00021+ \x8D\xAD\xCCP\x99\x9C\xEEow#w\n
00033 \xB5\x99\x16g@\x1F{\x9A5H\x00\x00\xAA\xC00\xC0,\xC0(\xC0$\xC0\x14\xC0
00057+ \n
00058 \x00\xA5\x00\xA3\x00\xA1\x00\x9F\x00k\x00j\x00i\x00h\x009\x008\x007
00080+ \x006\xCC\xA9\xCC\xA8\xCC\x14\xCC\x13\xCC\xAA\xCC\x15\x00\x88\x00\x87
00098+ \x00\x86\x00\x85\xC02\xC0.\xC0*\xC0&\xC0\x0F\xC0\x05\x00\x9D\x00=\x005
00120+ \x00\x84\xC0/\xC0+\xC0'\xC0#\xC0\x13\xC0\t\x00\xA4\x00\xA2\x00\xA0\x00
00141+ \x9E\x00g\x00@\x00?\x00>\x003\x002\x001\x000\x00\x9A\x00\x99\x00\x98
00164+ \x00\x97\x00E\x00D\x00C\x00B\xC01\xC0-\xC0)\xC0%\xC0\x0E\xC0\x04\x00
00187+ \x9C\x00<\x00/\x00\x96\x00A\xC0\x12\xC0\x08\x00\x16\x00\x13\x00\x10
00206+ \x00\r\xC0\r\xC0\x03\x00\n
00214 \x00\xFF\x01\x00\x01)\x00\x00\x00\x14\x00\x12\x00\x00\x0Ffb.mysite.c
00242+ om\x00\x0B\x00\x04\x03\x00\x01\x02\x00\n
00254 \x00\x1C\x00\x1A\x00\x17\x00\x19\x00\x1C\x00\e\x00\x18\x00\x1A\x00\x16
00272+ \x00\x0E\x00\r\x00\x0B\x00\x0C\x00\t\x00\n
00284 \x00\r\x00 \x00\x1E\x06\x01\x06\x02\x06\x03\x05\x01\x05\x02\x05\x03
00302+ \x04\x01\x04\x02\x04\x03\x03\x01\x03\x02\x03\x03\x02\x01\x02\x02\x02
00319+ \x033t\x00\x00\x00\x10\x00\x0B\x00\t\x08http/1.1\x00\x15\x00\xAE\x00
00344+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00361+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00378+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00395+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00412+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00429+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00446+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00463+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00480+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00497+ \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
00514+ \x00\x00\x00

I went to Server Fault with this issue at the start of the month. My configuration is pretty much the same as it was on one of my last topics on this forum.

Is there any way for me to debug this issue further?

Thanks,
Wayne

Send the full and complete captures to Facebook. There is nothing anybody else can do about this.

Unfortunately, their engineering team responded and said that there are no logs indicating that these requests are coming from their side. They asked that I “decode” the content of the bad requests using my certificate and provide more details on what I am seeing.

Is there any way for me to gather more info on these requests, e.g. referrers, the exact URL the crawler is looking for, etc.?

Sure, that’s what I suggested in the last thread:

Read the thread about content switching between HTTP and SSL on a single port and then you can decode it in haproxy and log everything you like.

Hi Lukas,

Not sure how I didn’t see that originally!

I’m using a listen block for both 80 and 443. Would something like this work?

acl plainport dst_port 80
tcp-request inspect-delay 2s
use_backend wrongport if { req.ssl_hello_type 1 } plainport

My idea would be to send SSL traffic coming in over port 80 to a backend called wrongport and then log details about it. Will a tcp-request inspect-delay of 2s work if the listen block is in http mode?

I tried the above config and it didn’t seem to do anything.

Thanks,
Wayne

Based on the configuration in the other thread I’d suggest:

  • move the port 80 bind instructions (both IPv4 and IPv6) to another frontend
  • replace them with a local listener (127.0.0.1:1080 + proxy protocol for source IP transparency)
  • add a new TCP frontend for content-switching between HTTP and TLS
  • add two new TCP backends for looping the traffic back to an HTTP frontend (unencrypted -> 127.0.0.1:1080 and HTTPS -> 127.0.0.1:1443)
  • add another HTTP frontend receiving the bogus HTTPS traffic on a local port (127.0.0.1:1443)

Which should make it look like this:

listen mysite
    option httplog
    option dontlog-normal
    option dontlognull
    option accept-invalid-http-request
    log /dev/log local0
    bind 127.0.0.1:1080 accept-proxy
    bind *:443 ssl crt /etc/ssl/mysite.com/mysite.com.pem
    mode http
    maxconn 70000
    balance static-rr
    option http-keep-alive
    option forwardfor
    http-request set-header X-Forwarded-Proto HTTPS_ON if { ssl_fc }
    cookie SRVNAME insert
    timeout connect  10s
    timeout client  60s
    timeout server 60s
    reqidel ^X-Forwarded-For:.*

    redirect scheme https code 301 if !{ ssl_fc }

    acl fb-img-acl hdr_dom(host) -i fb.mysite.com
    use_backend varnish-backend if fb-img-acl

    acl thumb-img-acl hdr_dom(host) -i thumbs.mysite.com
    use_backend varnish-backend if thumb-img-acl

    acl letsencrypt-acl path_beg /.well-known/acme-challenge/
    use_backend letsencrypt-backend if letsencrypt-acl

    server mysite01 10.136.109.25:80 cookie MS01 check
    server mysite02 10.136.126.250:80 cookie MS02 check
    server mysite04 10.136.127.19:80 cookie MS04 check
    server mysite05 10.136.127.60:80 cookie MS05 check
    server mysite06 10.136.63.133:80 cookie MS06 check

backend letsencrypt-backend
    server letsencrypt 127.0.0.1:8888

backend varnish-backend
    server varnish 127.0.0.1:6081
    
    

frontend port80
    mode tcp
    bind 0.0.0.0:80
    bind :::80 v6only

    tcp-request inspect-delay 5s
    tcp-request content accept if { req.ssl_hello_type 1 }
    tcp-request content accept if HTTP
    
    use_backend tcp_tls_loopback if { req.ssl_hello_type 1 }
    use_backend tcp_http_loopback if HTTP
    

backend tcp_tls_loopback
    mode tcp
    server loopbackBogusHTTPS 127.0.0.1:1443 send-proxy
    
backend tcp_http_loopback
    mode tcp
    server properhttp 127.0.0.1:1080 send-proxy
    
frontend bogus443
    mode http
    bind 127.0.0.1:1443 ssl crt /etc/ssl/mysite.com/mysite.com.pem accept-proxy
    # do whatever you want with the decrypted HTTPS traffic here (logging it, or forwarding it normally while marking the request with something specific)
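
A minimal sketch of how that placeholder could be filled in, assuming the goal is to log identifying headers and tag the request before handing it to the regular site. The header name X-TLS-On-Port80 and the capture lengths are made up for illustration, and the client timeout simply mirrors the one in the listen block:

```haproxy
frontend bogus443
    mode http
    bind 127.0.0.1:1443 ssl crt /etc/ssl/mysite.com/mysite.com.pem accept-proxy
    timeout client 60s
    log /dev/log local0
    option httplog
    # Captured request headers appear between braces in the httplog line
    capture request header Host len 64
    capture request header User-Agent len 128
    capture request header Referer len 128
    # Hypothetical marker header so the application can tell these requests apart
    http-request set-header X-TLS-On-Port80 1
    # Reuse the backend side of the existing listen section
    default_backend mysite
```

With option httplog, the requested path shows up in the logged request line and the captured headers are printed just before it, which should cover the URL and Referer the crawler is asking for.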

Wow, thank you! That configuration is working great! I was able to add a custom header and forward the request onto my application so that I could do some custom logging. Seems like Facebook’s crawler is definitely bugging out. It sends simultaneous requests in bursts for image resources. All of the requests come from Facebook IP addresses and they all have a blank user agent. There’s currently an ongoing ticket about the blank user agent issue so maybe the two are related.

Thank you very much for this!
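
For anyone who wants to single out the blank user agent requests in HAProxy itself, here is a rough sketch of lines that could go into the bogus443 frontend. The ACL names and the X-Blank-UA header are hypothetical and only there for illustration:

```haproxy
    # True when no User-Agent header was sent at all
    acl ua_missing req.hdr_cnt(User-Agent) eq 0
    # True when a User-Agent header is present but has an empty value
    acl ua_empty req.hdr(User-Agent) -m len 0
    # Hypothetical marker header the application can key its logging on
    http-request set-header X-Blank-UA 1 if ua_missing or ua_empty
```

Splitting the check into two ACLs keeps a missing header and an empty one distinguishable if that turns out to matter for the ongoing ticket.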