Today I had to make another entry in the “Things that have previously gone wrong” section of our Ops manual. We had users in the United Arab Emirates and Kuwait reporting that some of our apps were broken.

We had narrowed this down to certain audio and video files not loading over our AWS Cloudfront CDN. Initially I had assumed that there was some kind of content filter sitting between our users and our origin, as the problem had started on a Sunday evening and was affecting a significant number of people in the region, but only for some pieces of content.

When I used a VPN to test the files via the CDN in other locations, all was fine. I didn’t have a UAE or Kuwait endpoint, but it was obvious that it was an issue there. I got some friendly users to screenshot a deeplink to a piece of content:

Screen shot showing a Cloudfront HTTP 502 error explaining that the content it was trying to fetch was refused, along with the request ID

I don’t know what I was expecting to see; I’d assumed it would be some kind of content filter screen - I wasn’t expecting a Cloudfront error.

Using the request ID, AWS confirmed that our origin was refusing connections from their edge servers. WTF? Then I remembered our fail2ban rules - somehow the region’s Cloudfront endpoint IP had gotten itself into our iptables DROP list. I cleared the tables and, using AWS’s published list of its IP addresses, added these lines to our jail.local to ensure we never block an edge IP again:

aws_cloudfront_ips = 205.251.208.0/20 36.103.232.0/25 13.210.67.128/26 18.216.170.128/25

ignoreip = %(daniships)s %(ukschools)s %(aws_cloudfront_ips)s
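Before reloading fail2ban it’s worth sanity-checking that the edge IP you just unbanned actually falls inside those ranges. A quick sketch with Python’s stdlib `ipaddress` module (the test IPs below are picked for illustration, not taken from our logs):

```python
import ipaddress

# Same CIDRs as the aws_cloudfront_ips line in jail.local
CLOUDFRONT_RANGES = [
    "205.251.208.0/20",
    "36.103.232.0/25",
    "13.210.67.128/26",
    "18.216.170.128/25",
]

def is_cloudfront_ip(ip: str) -> bool:
    """Return True if ip falls inside any of the listed CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in CLOUDFRONT_RANGES)

# 205.251.208.54 sits inside 205.251.208.0/20; 8.8.8.8 matches nothing
print(is_cloudfront_ip("205.251.208.54"))  # True
print(is_cloudfront_ip("8.8.8.8"))         # False
```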

You can generate the list of IP addresses and CIDRs really easily with jq:

curl -s https://ip-ranges.amazonaws.com/ip-ranges.json | jq -r '.prefixes[] | select(.service=="CLOUDFRONT") | .ip_prefix' | xargs

I’m assuming they’ll change over time, so I’ll need to keep the aws_cloudfront_ips updated every now and then.
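Rather than pasting the one-liner’s output into jail.local by hand each time, the update could be scripted. A stdlib-only Python sketch (not our actual tooling) that fetches the published JSON and prints the line ready to paste or splice in from cron:

```python
import json
import urllib.request

IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

def cloudfront_prefixes(ip_ranges: dict) -> list[str]:
    """Extract the IPv4 CIDRs that AWS tags with the CLOUDFRONT service."""
    return [
        p["ip_prefix"]
        for p in ip_ranges["prefixes"]
        if p["service"] == "CLOUDFRONT"
    ]

def jail_local_line(ip_ranges: dict) -> str:
    """Render the prefixes as the aws_cloudfront_ips definition for jail.local."""
    return "aws_cloudfront_ips = " + " ".join(cloudfront_prefixes(ip_ranges))

if __name__ == "__main__":
    with urllib.request.urlopen(IP_RANGES_URL) as resp:
        data = json.load(resp)
    print(jail_local_line(data))
```

Whatever writes the new line, fail2ban still needs a reload afterwards for the refreshed ignoreip list to take effect.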