Image of the glider from the Game of Life by John Conway
Skip to content

Tor and the CloudFlare Problem

Before I go anywhere with this post, let me make three things very clear:

  1. I do not work for CloudFlare.
  2. I work for a small local ISP in Utah.
  3. I have been using Tor probably almost as long as many of you have been alive.

I first blogged about Tor in 2006. I had discovered it around 2004, only a couple of years after it's first release. I had used it as a way to prevent my ISP and my employer from tracking what I'm doing with my Internet connection. I would setup a simple SOCKS proxy in Firefox, then switch to it when I wanted to get on the Tor network, and switch away from it when I didn't. Oh, and you think latencies are bad on Tor now? You should have been on it back then.

Here is a metrics graph showing the time it took to download a 50 KiB file over the Tor network. Unfortunately, they don't have the data back when I started using the network, but you get a rough idea of what it was like:

Graph from metrics.torproject.org showing the latencies of downloading a 50 KiB file.

This makes a good deal of sense, because back then, ISPs didn't provide a lot of bandwidth to customers (it can be argued they still don't), and there wasn't a lot of exit nodes in the Tor network to handle the bandwidth (again, it can be argued there still isn't enough):

Graph from metrics.torproject.org showing the bandwidth of of relays in the network.

Graph from metrics.torproject.org showing the size of the Tor network

Spend some time on metrics.torproject.org looking over the historical data, and you'll get a good sense that using Tor in 2004 was a lot like getting data over dial-up. It was anything but pleasant.

What's the point? The point is, that while things can still be improved (we need more exit nodes, and we need more bandwidth on each exit node), the Tor network latencies, bandwidth, and relays is in a good position compared to 12 years ago when I started using it. So running large-scale attacks through the network is now practical.

So, where does CloudFlare fit into this? CloudFlare deploys solving captchas when you wish to consume a service behind the CloudFlare CDN. For example, while connected to Tor, visit medium.com, and you will be presented with a captcha, similar to something like this:

Screenshot of Chromium showing the need to solve a visual coptcha while trying to browse medium.com while connected to Tor.

This has gotten a lot of criticism from the "cypherpunk" millennials who feel that Tor access should be unrestricted. If you follow the "#dontblocktor" hashtag on Twitter, you will see the continued repeated criticism of CloudFlare deploying these captchas to Tor users on their CDN. Some of the arguments include:

  • Solving the captcha may only bring up another, repeatedly, never being able to consume the website in question.
  • Visually impaired users cannot solve the visual captchas.
  • Non-native English speakers will not be able to solve the audio version of the cpatcha.
  • People using browsers that disable JavaScript will not be able to reach the page.
  • There may be other security concerns where the choice of Tor is preferred over not using Tor.

No doubt, all captchas on the Web should be reconsidered. Personally, for JavaScript enabled browsers, I think forcing a proof-of-work puzzle onto the browser is transparent, and provides exactly the sort of rate-limiting needed for mitigating large-scale malicious attacks. For non-javascript puzzles, captchas seem to be the best alternative. But, I'm sure as a society, can can find alternatives to non-javascript browsers (such as network-based proof-of-work puzzles).

No doubt physical limitations, such as visual or audible impairments, can make solving a visual and audible captcha challenging, if not impossible. I don't have good solutions here except for JavaScript-based proof-of-work puzzles. But the real question that need to be addressed, is why is CloudFlare deploying captchas for Tor users?

CloudFlare addressed this due to the on-going criticism a select few on Twitter have giving the company. The blog post "The Trouble with Tor" basically comes down to the following:

  1. You must pick two between: security, anonymity, and convenience.
  2. CloudFlare is a large CDN that deals regularly with malicious traffic sourced from Tor exit relays.
  3. Captchas are a compromise, allowing Tor users to remain anonymous, while also getting access to the website.
  4. A CloudFlare CDN customer has an option in their control panel to whitelist Tor or captcha Tor.
  5. CloudFlare is investigating "blind token" proof-of-work client puzzles for something long-term.

I don't see anything unreasonable here. As a system administrator and security engineer for XMission, I understand and sympathize with CloudFlare's stance toward captachas, even if I don't agree with the implementation of the captcha itself. I have had to fight off malicious Tor traffic from our network many times during my employment, such as DNS and NTP amplification attacks, HTTP POST DDoS attacks, SQL injection and XSS attacks, and many others.

So, even as CloudFlare put it in their reasonable post, how do you allow honest Tor users with high degrees of convenience to consume the website while also minimizing and proactively mitigating malicious Tor traffic?

Again, I don't care for captchas, and wish they would die in a fire. But, what should CloudFlare do? Should they abandon the captcha altogether? If so, how should they proactively prevent malicious Tor traffic from negatively impacting their customer base? It's easy and knee-jerky to post screenshots to Twitter with the "#dontblocktor" hashtag, and shame CloudFlare and the customer using the CDN. I don't think that's the right approach, personally (nevermind that a captcha isn't a block (yes, semantics are important)). I'm curious how many of those who are reacting to CloudFlare captchas are actual system or network administrators that have to deal with these attacks. Instead, I would try to architect solutions to the problem.

Personally, I see the following:

  • Consume CloudFlare without Tor. There are no captchas, but you sacrifice a level of anonymity.
  • Consume CloudFlare behind Tor, but understand the compromise you are making to solve captchas sacrificing convenience.
  • Consume CLoudFlare beind a VPN, thus providing both anonymity and convenience.

If it really bothers you that you have to solve a captcha to reach a CloudFlare website, then rather than shaming CloudFlare, it might be worth your time to reach out to the site operator, and let them know about whitelisting Tor. If they engage in conversation, they may not have been aware of the configuration option, or they may have reasons why they want you to solve the captcha. Either way, you've come out ahead without the knee-jerking of #dontblocktor.

I guess in conclusion, while I hate captchas as much as the next guy, what would you do if you were employed by CloudFlare and in charge of this problem? What is a reasonable solution to keeping customers happy by mitigating malicious Tor traffic while also allowing honest Tor users to consume the website with high levels of convenience? Let's engage in discussion about how to create and architect these solutions, so we get as many people happy as possible- CloudFlare network admins, customers, and clients.

A final note about the term "block". The CloudFlare captcha is not blocking you from the reading the website. Instead, it's rate-limiting you. Some will argue that you get caught in endless captcha loops, consistently solving them over and over, never to actually reach the service. Personally, I have never encountered this, but others swear it exists. At most, I've had to solve 3 captchas in a row, usually because I did not solve them quick enough. I guess the effect is the same, but as already mentioned, the "#dontblocktor" hash tag is a knee-jerk, and incorrectly placed. Semantics are important, because CloudFlare is not actually blocking Tor, like Akamai does with "Access denied". It's one thing to provide a 502 HTTP error, it's quite another to rate limit requests.

{ 1 } Comments