"Web cache" refers to any technology that fronts an origin web server and temporarily stores frequently accessed content so that subsequent requests for the same content can be served efficiently. Be they centralized caching proxies deployed on-premises at an enterprise or content delivery networks (CDNs) with massively distributed caching edge servers, caches have become critical Internet infrastructure that enable scalable traffic delivery.
Attacks targeting caches are nothing new. However, it wasn't until 2017 that web cache attacks saw a significant surge in popularity, with novel exploits regularly making the headlines. Works such as "Web Cache Deception Attack," "Practical Cache Poisoning," and "CPDoS: Cache Poisoned Denial of Service" demonstrate disastrous vulnerabilities that are easy for miscreants to exploit.
In our own research with academics from the University of Trento and Northeastern University, we homed in on the aforementioned web cache deception attack, or WCD for short. WCD is a particularly damaging threat, where the adversary tricks a cache into storing the victim's sensitive data, therefore leaking it on the Internet. We analyzed 340 popular websites and found that 37 were affected by WCD, also finding that simple tweaks to existing attack techniques are sufficient to discover new exploitable targets. (We will present this work, titled "Cached and Confused: Web Cache Deception in the Wild," at Usenix Security Symposium in August 2020.)
Is WCD a genuine security concern? Absolutely. That point was already made repeatedly over the past years. In this column, I will focus on a largely overlooked question: Given the severity of the issue, why aren't we seeing researchers scrambling to propose defenses? Why aren't security vendors flooding the market with solutions?
Unfortunately, this is a direct consequence of the fact that web caches are easy to exploit but disproportionately difficult to secure. Let's dig deeper into how the attack works to understand why.
WCD stems from a discrepancy between how a cache and an origin server interpret a given HTTP request. For instance, an attacker can craft a URL that points to the account information on a banking website but append to it a nonexistent path component disguised as a static image, such as "/account.php/nonexistent.jpg." Many origin servers will simply ignore the invalid suffix and respond with account details. However, a web cache proxying the content will be oblivious to the processing that happens on the origin server, and it will store the response as if it were an image. If the attacker can trick a user into clicking on this link, the victim's account information will be cached, giving the attacker an opportunity to steal it.
The key observation here is that neither the origin server nor the web cache is individually at fault. In fact, when we examine them in isolation, they are both perfectly secure, performing their intended functions. Instead, the vulnerability results from different interpretations of the same request by two technologies that process the traffic, leading to a disagreement on the "cacheability" of the response. Perhaps a safety model — as opposed to a more traditional security model — is more appropriate when analyzing WCD: Faulty components don't lead to a vulnerability; instead, hazardous interactions between components lead to an accident.
This has dire implications for security professionals. Fixing a WCD vulnerability is very different from applying a patch to a broken software; it requires site operators to adopt a holistic view of their web infrastructure. Operators need to identify all technologies that may influence the Internet traffic traversing their environment, understand how they individually work, and how they influence each other, just to pinpoint a vulnerability. The fix may then require intrusive architectural changes.
This is already nontrivial for a small web deployment but large enterprises often span global infrastructures, utilize a patchwork of centralized caches, and chain together multiple CDN providers. The task quickly becomes intractable.
Attackers, on the other hand, do not need to concern themselves with this complexity. They do not need to understand why a vulnerability exists but merely test their exploits treating their target as a black box. Sweeping through a large array of sites looking for vulnerabilities is straightforward, whereas fixing a single vulnerability requires considerable effort.
With the flood of new attack variants, exciting offensive research opportunities, and the media's focus on exploited sites, it's easy to overlook the fundamental challenges for an effective WCD defense. Cache attacks will likely get worse before they get better, and we don't yet have a good solution. Automatic discovery of hazardous interactions in a web architecture is an open research problem.
In the meantime, falling back on asset management best practices is a good bet. A well-maintained system register that describes entities and the relationships between them goes a long way in helping site operators track down potential WCD vulnerabilities. Perhaps most important, though, is realizing that a systems safety problem like WCD can't possibly be addressed by system owners, cache vendors, or CDN providers on their own. Systems-centric security and safety analyses of Internet-wide infrastructures require the collaboration of all parties involved.
Check out The Edge, Dark Reading's new section for features, threat data, and in-depth perspectives. Today's top story: "The Y2K Boomerang: InfoSec Lessons Learned from a New Date-Fix Problem."