There is no software without bugs, right? While this is a common sentiment, we make assumptions that rely on the premise that software has no bugs in our day-to-day digital life. We trust identity providers (IDPs) to get authentication right, operating systems to perfectly comply with their specs, and financial transactions to always perform as intended. Even more vividly, we trust software with our physical safety by going on planes, driving a car that actively corrects our adherence to traffic lanes, or undergoing certain surgeries. What makes this possible? Or to put it another way, why aren't planes falling out of the sky due to bad software?
Software quality assurance borrows from scientific and engineering disciplines. Akin to reproducibility in science, one way to ensure and improve software quality is to publicize it and give as many people as possible an incentive to try to break it.
Another is using design patterns or well-architecture frameworks rooted in engineering. For example, while not every software project can be put under the same level of scrutiny as the Linux kernel, which has been under scrutiny for decades, software projects can leverage proven design patterns borrowed from the Linux kernel in hopes of gaining some of the security guarantees.
And of course, there's testing. Whether static, dynamic, or real-time, done by the developer or by a dedicated team, testing is a major part of software development. With critical software, testing is usually an entirely separate project handled by a separate team with specific expertise.
All these methods are good, but they don't claim to be comprehensive, and history suggests they aren't. There are no guarantees we found all the bugs because we don't know which bugs we don't know about (i.e., false negatives). Did we already find 99% of Linux kernel bugs out there? 50%? 10%?
The 'Absolute' Claim
The research field of formal methods is looking at ways to assure you that there are no bugs in a certain piece of software, such as your stockbroker or certificate authority. The basic idea is to translate software into math, where everything is well-defined, and then create an actual proof that the software works with no bugs. That way, you can be sure that your software is bug-free in the same way you can be sure that every number can be decomposed to a multiplication of prime numbers. (Note that I don't define what a bug is. This will prove to be a problem, as we will later see.)
Formal method techniques have long been used for critical software, but they were extremely compute- and effort-intensive and so applied only to small pieces of software, such as a limited part of chip firmware or an authentication protocol. In recent years, advanced theorem provers like Z3 and Coq have made it possible to apply this technology in a larger context. There are now formally verified programming languages, operating systems, and compilers that are 100% guaranteed to work according to their specifications. Applying these technologies still requires both advanced expertise and a ton of computing power, which make them prohibitively expensive to most organizations.
Major cloud providers are performing formal verification of their fundamental stacks to reach high levels of security assurance. Amazon and Microsoft have dedicated research groups that work with engineering teams to incorporate formal verification methods into critical infrastructure, such as AWS S3 and EBS and Azure Blockchain. But the really interesting fact is that in the past few years, cloud providers have been trying to commoditize formal verification to sell to their customers.
Decisively Solving Misconfiguration?
Last year, AWS released two features that leverage formal verification to address issues that have long plagued their customers, namely network and identity and access management (IAM) misconfigurations. Network access and IAM configurations are complex, and that complexity grows drastically in a large organization with distributed decision-making and governance. AWS addresses it by giving its customers simple controls — such as "S3 buckets should not be exposed to the Internet" or "Internet traffic to EC2 instances must go through a firewall" — and guaranteeing to apply them in every possible configuration scenario.
AWS is not the first to address the misconfiguration problem, even for AWS-specific issues such as open S3 buckets. Cloud security posture management (CSPM) vendors have been addressing this issue for a while now, analyzing AWS virtual port channel (VPC) configuration and IAM roles and identifying cases where privileges are too lax, security features are not properly used, and data can be exposed to the Internet. So what's new?
Well, this is where the absolute guarantee comes in. A CSPM solution works by creating a known-bad or known-good list of misconfigurations, sometimes adding context from your environment, and producing results accordingly. Network and IAM analyzers work by examining every potential IAM or network request and guaranteeing that they will not result in unwanted access according to your specification (such as "no Internet access"). The difference is in the guarantees about false negatives.
While AWS claims that there is no way it has missed anything, CSPM vendors say they are always on the lookout for new misconfigurations to catalog and detect, which is an admission that they did not detect these misconfigurations previously.
What Is a Bug, Anyway?
Formal verification is great for finding well-defined issues, such as memory security issues. However, things become difficult when trying to find logical bugs because those require specifying what the code is actually supposed to do, which is exactly what the code itself does.
For one thing, formal verification requires specifying well-defined goals. While some goals, like preventing access to the Internet, seem simple enough, in reality they are not. The AWS IAM analyzer documentation has an entire section defining what "public" means, and it's full of caveats. The guarantees it provides are only as good as the mathematical claims that it has coded.
There's also the question of coverage. AWS analyzers cover only a few major AWS services. If you route traffic into your network through an outbound connection channel, or grant a service access to two separate IAM roles which can be combined to write confidential data to a public bucket, the analyzer wouldn't know. Nevertheless, on some well-defined subset of the misconfiguration problem, formal verification provides stronger guarantees than ever before.
Getting back to the relative advantage question posed above, the difference is that the IAM and network analyzer claims that its list of issues detected is comprehensive, while CSPM claims that its list covers every misconfiguration known today. Here's the key question: Should you care?
Should We Care About Absolute Guarantees?
Consider the following scenario. You own a CSPM and look at the AWS network and IAM analyzer. Comparing the results of the two, you realize that they've identified the exact same problems. After some effort, you fix every single problem on that list. Relying only on your CSPM, you would feel you are in a good place now and could dedicate security resources elsewhere. By adding AWS analyzers to the mix, you now know — with an AWS guarantee — that you are in a good place. Are these the same results?
Even if we neglect the caveat of formal verification and assume that it catches 100% of issues, measuring the benefits over detection-based services like CSPM would be an exercise for every individual organization with its own security risk appetite. Some would find these absolute guarantees groundbreaking, while others would probably stick to existing controls.
These questions are not unique to CSPM. The same comparisons could be made for SAST/DAST/IAST web application security testing tools and formally verified software, to name one example.
Regardless of individual organization choices, one exciting side effect of this new technology would be an independent way to start measuring security solutions' false negative rates, pushing vendors to be better and providing them with clear evidence where they need to improve. This in and of itself is a tremendous contribution to the cybersecurity industry.