Software testing is notoriously hard. Search Google for CVEs caused by basic CRLF (newline character) issues and you'll see thousands of entries. Humanity has put a man on the moon, but we still haven't found a proper way to handle line endings in text files. It's those subtle corner cases that programmers tend to overlook.
The inherent complexity of Log4j brings this challenge to a new level. The Log4Shell vulnerability (CVE-2021-44228) had been present since 2013 and went unnoticed until it was exploited as a zero-day eight years later. What tools could have been used to discover the Log4Shell vulnerability before it shook the industry? Is it realistic to expect automated detection of such security vulnerabilities while they are still unknown? If so, how did a heavily tested module like Log4j escape all lines of defense?
The Zero-Day Vulnerability Detection Challenge
Hindsight is 20/20. It is painfully clear now that logging API functions will eventually receive user-controlled data and that JNDI injections present a major concern, but prior to the discovery of Log4Shell, neither fact was as obvious. As for the former, the division of responsibility for validating inputs between the application and its libraries is not well defined, and the precise definition of the attack surface is unclear. As for the latter, although JNDI injections had been known for a few years, awareness of their potential severity was lacking.
Therefore, probably the biggest obstacle to detecting CVE-2021-44228 or similar vulnerabilities before they're exploited is the vagueness of the threat parameters. There is no absolute and complete definition of values that can or cannot be trusted, nor of operations that may or may not be performed on them. In practice, any threat analysis will depend on the interpretation of these gray areas.
When the requirements are vague, developers are at a disadvantage. A security-oriented code review that checks whether there is "anything wrong" with the behavior of a large software module is overwhelming. Reviewers tend to look for local mistakes, since grasping the relationships between parts of a program separated by tens of function calls is not feasible. In sharp contrast, for an attacker hunting for one specific vulnerability they suspect is present in the code, the task is much more manageable.
Automated Zero-Day Vulnerability Detection — Can It Work?
The dynamic analysis technique called fuzzing identifies unknown vulnerabilities by executing a program on random (or pseudo-random) inputs and looking for instances where it either crashes or violates some assertions. Would fuzzing have helped in the case of Log4j? Probably not.
Fuzzing usually involves looking for crashes, which indicate memory corruption. In the context of Java, which is memory safe, crashes will usually not have severe security implications.
For meaningful fuzzing, you need custom hooks on specific logical conditions that indicate problematic behavior. In addition, you need to build a fuzzing harness to provide the input (to Log4j API functions in our case). Both constructions may require some manual effort. After this setup, fuzzing can be used to detect this bug, or even similar bugs in other software if the required hooks and harness are similar enough. However, as in the case of manual code review, the vagueness of requirements and the manual effort associated with trying out different assumptions would most likely have led to Log4Shell being missed in fuzz testing as well.
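To make the hook-plus-harness idea concrete, here is a minimal sketch in plain Java. It does not use Log4j or any real fuzzing framework: formatMessage is a hypothetical stand-in for a logger that expands ${...} expressions, and attemptedLookups is an invented hook that records every lookup the code under test tries to perform. The point is only that the "finding" is a logical condition, not a crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy fuzz harness. All names are hypothetical; none of this is a Log4j API. */
final class LookupFuzzSketch {

    /** Hook: every lookup key the code under test tries to resolve is recorded here. */
    static final List<String> attemptedLookups = new ArrayList<>();

    /** Stand-in for a logger that expands ${...} expressions inside messages. */
    static String formatMessage(String message) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < message.length()) {
            int start = message.indexOf("${", i);
            int end = start < 0 ? -1 : message.indexOf('}', start + 2);
            if (start < 0 || end < 0) {
                out.append(message.substring(i)); // no complete expression left
                break;
            }
            out.append(message, i, start);
            attemptedLookups.add(message.substring(start + 2, end)); // hook fires
            out.append("<resolved>");
            i = end + 1;
        }
        return out.toString();
    }

    /** Builds a pseudo-random input from a small token dictionary. */
    static String randomInput(Random rng) {
        String[] tokens = {"user=", "alice", "$", "{", "}", "jndi:", "ldap://x", " "};
        StringBuilder sb = new StringBuilder();
        int length = 1 + rng.nextInt(8);
        for (int i = 0; i < length; i++) {
            sb.append(tokens[rng.nextInt(tokens.length)]);
        }
        return sb.toString();
    }

    /** Harness: feeds random inputs to the target and returns those that tripped the hook. */
    static List<String> fuzz(long seed, int iterations) {
        Random rng = new Random(seed);
        List<String> findings = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            attemptedLookups.clear();
            String input = randomInput(rng);
            formatMessage(input);
            if (!attemptedLookups.isEmpty()) {
                findings.add(input); // a lookup on fuzzer-controlled data, no crash required
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        System.out.println("Inputs that reached the lookup hook: " + fuzz(42L, 10_000).size());
    }
}
```

Note how much of the sketch encodes prior assumptions: the token dictionary already contains "$", "{" and "jndi:", and the hook already treats a lookup as suspicious. Without that knowledge, neither would have been written.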
This leads us to our final technique to try, static analysis, which inspects the program's possible behaviors without actually executing the application. A particularly interesting form of static analysis is data flow analysis: tracing possible paths of data in the program, from data sources to data sinks. In this case, the existence of a data path starting at a Log4j API function's argument and reaching a JNDI lookup indicates an exploitable vulnerability.
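As a rough illustration of source-to-sink tracing, the sketch below models data flow as a directed graph and searches it breadth-first. The node labels are simplified placeholders chosen for readability, not an exact Log4j call chain, and a real analyzer would derive the edges from the code rather than have them listed by hand.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/** Toy data flow graph: an edge "from -> to" means data may flow from one point to the other. */
final class TaintFlowSketch {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addFlow(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** Breadth-first search; returns one path from source to sink, or an empty list. */
    List<String> findPath(String source, String sink) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(sink)) {
                LinkedList<String> path = new LinkedList<>();
                for (String n = node; n != null; n = parent.get(n)) {
                    path.addFirst(n); // walk parents back to the source
                }
                return path;
            }
            for (String next : edges.getOrDefault(node, List.of())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        return List.of();
    }

    /** Hand-written example graph; labels are illustrative assumptions only. */
    static TaintFlowSketch exampleGraph() {
        TaintFlowSketch g = new TaintFlowSketch();
        g.addFlow("logger.error(msg)", "LogEvent.message");
        g.addFlow("LogEvent.message", "Appender.append(event)");
        g.addFlow("Appender.append(event)", "Layout.format(event)");
        g.addFlow("Layout.format(event)", "substitute(${...})");
        g.addFlow("substitute(${...})", "JndiLookup.lookup(key)");
        g.addFlow("config.file", "Layout.format(event)"); // not user-controlled
        return g;
    }

    public static void main(String[] args) {
        List<String> path = exampleGraph().findPath("logger.error(msg)", "JndiLookup.lookup(key)");
        System.out.println(String.join(" -> ", path));
    }
}
```

The search itself is trivial. The hard parts are building an accurate graph for a large code base and knowing in advance that the JNDI lookup should be treated as a sink.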
A modern inter-procedural static analyzer would face a few problems when analyzing Log4j, with or without a set of predetermined sinks.
Zero-Day Detection Through Static Analysis — A Deeper Look
Consider the call appender.append(e) in log4j-core 2.14.1. The LogEvent object (e) contains the user-controlled string in one of its fields. This is a signal to the static analyzer that e should be tracked further. There are more than 20 implementations of the Appender interface that could be invoked by appender.append(e). How does the static analyzer know for sure which implementation is used there?
It doesn't! It can't. Because of the halting problem, statically determining which code path will actually be taken at runtime is undecidable in general: no analyzer can answer that question precisely for every program.
So what can the analyzer do? It can overapproximate. Whenever dangerous code paths might exist in your code, your code will be declared "dangerous." Leaving your front door open overnight doesn't guarantee that someone will steal your smart TV, but the static analyzer will still say something equivalent to "better just lock your front door."
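The over-approximation described above can be sketched as a lookup over a class hierarchy: a call through an interface is assumed to reach every known implementation, and the call is flagged if any of them can reach a sink. The hierarchy and the class names below are invented for illustration and are far smaller than the real Appender hierarchy.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy call-target resolution. The hierarchy and class names are hypothetical. */
final class CallTargetSketch {

    /** Declared interface -> implementations the analyzer knows about. */
    static final Map<String, List<String>> IMPLEMENTATIONS = Map.of(
            "Appender", List.of("PlainAppender", "FileLikeAppender", "LookupAppender"));

    /** Implementations assumed (for this sketch) to reach a dangerous sink. */
    static final Set<String> REACHES_SINK = Set.of("LookupAppender");

    /** Over-approximation: every known implementation is a possible call target. */
    static List<String> possibleTargets(String declaredType) {
        return IMPLEMENTATIONS.getOrDefault(declaredType, List.of(declaredType));
    }

    /** The call is flagged if ANY possible target may reach a sink. */
    static boolean mayReachSink(String declaredType) {
        return possibleTargets(declaredType).stream().anyMatch(REACHES_SINK::contains);
    }

    public static void main(String[] args) {
        System.out.println("appender.append(e) targets: " + possibleTargets("Appender"));
        System.out.println("flagged: " + mayReachSink("Appender"));
    }
}
```

The call is flagged even if the running application only ever configures PlainAppender. That is the price of the approach: the warning is sound for every configuration, but for this particular one it is a false positive.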
The inherent over-approximation of static analyzers is what allows them to scale to real-life code. Think of loops. Dynamic methods inevitably explore consecutive iterations of a loop separately, which means the number of states to cover can be unbounded. Static analyzers, by contrast, summarize the effect of a loop by considering the effects of 0, 1, 2, … iterations combined.
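One common way to compute such a summary is a fixpoint iteration: start from the state before the loop (zero iterations), repeatedly merge in the effect of one more pass over the body, and stop when nothing changes. The sketch below does this for a toy taint analysis in which the abstract state is simply the set of tainted variable names. The Assign type and the variable names are assumptions made for the example.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy loop summary for taint tracking. Types and names are hypothetical. */
final class LoopSummarySketch {

    /** One assignment "target = source" inside the loop body. */
    static final class Assign {
        final String target;
        final String source;

        Assign(String target, String source) {
            this.target = target;
            this.source = source;
        }
    }

    /** Effect of executing the loop body exactly once on a set of tainted variables. */
    static Set<String> runBodyOnce(Set<String> taintedBefore, List<Assign> body) {
        Set<String> tainted = new LinkedHashSet<>(taintedBefore);
        for (Assign a : body) {
            if (tainted.contains(a.source)) {
                tainted.add(a.target);    // taint propagates through the assignment
            } else {
                tainted.remove(a.target); // target is overwritten with clean data
            }
        }
        return tainted;
    }

    /** Merges the effects of 0, 1, 2, ... iterations until the state stops growing. */
    static Set<String> summarizeLoop(Set<String> taintedAtEntry, List<Assign> body) {
        Set<String> summary = new LinkedHashSet<>(taintedAtEntry); // zero iterations
        boolean changed = true;
        while (changed) {
            // addAll returns true only if the merged state gained a new variable
            changed = summary.addAll(runBodyOnce(summary, body));
        }
        return summary;
    }

    public static void main(String[] args) {
        // Loop body: c = b; b = a;   (a is tainted on entry)
        List<Assign> body = List.of(new Assign("c", "b"), new Assign("b", "a"));
        System.out.println(summarizeLoop(Set.of("a"), body));
    }
}
```

In this example c only becomes tainted on the second pass, so a single pass over the body would miss it. The iteration still terminates, because the state can only grow and the number of variables is finite.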
The major drawback of static analyzers is lack of precision — in other words, false positives. Since static analyzers combine the effects of many code paths together, including non-feasible ones, users are faced with countless spurious code paths, in contrast to dynamic analysis methods, which can pinpoint the exact location of the bug.
A Shift for Interactivity
More companies are using security-oriented static analysis in their development cycle. Static analysis tools increasingly provide developers guidance as they code through IDE integration. However, due to the challenge described above, most existing tools will "refuse" to perform data path analyses without proper sink definitions. Even with an explicit set of predefined sinks, there are countless false positives — so imagine what would happen without them.
The industry should move toward a more "interactive" type of static analyzer, a tool that gives developers information on potential risks originating from user inputs while they code. This flexible approach could be a game-changer in zero-day detection.
While the definition of user-controlled entry points can often be achieved with a program's API, defining hazardous code sinks that use user-controlled input is trickier. For zero-day detection, this is even more true. Developers do not always know what security warning signals to look for, but an automatic companion that puts its virtual finger on user-controlled input may help. To achieve this, we advocate interactive IDE support in the form of "shift-left" static analysis plugins.
Data flow static analysis enables developers to identify vulnerabilities involving manipulated user inputs, like Log4Shell, early on. It is currently an active area of security research, but bringing the technology to widespread use in the industry should be everyone's target.