Dark Reading is part of the Informa Tech Division of Informa PLC

Application Security

12:00 PM
Jeff Williams

The Only 2 Things Every Developer Needs To Know About Injection

There's no simple solution for preventing injection attacks, but there are effective strategies that can stop them in their tracks.

Security is pretty easy, right? If there’s a threat, we put in a defense. Sometimes we can centralize these defenses. For example, you might use an authentication gateway to restrict access to your web applications and web services. Unfortunately, the defenses for “injection” attacks don’t centralize so well, which has made them one of the most popular attack vectors.

Injection: mainlining attacks into your code
Conceptually, injection is simple. It happens any time your code includes untrusted data in a command that is sent to an interpreter. For example, one very popular injection attack can be performed against database interpreters. This attack, known as "SQL Injection," was discovered in 1998 and still pops up in the news every few weeks. SQL Injection happens whenever a developer includes untrusted data in a database query. For example, a developer might take your username and password from a browser request and build a query like this:
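A minimal Java sketch of that kind of query construction follows. The `users` table, its column names, and the `buildQuery` helper are assumptions made up for illustration; the point is only that the request parameters are concatenated straight into the SQL text.

```java
// Illustrative only: this is the vulnerable pattern, not something to copy.
final class VulnerableLogin {
    // Hypothetical table and column names. The request parameters are
    // spliced directly into the text of the SQL command.
    static String buildQuery(String username, String password) {
        return "SELECT * FROM users WHERE username = '" + username
                + "' AND password = '" + password + "'";
    }

    public static void main(String[] args) {
        // What the developer expects to send to the database.
        System.out.println(buildQuery("alice", "s3cret"));
        // What an attacker can send: the quote closes the string literal,
        // and the OR clause makes the WHERE condition true for every row.
        System.out.println(buildQuery("alice", "' OR '1'='1"));
    }
}
```

With the second input, the password value stops being data and becomes part of the command, which is the essence of every injection flaw.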

The attacker could send in an attack right from his or her browser, with URL parameters designed to modify the meaning of the query. In the example, the query is modified to return every user in the database. In some cases, the attacker could use this attack to steal the entire database or even "own" the database machine.

There are many other varieties of injection, including Command Injection, LDAP Injection, Expression Language Injection, and even Cross-Site Scripting. Almost any component with a command interface is potentially susceptible. Unfortunately, every type of injection has its own unique characteristics, which makes injection as a class very difficult to defend against.

Untrusted data and data flow
All these injection attacks come from untrusted data. What data is untrusted? Here’s a simple rule: If you aren’t certain that it doesn’t contain attacks, then it’s untrusted. All the data from the browser, including URL parameters, form fields, headers, and cookies, is untrusted. But so are other sources like flat files, web services, and databases. Even internal sources of data can (and probably should) be considered untrusted.

Untrusted data hits an application like a cluster bomb. As this data passes through the millions of lines of application code, libraries, frameworks, and runtime, it gets parsed, copied, split, merged, transformed, assembled, stored, and retrieved. And every copy that is created is a potential injection vector. It can be extremely difficult for both humans and tools to trace all these data flows, which is why many injection flaws get overlooked.

Critical injection defense strategy No. 1: Only process validated data
So, what defense can we drop in to stop injection attacks? Unfortunately, there’s no simple answer. Still, there are two defense strategies that can guide us to prevent any injection flaw.

Most untrusted data comes in the form of a "string" without any restrictions on the size, characters, format, or pattern. Strings are like FedEx One Rate packages for attackers. Even if the developer is trying to ship a ZIP code, temperature, date, or phone number, an attacker can put in whatever he wants and it gets shipped right through your application without inspection.

If you want to validate to prevent injection, you really have to know a lot about the particular interpreter that you are passing data to. For example, if your application is sending untrusted XML to an XML parser, you had better know all the details about doctypes, DTDs, and external entities. Almost every interpreter has extensive corner cases and opportunities for an attacker to cause your application to do unexpected things.
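To make the XML case concrete, here is a sketch of one common way to take doctypes, DTDs, and external entities off the table when parsing untrusted XML with the JDK's built-in parser. The `StrictXml` class and `parsesSafely` method are hypothetical names, and the feature URI is the one understood by the JDK's Xerces-derived parser; other parsers may need different settings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

final class StrictXml {
    // Returns true only if the document parses with DOCTYPE declarations
    // disallowed, which also rules out inline DTDs and external entities.
    static boolean parsesSafely(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: the default JDK parser, which supports this feature.
            factory.setFeature(
                    "http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setExpandEntityReferences(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            // Configuration, parsing, or I/O failure: treat the input as rejected.
            return false;
        }
    }
}
```

A plain document such as `<order><zip>21201</zip></order>` is accepted, while anything that opens with a `<!DOCTYPE ...>` declaration is rejected before any entity is resolved.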

A better approach is to parse and validate the data against a specific pattern for what you expect. This is called “positive validation.” In a typical web application or web service with thousands of inputs, this isn’t easy. You’ll need some support from your framework or at least a common validation library so that your validation is consistent. Your mission: Validate all that untrusted input.
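As a rough sketch of positive validation in Java, the class below accepts a value only when the whole string matches the expected pattern. The field formats (a US ZIP code, a temperature between -100 and 150) and all names are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

final class PositiveValidation {
    // Allow-list pattern describing exactly what a ZIP code may contain.
    private static final Pattern ZIP = Pattern.compile("\\d{5}(-\\d{4})?");

    static boolean isValidZip(String input) {
        // matches() requires the entire string to fit the pattern.
        return input != null && ZIP.matcher(input).matches();
    }

    static boolean isValidTemperature(String input) {
        // At most three digits with an optional sign, so parseInt cannot overflow.
        if (input == null || !input.matches("-?\\d{1,3}")) {
            return false;
        }
        int value = Integer.parseInt(input);
        return value >= -100 && value <= 150;
    }
}
```

Note that neither check looks for "bad" characters. Each one describes what good data looks like and rejects everything else.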

But what if your application requires the use of special characters like single-quote, double-quote, hyphen, etc.? Those are exactly the characters that are significant to parsers. So, despite what you might read online, validation shouldn’t be your only defense against injection.

Critical injection defense strategy No. 2: Keep code and data separate 
Every CS 101 professor tells students to keep their code and data separate, but that’s easier said than done. What we need is a way to keep the data from getting mixed up with the commands.

Some interpreters provide exactly this sort of interface, called a "parameterized" API. Think of a MadLibs™ game where you fill in the blanks with a particular type of word like verb, adjective, funny bodily sound, etc. The cool thing about MadLibs is that nothing you enter in the blanks can change the template. So, to prevent SQL Injection, you can create a query template, fill in the “parameters” using the API, and then submit the query. If you avoid APIs that take a command as one big string, you can stop injection cold! For example, a parameterized SQL query in Java uses a question mark for the blanks and looks like this:
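A minimal sketch of such a query using JDBC's `PreparedStatement` follows. The `users` table, its columns, and the `userExists` helper are hypothetical, and the `Connection` is assumed to come from the application's own data source.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class ParameterizedLogin {
    // The template is a constant; user input never becomes part of it.
    // Table and column names are hypothetical.
    static final String FIND_USER =
            "SELECT * FROM users WHERE username = ? AND password = ?";

    static boolean userExists(Connection conn, String username, String password)
            throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(FIND_USER)) {
            stmt.setString(1, username); // fills the first blank
            stmt.setString(2, password); // fills the second blank
            try (ResultSet rows = stmt.executeQuery()) {
                return rows.next();
            }
        }
    }
}
```

Whatever the attacker types, including quotes, ends up as the value of a parameter. The structure of the query stays exactly as written in the template.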

Unfortunately, not all interpreters have a parameterized API. So how can we keep the code and data separate? We’re going to have to use "escaping" or "encoding." Escaping involves the use of an "escape" character to tell the interpreter that a special character is coming and that it should be treated as data and not a command (for example, a backslash placed before a quote character in JavaScript). Encoding is similar, except that the special character is translated into a new series of characters, such as %22 for a quote character in a URL.
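The difference between the two can be sketched in a few lines of Java. The escaper below is hand-rolled and deliberately incomplete (a real JavaScript escaper has to handle many more characters), and the class and method names are hypothetical; the URL encoder is the standard `java.net.URLEncoder`, which produces form-style percent-encoding.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class EscapeVersusEncode {
    // Escaping: put a backslash in front of characters that are special
    // inside a JavaScript string literal. Minimal on purpose; it ignores
    // line terminators, '<', and other characters a real escaper must handle.
    static String escapeForJsString(String data) {
        StringBuilder out = new StringBuilder();
        for (char c : data.toCharArray()) {
            if (c == '\\' || c == '"' || c == '\'') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Encoding: translate special characters into a different sequence,
    // here percent-encoding for a URL query parameter.
    static String encodeForUrlParameter(String data) {
        return URLEncoder.encode(data, StandardCharsets.UTF_8);
    }
}
```

For the single character `"`, the first method returns `\"` and the second returns `%22`. Same data, two interpreters, two different representations.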

These techniques prevent untrusted data from affecting the meaning of a command. The problem is that each interpreter has its own syntax for escaping and encoding. For example, HTML is probably the worst mashup of code and data of all time, and includes at least five completely different schemes: HTML entity encoding, URL encoding, CSS escaping, JavaScript escaping, and VBScript escaping.
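As an illustration of just one of those schemes, here is a small hand-written HTML entity encoder in Java. It is a sketch, not a vetted library: the `HtmlText` class is a made-up name, and the encoder is only meant for data placed in HTML element content or a quoted attribute value. The same data dropped into a URL, a style block, or a script would need a different scheme.

```java
final class HtmlText {
    // Minimal entity encoder for untrusted data in HTML element content or
    // a quoted attribute value. Not suitable for URL, CSS, or script contexts.
    static String encodeForHtml(String data) {
        StringBuilder out = new StringBuilder(data.length());
        for (char c : data.toCharArray()) {
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

After encoding, a string such as `<script>` is displayed by the browser as text instead of being interpreted as markup.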

Get ready, developers: You’re going to need to know exactly what characters need to be escaped or encoded and how to do it. There are some libraries available to help you with this, such as the OWASP ESAPI encoders.

Why do I have to both validate and keep data separate?
There are two reasons you need to do both validation and separation. The first is basic defense in depth. Both validation and separation are difficult to get right in all the places they need to be. Doing both helps to minimize gaps and improve your odds of defending against attacks.

The second reason is more subtle. Even if your application is totally protected against injection, you should still do validation. Why? Because validation is the only way to detect attacks on your application. Despite the plethora of products on the market claiming to detect application layer attacks, it’s not possible from outside your code. Every application is a beautiful and unique snowflake, so the same string could be completely safe for one and be a complete host takeover for another. Only the application can figure out what input is an attack and what is safe.

If you think of input validation as a form of intrusion detection, you’ll end up a lot safer. Stopping obvious attacks might be the single most effective thing you could do to protect your application, at least from automated attacks. Your validation should strive to put the data into three different buckets:

If the data exactly matches what you expect, then you can proceed with the data. Don’t forget to use parameterized APIs, encoding, or escaping if you use this data with an interpreter.

If the data is questionable, it might be data that was inadvertently cut-and-pasted into an application, an accidental mistype, or a possible attack. In this case, you want to help your user out and encourage her to submit valid data. But it’s worth keeping an eye on it with logging and periodic analysis.

If the data is clearly an attack, then take action! Don’t just log it and continue. You should probably log out that user and warn her that her account has been compromised (or you could just accuse her of being a hacker). I usually reserve this category for data that could not possibly have been generated by a legitimate user of the application -- for example, a hidden field or pull-down menu value that doesn’t match what was sent to the client. Or, if you have strong client-side validation, you might consider treating any data that doesn’t validate on the server as an attack.
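The three buckets can be sketched as a small classifier. Everything here is an assumption for illustration: a hypothetical form with a pull-down menu of states and a free-text ZIP field, and a made-up `InputTriage` class. What happens for each verdict (proceed, prompt and log, or terminate the session) is left to the calling code.

```java
import java.util.Set;
import java.util.regex.Pattern;

final class InputTriage {
    enum Verdict { VALID, QUESTIONABLE, ATTACK }

    // Hypothetical form: the menu options the server actually sent to the
    // client, and the expected shape of the free-text ZIP field.
    private static final Set<String> STATES_SENT_TO_CLIENT = Set.of("MD", "VA", "DC");
    private static final Pattern ZIP = Pattern.compile("\\d{5}");

    static Verdict classify(String state, String zip) {
        // A menu value the server never offered cannot come from normal use
        // of the application, so it lands in the attack bucket.
        if (state == null || !STATES_SENT_TO_CLIENT.contains(state)) {
            return Verdict.ATTACK;
        }
        // A malformed ZIP may be a typo or a paste error: reject it, ask the
        // user to correct it, and log it for later analysis.
        if (zip == null || !ZIP.matcher(zip).matches()) {
            return Verdict.QUESTIONABLE;
        }
        return Verdict.VALID;
    }
}
```

Even a VALID verdict only means the data looks like what you expected. It still goes through a parameterized API, encoding, or escaping before it reaches an interpreter.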

We’ve known about injection attacks for well over a decade. If we keep these two simple strategies in mind, we can stamp out injection and make our software a lot more trustworthy. Let me know in the comments how you handle injection! Good luck.

A pioneer in application security, Jeff Williams is the founder and CTO of Contrast Security, a revolutionary application security product that enhances software with the power to defend itself, check itself for vulnerabilities, and join a security command and control ...
User Rank: Apprentice
5/23/2014 | 12:29:50 PM
Responsible programming
Excellent article! In your example of injection, the encapsulation would have to be prepared in such a way as to return multiple data. I am not sure most APIs would work this way, but it is certainly something to be aware of, as developers sometimes do use such shortcuts and multi-purpose code.

When data is coming from a public source I believe that validation must necessarily occur using multiple factors, both for authentication and sanity checks. Perhaps there is more trust within a corporate shield, but these days this is rarely the case. Take for example a (trusted) employee using a BYOD device on corporate wi-fi, even with MAC validation.
User Rank: Ninja
5/23/2014 | 4:35:43 AM
Data Validation
It's all about the data validation.  OWASP has great documents on this, including: https://www.owasp.org/index.php/Data_Validation and https://www.owasp.org/index.php/Interpreter_Injection

In a nutshell:
  • Integrity checks - Ensure that the data has not been tampered with and is the same as before (checksums, logic in test code, etc)
  • Validation - Ensure that the data is strongly typed, correct syntax, within length boundaries, contains only permitted characters, or that numbers are correctly signed and within range boundaries
  • Business rules - Ensure that data is not only validated, but business rule correct. For example, interest rates fall within permitted boundaries. (extensive test scripts based upon code requirements)
Charlie Babcock
User Rank: Ninja
5/22/2014 | 11:20:09 PM
Succinct anti-injection advice
Excellent, succinct advice on how to combat the injection attack. It requires more than most programmers are prepared to understand and do, but  there must be some way for greater programmer awareness and framework automated assistance to reduce the frequency of these attacks.