Application Security

12:00 PM
Jeff Williams
Jeff Williams
Connect Directly
E-Mail vvv

The Only 2 Things Every Developer Needs To Know About Injection

There's no simple solution for preventing injection attacks. There are effective strategies that can stop them in their tracks.

Security is pretty easy, right? If there’s a threat, we put in a defense. Sometimes we can centralize these defenses. For example, you might use an authentication gateway to restrict access to your web applications and web services. Unfortunately, the defenses for "injection” attacks don’t centralize so well, which has made them one of the most popular attack vectors.

Injection: mainlining attacks into your code
Conceptually, injection is simple. It happens any time your code includes untrusted data in a command that is sent to an interpreter. For example, one very popular injection attack can be performed against database interpreters. This attack, known as "SQL Injection" was discovered in 1998 and still pops up in the news every few weeks. SQL Injection happens whenever a developer includes untrusted data in a database query. For example, a developer might take your username and password from a browser request and build a query like this:

The attacker could send in an attack right from his or her browser, with URL parameters designed to modify the meaning of the query. In the example, the query is modified to return every user in the database. In some cases, the attacker could use this attack to steal the entire database or even "own" the database machine.

There are many other varieties including: Command Injection, LDAP Injection, Expression Language Injection, and even Cross-Site Scripting. Almost any component or interface with a command interface is potentially susceptible. Unfortunately, every type of injection has its own unique characteristics, which makes it very difficult to defend against.

Untrusted data and data flow
All these injection attacks come from untrusted data. What data is untrusted? Here’s a simple rule: If you aren’t certain that it doesn’t contain attacks, then it’s untrusted. All the data from the browser, including URL parameters, form fields, headers, and cookies are all untrusted. But so are other sources like flat files, web services, databases, etc… Even internal sources of data can (and probably should) be considered untrusted.

Untrusted data hits an application like a cluster bomb. As this data passes through the millions of lines of application code, libraries, frameworks, and runtime, it gets parsed, copied, split, merged, transformed, assembled, stored, and retrieved. And every copy that is created is a potential injection vector. It can be extremely difficult for both humans and tools to trace all these data flows, which is why many injection flaws get overlooked.

Critical injection defense strategy No. 1: Only process validated data
So, what defense can we drop in to stop injection attacks? Unfortunately, there’s no simple answer. Still, there are two defense strategies that can guide us to prevent any injection flaw.

Most untrusted data comes in the form of a "string" without any restrictions on the size, characters, format, or pattern. Strings are like FedEx One Rate packages for attackers. Even if the developer is trying to ship a ZIP code, temperature, date, or phone number, an attacker can put in whatever he wants and it gets shipped right through your application without inspection.

If you want to validate to prevent injection, you really have to know a lot about the particular interpreter that you are passing data to. For example, if your application is sending untrusted XML to an XML parser, you better know all the details about doctypes, DTDs, and external entities. Almost every interpreter has extensive corner cases and opportunities for an attacker to cause your application to do unexpected things.

A better approach is to parse and validate the data against a specific pattern for what you expect. This is called “positive validation.” In a typical web application or web service with thousands of inputs, this isn’t easy. You’ll need some support from your framework or at least a common validation library so that your validation is consistent. Your mission: Validate all that untrusted input.

But what if your application requires the use of special characters like single-quote, double-quote, hyphen, etc.? Those are exactly the characters that are significant to parsers. So, despite what you might read online, validation shouldn’t be your only defense against injection.

Critical injection defense strategy No. 2: Keep code and data separate 
Every CS 101 professor tells students to keep their code and data separate, but that’s easier said than done. What we need is a way to keep the data from getting mixed up with the commands.

Some interpreters provide exactly this sort of interface, called a "parameterized" API. Think of a MadLibs™ game where you fill in the blanks with a particular type of word like verb, adjective, funny bodily sound, etc. The cool thing about MadLibs is that nothing you enter in the blanks can change the template. So, to prevent SQL Injection, you can create a query template, fill in the “parameters” using the API, and then submit the query. If you avoid APIs that take a command as one big string, you can stop injection cold! For example, a parameterized SQL query in Java uses a question mark for the blanks and looks like this:

Unfortunately, not all interpreters have a parameterized API. So how can we keep the code and data separate? We’re going to have to use "escaping" or "encoding." Escaping involves the use of an "escape" character to tell the interpreter that a special character is coming and you should treat it like data and not a command (for example, in JavaScript a quote character). Encoding is similar, except that the special character is translated into a new series of characters, like in a URL using %22 to encode a quote character.

These techniques prevent untrusted data from affecting the meaning of a command. The problem is that the interpreter has its own syntax for escaping and encoding. For example, HTML is probably the worst mashup of code and data of all time, and includes at least five completely different schemes: HTML entity encoding, URL encoding, CSS escaping, JavaScript escaping, and VBScript escaping.

Get ready, developers: You’re going to need to know exactly what characters need to be escaped or encoded and how to do it. There are some libraries available to help you with this, such as the OWASP ESAPI encoders.

Why do I have to both validate and keep data separate?
There are two reasons you need to do both validation and separation. The first is basic defense in depth. Both validation and separation are difficult to get right in all the places they need to be. Doing both helps to minimize gaps and improve your odds of defending attacks.

The second reason is more subtle. Even if your application is totally protected against injection, you should still do validation. Why? Because validation is the only way to detect attacks on your application. Despite the plethora of products on the market claiming to detect application layer attacks, it’s not possible from outside your code. Every application is a beautiful and unique snowflake, so the same string could be completely safe for one and be a complete host takeover for another. Only the application can figure out what input is an attack and what is safe.

If you think of input validation as a form of intrusion detection, you’ll end up a lot safer. Stopping obvious attacks might be the single most effective thing you could do to protect your application, at least from automated attacks. Your validation should strive to put the data into three different buckets:

If the data exactly matches what you expect, then you can proceed with the data. Don’t forget to use parameterized APIs, encoding, or escaping if you use this data with an interpreter.

If the data is questionable, it might be data that was inadvertently cut-and-pasted into an application, an accidental mistype, or a possible attack. In this case, you want to help your user out and encourage her to submit valid data. But it’s worth keeping an eye on it with logging and periodic analysis.

If the data is clearly an attack, then take action! Don’t just log it and continue. You should probably log out that user and warn her that her account has been compromised (or you could just accuse her of being a hacker). I usually reserve this category for data that could not possibly have been generated by a legitimate user of the application -- for example, a hidden field or pull-down menu value that doesn’t match what was sent to the client. Or, if you have strong client-side validation, you might consider treating any data that doesn’t validate on the server as an attack.

We’ve known about injection attacks for well over a decade. If we keep these two simple strategies in mind, we can stamp out injection and make our software a lot more trustworthy. Let me know in the comments how you handle injection! Good luck.

A pioneer in application security, Jeff Williams is the founder and CTO of Contrast Security, a revolutionary application security product that enhances software with the power to defend itself, check itself for vulnerabilities, and join a security command and control ... View Full Bio
Comment  | 
Print  | 
More Insights
Newest First  |  Oldest First  |  Threaded View
User Rank: Apprentice
5/23/2014 | 12:29:50 PM
Responsible programming
Excellent article! In your example of injection, the encapsulation would have to be prepared in such a way as to return multiple data. I am not sure most api would work this way, but it is certainly something to be aware of as sometimes developers do use such shortcuts and multi-purpose code.

When data is coming from a public source I believe that validation must necessarily occur using multiple factors, both for authentication and sanity checks. Perhaps there is more trust within a corporate shield, but these days this is rarely the case. Take for example a (trusted) employee using a BYOD device on corporate wi-fi, even with MAC validation.
Christian Bryant
Christian Bryant,
User Rank: Ninja
5/23/2014 | 4:35:43 AM
Data Validation
It's all about the data validation.  OWASP has great documents on this, including: and

In a nutshell:
  • Integrity checks - Ensure that the data has not been tampered with and is the same as before (checksums, logic in test code, etc)
  • Validation - Ensure that the data is strongly typed, correct syntax, within length boundaries, contains only permitted characters, or that numbers are correctly signed and within range boundaries
  • Business rules - Ensure that data is not only validated, but business rule correct. For example, interest rates fall within permitted boundaries. (extensive test scripts based upon code requirements)
Charlie Babcock
Charlie Babcock,
User Rank: Ninja
5/22/2014 | 11:20:09 PM
Succinct anti-injection advice
Excellent, succinct advice on how to combat the injection attack. It requires more than most programmers are prepared to understand and do, but  there must be some way for greater programmer awareness and framework automated assistance to reduce the frequency of these attacks.
Facebook Aims to Make Security More Social
Kelly Sheridan, Associate Editor, Dark Reading,  2/20/2018
SEC: Companies Must Disclose More Info on Cybersecurity Attacks & Risks
Kelly Jackson Higgins, Executive Editor at Dark Reading,  2/22/2018
Register for Dark Reading Newsletters
White Papers
Cartoon Contest
Write a Caption, Win a Starbucks Card! Click Here
Latest Comment: This comment is waiting for review by our moderators.
Current Issue
How to Cope with the IT Security Skills Shortage
Most enterprises don't have all the in-house skills they need to meet the rising threat from online attackers. Here are some tips on ways to beat the shortage.
Flash Poll
[Strategic Security Report] Navigating the Threat Intelligence Maze
[Strategic Security Report] Navigating the Threat Intelligence Maze
Most enterprises are using threat intel services, but many are still figuring out how to use the data they're collecting. In this Dark Reading survey we give you a look at what they're doing today - and where they hope to go.
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
Published: 2017-05-09
NScript in mpengine in Microsoft Malware Protection Engine with Engine Version before 1.1.13704.0, as used in Windows Defender and other products, allows remote attackers to execute arbitrary code or cause a denial of service (type confusion and application crash) via crafted JavaScript code within ...

Published: 2017-05-08
unixsocket.c in lxterminal through 0.3.0 insecurely uses /tmp for a socket file, allowing a local user to cause a denial of service (preventing terminal launch), or possibly have other impact (bypassing terminal access control).

Published: 2017-05-08
A privilege escalation vulnerability in Brocade Fibre Channel SAN products running Brocade Fabric OS (FOS) releases earlier than v7.4.1d and v8.0.1b could allow an authenticated attacker to elevate the privileges of user accounts accessing the system via command line interface. With affected version...

Published: 2017-05-08
Improper checks for unusual or exceptional conditions in Brocade NetIron 05.8.00 and later releases up to and including 06.1.00, when the Management Module is continuously scanned on port 22, may allow attackers to cause a denial of service (crash and reload) of the management module.

Published: 2017-05-08
Nextcloud Server before 11.0.3 is vulnerable to an inadequate escaping leading to a XSS vulnerability in the search module. To be exploitable a user has to write or paste malicious content into the search dialogue.