

12:15 PM
Adrian Lane

Data Masking Primer

Data masking is an approach to data security used to conceal sensitive information. Unlike encryption, which renders data unusable until it is restored to clear text, masking is designed to protect data while retaining business functionality.

Masking is most commonly used with relational databases, where it maintains the complex data relationships that database applications rely on. In essence, masking scrambles data so that individual values become meaningless, while the data set as a whole keeps its business value and the database keeps its functional dependencies. One example: shuffling patient care data so that individual data points cannot be traced to one person, but medical trend data can still be derived from the database as a whole.
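The shuffling example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `shuffle_column` helper, the record layout, and the sample patients are all hypothetical and not taken from any masking product. It permutes one column across rows, so per-person links are broken while the overall distribution of values stays the same.

```python
import random

def shuffle_column(rows, column, seed=None):
    """Return copies of the rows with one column's values permuted across rows.

    Hypothetical helper for illustration. The input rows are not modified.
    """
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]

# Made-up sample data.
patients = [
    {"name": "Alice", "zip": "85001", "diagnosis": "asthma"},
    {"name": "Bob", "zip": "85002", "diagnosis": "diabetes"},
    {"name": "Carol", "zip": "85003", "diagnosis": "asthma"},
    {"name": "Dave", "zip": "85004", "diagnosis": "hypertension"},
]

masked = shuffle_column(patients, "diagnosis", seed=7)
for row in masked:
    print(row["name"], row["diagnosis"])
```

A plain shuffle may leave some rows with their original value, and a small table offers little protection, so a real deployment would likely need stronger guarantees than this sketch provides.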

The two most common business use cases for masking are testing and analytics. Using real customer data is the best way to confirm application functionality, but moving sensitive production data (patient records, financial transactions, customer history) into lower-security test systems is very risky. Moving sensitive data into business analytics and decision-support systems is similarly risky, because it brings correspondingly greater exposure to loss. Masking provides test applications and business analytics with valuable data while simultaneously securing sensitive information.

"Data masking" is the industry-accepted term for this market segment. The word "masking" implies concealment rather than alteration, yet most data masking products do alter the original copy. There are many ways to scramble data, including transposition, substitution, obfuscation, concatenation, statistical averaging, and hashing algorithms (just to name a few). These techniques transform information into something that looks like the original, but the original values are obliterated and the new data cannot be reverse-engineered.
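Two of the techniques listed above, substitution and hashing, can be combined in a short Python sketch. Everything here is an assumption made for illustration: the `FAKE_NAMES` pool, the function names, and the demo key are hypothetical. A keyed hash (HMAC-SHA256 from the standard library) picks the replacement, so the same input always maps to the same masked value, which keeps joins between tables consistent.

```python
import hashlib
import hmac

# Hypothetical pool of replacement values; a real tool would use a far larger one.
FAKE_NAMES = ["Avery Smith", "Jordan Lee", "Morgan Diaz", "Riley Chen", "Casey Brown"]

def _digest(key: bytes, value: str) -> bytes:
    """Keyed hash of the value; without the key the mapping is hard to reproduce."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()

def substitute_name(key: bytes, name: str) -> str:
    """Substitution: map a real name to a fake one, deterministically."""
    index = int.from_bytes(_digest(key, name)[:8], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]

def mask_digits(key: bytes, value: str) -> str:
    """Replace each digit with a hash-derived digit, keeping length and punctuation."""
    stream = _digest(key, value)
    out = []
    used = 0
    for ch in value:
        if ch.isdigit():
            out.append(str(stream[used % len(stream)] % 10))
            used += 1
        else:
            out.append(ch)
    return "".join(out)

SECRET_KEY = b"demo-key-not-for-production"  # assumption: managed securely in practice
print(substitute_name(SECRET_KEY, "John Doe"))
print(mask_digits(SECRET_KEY, "123-45-6789"))
```

With a pool this small, different people will often receive the same fake name. That is acceptable for a sketch but would distort any analysis that depends on distinct values.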

Data masking is commonly employed using three basic strategies:

1. ETL (Extract, Transform and Load): This describes the process most commonly associated with data masking. As data is queried or archived from the database, it is run through a transformational algorithm and then reloaded into a test or decision-support database. The original production database remains intact, but the copies have been transformed into a safe state.

2. Dynamic In Place Masking: This is a new catchphrase for the masking market and, unlike ETL, does not create a new copy. Dynamic masking keeps the original data but creates a transformation "mask" dynamically, as queries are received. The mask is implemented as a database "view" or trigger that transforms query results before they are returned to the user. Depending on their credentials, users may get unaltered data or masked data. This allows masking to run in parallel with the original data set, using the same database installation, but it comes at some cost in performance.

3. Static In Place Masking: In this model, original data within the database undergoes obfuscation in place. The vendors provide the capability to make the changes without breaking data relationships. This model allows complex, multitransformational algorithms to be applied simultaneously to keep obfuscated data values close to the original. There is no performance degradation and no additional space requirement, but periodic checks are needed to mask new data entries.
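The difference between the three strategies can be sketched with Python's built-in `sqlite3` module. This is a toy illustration, not how any vendor implements masking: the `customers` table, the e-mail masking rule, and all function names are hypothetical, and a second table in the same in-memory database stands in for the separate test database that ETL would normally load. SQLite has no user accounts, so the credential check described for dynamic masking is not shown.

```python
import sqlite3

# Hypothetical masking rule in plain SQL: hide the local part of an
# e-mail address and keep everything from the "@" onward.
MASK_EMAIL = "'xxxx' || substr(email, instr(email, '@'))"

def build_production():
    """Create an in-memory stand-in for a production database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO customers (id, email) VALUES (?, ?)",
        [(1, "alice@example.com"), (2, "bob@example.org")],
    )
    return conn

def etl_copy(conn):
    """ETL: extract, transform, and load a masked copy; the source is untouched."""
    conn.execute(
        f"CREATE TABLE customers_test AS "
        f"SELECT id, {MASK_EMAIL} AS email FROM customers"
    )

def dynamic_view(conn):
    """Dynamic: mask at query time through a view; stored data is unchanged."""
    conn.execute(
        f"CREATE VIEW customers_masked AS "
        f"SELECT id, {MASK_EMAIL} AS email FROM customers"
    )

def static_in_place(conn):
    """Static in place: overwrite the stored values themselves."""
    conn.execute(f"UPDATE customers SET email = {MASK_EMAIL}")

def emails(conn, table):
    """Return the email column of a table or view, ordered by id."""
    return [row[0] for row in conn.execute(f"SELECT email FROM {table} ORDER BY id")]

conn = build_production()
etl_copy(conn)
dynamic_view(conn)
print(emails(conn, "customers"))         # original values, still intact
print(emails(conn, "customers_test"))    # masked copy
print(emails(conn, "customers_masked"))  # masked when queried
static_in_place(conn)
print(emails(conn, "customers"))         # stored values are now masked
```

The view recomputes the mask on every query, which hints at the performance cost noted for dynamic masking, while the in-place update would have to be rerun to cover rows inserted later.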

Adrian Lane is an analyst/CTO with Securosis LLC, an independent security consulting practice. Special to Dark Reading. Adrian Lane is a Security Strategist and brings over 25 years of industry experience to the Securosis team, much of it at the executive level. Adrian specializes in database security, data security, and secure software development. With experience at Ingres, Oracle, and ...
