Dark Reading is part of the Informa Tech Division of Informa PLC


Attacks/Breaches

5/20/2011
01:03 PM

Tech Insight: Finding And Securing Your Enterprise's Most Sensitive Data

The headlines are full of companies facing serious breaches. Here are some basic steps to protect your enterprise's critical data -- and stay out of the news

No matter what your business, information is likely one of your most valuable assets -- both to you and to the attacker. With so many breaches in the news -- from Sony to Epsilon to Heartland Payment Systems -- we all must understand what sensitive data we have, the risk associated with the data, and how to protect it.

But before we can do anything, we must know what sensitive data we have and where it's stored. We can’t protect what we don’t know about. Finding the data could be very simple or hugely complex, depending on your organization.

The first step is to list all of the sensitive data types your organization handles: employee personal information, customers' personally identifiable information, cardholder data, medical records, and corporate intellectual property, such as source code and transaction information. These data types will hold different risks for different companies -- if you're a software company, for example, the loss of source code may be more damaging than the loss of externally regulated data.

Once you have a list of what you believe is critical data, ask department heads to add any other data types they know of or believe should be included. Use this as an opportunity to also ask teams to identify the places where these data types are utilized. At this point, all data types should be added to the list -- later, you’ll filter and prioritize based on risk and which you can most easily protect.

Once you know what sensitive data you're looking for, you'll inevitably find it in places where it shouldn’t be stored. During a data audit years ago, I found credit card information in the /tmp directory on a server. A production support staff member had been debugging a data load problem that couldn’t be reproduced in staging, so he dumped the data load into the /tmp directory to compare its structure against that of staging. After finding the problem, he forgot to remove the dump, leaving it there for anyone who accessed the server.

After you've talked with your people about what types of data to look for and where it might be found, the best way to find sensitive data is to scan and monitor for it. Most data in your organization can be fingerprinted in a way that allows for searching. For instance, credit card numbers and Social Security numbers follow a predefined format that’s well-documented.
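As a rough illustration of that kind of fingerprinting, the sketch below pairs a regular expression for card-like digit runs with the Luhn checksum that payment card numbers satisfy, and adds a simple pattern for Social Security numbers written as 3-2-4 digit groups. The function names and the exact patterns are assumptions made for this example, not something taken from a particular tool, and a real scan would need more tuning to keep false positives down.

```python
import re

# Candidate card numbers: 13-16 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")
# Social Security numbers written in the common 3-2-4 digit layout.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(number):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit-only card candidates in text that pass the Luhn check."""
    hits = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def find_ssns(text):
    """Return every substring of text that looks like an SSN."""
    return SSN_RE.findall(text)
```

The checksum step matters because plenty of 16-digit strings (order numbers, timestamps) are not card numbers; dropping anything that fails Luhn removes many of them before a person has to look.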

A Google search turns up dozens of pages that show you how to search for these data types, using everything from OpenDLP to simple Perl, PHP, and Python scripts. Custom data types, such as source code, can typically be fingerprinted and searched using header information added to the code at check-in; standard comments that apply to all files, such as copyright information; or common strings that appear in the code, such as variable names or custom include files.

If your source code doesn’t have something unique defined in every file checked in, then add a unique signature to every file so that it can be searched for in the future. This process can be automated through most source control software. Utilizing OpenDLP or a commercial alternative to perform searches -- rather than writing your own scripts -- is a great way to start. Of course, OpenDLP is Windows-centric, so other solutions might be required.
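A minimal sketch of the signature idea follows, assuming a made-up marker string and plain UTF-8 source files. `SIGNATURE`, `stamp_file`, and `find_stamped` are hypothetical names for this example. Wiring the stamping step into a source control hook is left out, since that depends on the system you use.

```python
from pathlib import Path

# Hypothetical marker; choose any string unlikely to appear by accident.
SIGNATURE = "EXAMPLECORP-INTERNAL-SOURCE-7f3a9c"


def stamp_file(path, comment_prefix="#"):
    """Prepend a signature comment if the file lacks one.

    Returns True if the file was changed. This sketch does not special-case
    files whose first line must stay first (for example, a shebang line).
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if SIGNATURE in text:
        return False
    path.write_text(f"{comment_prefix} {SIGNATURE}\n{text}", encoding="utf-8")
    return True


def find_stamped(root):
    """Return sorted paths under root whose contents include the signature."""
    needle = SIGNATURE.encode("utf-8")
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file; skip it rather than abort the scan
        if needle in data:
            hits.append(path)
    return sorted(hits)
```

Running `find_stamped` against a file share or a server's home directories would then list any copies of stamped source files that have wandered out of the repository.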

What if you need to search for data that is in an uncommon format? Easy. Find the pattern, break out your favorite regular expression helper, like Reggy for OS X, or call your resident regex guru and build your own regex to reduce false positives.
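To make the false-positive point concrete, here is a small sketch built around an invented identifier format: "EMP-", two uppercase letters, then six digits. The format and both patterns are assumptions for illustration only. The loose pattern catches the real identifier but also fragments of unrelated words, while the stricter one anchors on word boundaries and the exact character classes.

```python
import re

# Hypothetical internal format: "EMP-", two uppercase letters, six digits.
LOOSE = re.compile(r"EMP-\w+")
STRICT = re.compile(r"\bEMP-[A-Z]{2}\d{6}\b")


def loose_matches(text):
    """Match anything that merely starts with 'EMP-'."""
    return LOOSE.findall(text)


def strict_matches(text):
    """Match only strings with the full assumed identifier layout."""
    return STRICT.findall(text)


sample = "Ticket TEMP-123 mentions EMP-NY004211 and EMP-draft."
print(loose_matches(sample))   # three hits, two of them noise
print(strict_matches(sample))  # only the well-formed identifier
```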

There are two types of data that you will run into when searching: structured and unstructured. Within these, there will be countless formats -- and these formats will be the bane of your search. Structured data is data stored in a known format -- such as in a database. Unstructured data is data that is stored in unpredictable or various formats, such as Word files, text files, Excel files, or any other random format.

Searching plain text files, XML files, and databases is pretty straightforward. For a database, you connect and run a query across each table. For flat files, you search the file system, iterate over each line of each file, and you’re pretty much done.
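Both approaches can be sketched in a few lines. The example below uses SQLite, only because its driver ships with Python, as a stand-in for whatever database you actually run, and it searches for an SSN-style pattern. `scan_sqlite` and `scan_files` are hypothetical helper names, and reading every row of every table is an assumption that suits a small audit, not a large production system.

```python
import os
import re
import sqlite3

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_sqlite(db_path, pattern=SSN_RE):
    """Return sorted (table, column) pairs that hold at least one match."""
    hits = set()
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for table in tables:
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            columns = [desc[0] for desc in cursor.description]
            for row in cursor:
                for column, value in zip(columns, row):
                    # Only text cells are checked in this sketch.
                    if isinstance(value, str) and pattern.search(value):
                        hits.add((table, column))
    finally:
        conn.close()
    return sorted(hits)


def scan_files(root, pattern=SSN_RE):
    """Return (path, line_number) for every matching line under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            hits.append((path, number))
            except OSError:
                continue  # unreadable file; keep scanning
    return hits
```

Reporting the table and column (or file and line number) rather than the matched value keeps the scan's own output from becoming another copy of the sensitive data.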

Archives, Word files, Excel files, and data stored externally will cause more of a problem -- not to mention the data types that aren’t well-documented or predictable. If you’re trying to roll your own solution, then Perl, Python, Java, and just about every other language offer libraries for reading common file formats. All of the commercial data loss prevention (DLP) and search products can identify common file formats, too.
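As one example of rolling your own, Python's standard `zipfile` module can open both ordinary ZIP archives and .docx files, since a .docx file is a ZIP package whose main text lives in `word/document.xml`. The sketch below is deliberately crude: it strips XML tags with a regular expression instead of parsing the document properly, does not decode XML entities, and does not handle older binary .doc files or nested archives. `docx_text` and `scan_zip` are hypothetical names.

```python
import re
import zipfile

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def docx_text(path):
    """Extract rough plain text from a .docx file's main document part."""
    with zipfile.ZipFile(path) as archive:
        xml = archive.read("word/document.xml").decode("utf-8", errors="ignore")
    xml = xml.replace("</w:p>", "\n")   # keep paragraph boundaries as newlines
    return re.sub(r"<[^>]+>", "", xml)  # drop all remaining tags


def scan_zip(path, pattern=SSN_RE):
    """Return names of archive members whose decoded bytes match pattern."""
    hits = []
    with zipfile.ZipFile(path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            text = archive.read(info).decode("utf-8", errors="ignore")
            if pattern.search(text):
                hits.append(info.filename)
    return hits
```

Removing the tags before matching is what lets the pattern find a value that Word has split across several formatting runs, which a plain byte search of the XML would miss.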

What if your sensitive data is stored in the cloud? Many enterprises don't have good visibility into data that is stored and shared externally -- whether it's a cloud services provider or a third-party service, such as the ones that might be managing your payroll or benefits.

Next: How to find data in the cloud.
