Attacks/Breaches
5/20/2011
01:03 PM
Connect Directly
RSS
E-Mail
50%
50%

Tech Insight: Finding And Securing Your Enterprise's Most Sensitive Data

The headlines are full of companies facing serious breaches. Here are some basic steps to protect your enterprise's critical data -- and stay out of the news

No matter what your business, information is likely one of your most valuable assets -- both to you and to the attacker. With so many breaches in the news -- from Sony to Epsilon to Heartland Payment Systems -- we all must understand what sensitive data we have, the risk associated with the data, and how to protect it.

But before we can do anything, we must know what sensitive data we have and where it's stored. We can’t protect what we don’t know about. Finding the data could be very simple or hugely complex, depending on your organization.

The first step is to list all of the sensitive data types your organization handles: employee personal information, customers' personally identifiable information, cardholder data, medical records, and corporate intellectual property, such as source code and transaction information. These data types will hold different risks for different companies -- if you're a software company, for example, the loss of source code is more damaging than the loss of externally regulated data.

Once you have a list of what you believe is critical data, ask department heads to add any other data types they know of or believe should be included. Use this as an opportunity to also ask teams to identify the places where these data types are utilized. At this point, all data types should be added to the list -- later, you’ll filter and prioritize based on risk and which you can most easily protect.

Once you know what sensitive data you're looking for, you'll inevitably find it in places where it shouldn’t be stored. During a data audit years ago, I found credit card information in the /tmp directory on a server. A production support staff member was debugging a data load that couldn’t be reproduced in staging and dumped the data load into the /tmp directory to review the data structure against that of staging. Once finding the problem, he forgot to remove the dump, thus leaving it there for anyone who accessed the server.

After you've talked with your people about what types of data to look for and where it might be found, the best way to find sensitive data is to scan and monitor for it. Most data in your organization can be fingerprinted in a way that allows for searching. For instance, credit card numbers and Social Security numbers follow a predefined format that’s well-documented.

There are literally dozens of pages on Google that show you how to search for these data types, utilizing everything from OpenDLP to simple Perl, PHP, and Python scripts. Custom data types, such as source code, typically can be fingerprinted and searched using header information added to the code on check in; standard comments that may apply to all files, such as copyright information; or by searching for common strings that appear in the code, such as variable names or custom include files.

If your source code doesn’t have something unique defined in every file checked in, then add a unique signature to every file so that it can be searched for in the future. This process can be automated through most source control software. Utilizing OpenDLP or a commercial alternative to perform searches -- rather than writing your own scripts -- is a great way to start. Of course, OpenDLP is Windows-centric, so other solutions might be required.

What if you need to search for data that is in an uncommon format? Easy. Find the pattern, break out your favorite regular expression helper, like Reggy for OS X, or call your resident regex guru and build your own regex to reduce false positives.

There are two types of data that you will run into when searching: structured and unstructured. Within these, there will be countless formats -- and these formats will be the bane of your search. Structured data is data stored in a known format -- such as in a database. Unstructured data is data that is stored in unpredictable or various formats, such as Word files, text files, Excel files, or any other random format.

Searching plain text files, XML files, and databases is pretty straightforward. You connect to the database and run a query across each table or for flat file. Search the file system, iterate over each line of each file, and you’re pretty much done.

Archives, Word files, Excel files, and data stored externally will cause more of a problem -- not to mention the data types that aren’t well-documented or predictable. If you’re trying to roll your own solution, then Perl, Python, Java, and just about every other language offer libraries of common file formats. All of the commercial data loss prevention (DLP) and search products can identify common file formats, too.

What if your sensitive data is stored in the cloud? Many enterprises don't have good visibility into data that is stored and shared externally -- whether it's a cloud services provider or a third-party service, such as the ones that might be managing your payroll or benefits.

Next: How to find data in the cloud.

Previous
1 of 2
Next
Comment  | 
Print  | 
More Insights
Register for Dark Reading Newsletters
Partner Perspectives
What's This?
In a digital world inundated with advanced security threats, Intel Security seeks to transform how we live and work to keep our information secure. Through hardware and software development, Intel Security delivers robust solutions that integrate security into every layer of every digital device. In combining the security expertise of McAfee with the innovation, performance, and trust of Intel, this vision becomes a reality.

As we rely on technology to enhance our everyday and business life, we must too consider the security of the intellectual property and confidential data that is housed on these devices. As we increase the number of devices we use, we increase the number of gateways and opportunity for security threats. Intel Security takes the “security connected” approach to ensure that every device is secure, and that all security solutions are seamlessly integrated.
Featured Writers
White Papers
Cartoon
Current Issue
Dark Reading's October Tech Digest
Fast data analysis can stymie attacks and strengthen enterprise security. Does your team have the data smarts?
Flash Poll
Video
Slideshows
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
CVE-2013-7407
Published: 2014-10-22
Cross-site request forgery (CSRF) vulnerability in the MRBS module for Drupal allows remote attackers to hijack the authentication of unspecified victims via unknown vectors.

CVE-2014-3675
Published: 2014-10-22
Shim allows remote attackers to cause a denial of service (out-of-bounds read) via a crafted DHCPv6 packet.

CVE-2014-3676
Published: 2014-10-22
Heap-based buffer overflow in Shim allows remote attackers to execute arbitrary code via a crafted IPv6 address, related to the "tftp:// DHCPv6 boot option."

CVE-2014-3677
Published: 2014-10-22
Unspecified vulnerability in Shim might allow attackers to execute arbitrary code via a crafted MOK list, which triggers memory corruption.

CVE-2014-4448
Published: 2014-10-22
House Arrest in Apple iOS before 8.1 relies on the hardware UID for its encryption key, which makes it easier for physically proximate attackers to obtain sensitive information from a Documents directory by obtaining this UID.

Best of the Web
Dark Reading Radio
Archived Dark Reading Radio
Follow Dark Reading editors into the field as they talk with noted experts from the security world.