Dark Reading is part of the Informa Tech Division of Informa PLC



09:29 AM

Metadata Poses Both Risks And Rewards

For companies, metadata can be both an opportunity to better secure the business and a threat that leaks sensitive data

The National Security Agency's focus on metadata has raised awareness of the threat that activity tracking poses to individual privacy and has renewed debates over the level of monitoring that should be permissible by government and businesses.

For businesses, the lessons are more subtle. Organizations can inadvertently leak metadata -- giving adversaries a look into their operations and a potential covert communications channel -- but they can also analyze their own metadata to spot anomalous activity within their networks. Metadata, a by-product of the adoption of technology, should be helpful -- and can be -- if companies are aware of the issues posed by the data, says Will Irace, vice president of threat research at General Dynamics Fidelis Cybersecurity Solutions.

"I don't look at metadata as some boogeyman," he says. "Instead, we have to figure out how to distill knowledge from the massive amounts of raw information that we are collecting."

Metadata arrived in the lexicon of everyday technology users in 2013, when the leak of classified documents from the National Security Agency highlighted the amount of information collected by service providers and requested by the government. While the U.S. government is barred from collecting the content of communications without a warrant, metadata -- loosely defined as data about data -- has historically been fair game. Yet metadata is as important as -- and, many technologists argue, more important than -- the content of messages or documents because it can be used to map the relationships between content and the creators of that content.


In an ongoing study using volunteers who allow their information to be tracked, Stanford University has found that significant information about participants can be inferred just from their phone metadata. In one instance, a subject contacted a home improvement store, locksmiths, a hydroponics dealer, and a head shop. In another instance, a participant made "calls to a firearm store that specializes in the AR semiautomatic rifle platform [and] they also spoke at length with customer service for a firearm manufacturer that produces an AR line," according to a March 12 update on the research by Jonathan Mayer, a PhD student in computer science at Stanford University.

From a privacy perspective, the term "metadata" is typically used to identify what legal experts believe is data that can be collected, whether by business or government, without infringing on the privacy of citizens. Yet the MetaPhone project shows that such data about content still leaks significant privacy-infringing information, Mayer says.

"I think the notion of metadata and privacy as being separate ... is not borne out," he says. "Even if you excise the personally identifiable information, someone could still re-identify the data set or make sensitive inferences. So getting rid of the PII does not get rid of the privacy problems."

While the MetaPhone project focused on data about who called whom, metadata includes a wide variety of machine-generated information: browser histories, document information, network packet headers, and access logs are all common sources of metadata produced by companies and their employees. Attackers frequently seek out this information for reconnaissance against a targeted firm, using it to learn about the firm's employees and network infrastructure.

While doing research on metadata leakage, Spanish security firm Eleven Paths created a tool that could mine the data from public documents available on a company's website. Because firms frequently do not sanitize the information placed in documents, attackers can learn who authored a file, when it was created, and on what type of machine. In a more recent 2013 study, the company found that data-loss prevention firms do not fully sanitize their own files and documents, leaking potentially sensitive information. In some cases, the documents also reveal file servers and printers.
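To make the document-leakage point concrete, here is a minimal sketch of how author and creation details can be read from a .docx file using only the Python standard library. It is not the Eleven Paths tool, whose internals the article does not describe. It relies on the fact that a .docx file is a ZIP package whose `docProps/core.xml` part holds core properties such as the creator and creation time. The function names `read_core_properties` and `make_sample_docx`, and the sample values, are hypothetical and exist only for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(docx_bytes):
    """Return author-related metadata found in docProps/core.xml, if present."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))

    fields = {
        "creator": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "created": "dcterms:created",
    }
    found = {}
    for name, path in fields.items():
        element = root.find(path, NS)
        if element is not None and element.text:
            found[name] = element.text
    return found


def make_sample_docx(creator, modified_by, created):
    """Build a tiny stand-in package holding only core properties.

    This is a hypothetical test fixture, not a complete Word document.
    """
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<cp:coreProperties xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}" '
        f'xmlns:dcterms="{NS["dcterms"]}">'
        f"<dc:creator>{creator}</dc:creator>"
        f"<cp:lastModifiedBy>{modified_by}</cp:lastModifiedBy>"
        f"<dcterms:created>{created}</dcterms:created>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core_xml)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_docx("jdoe", "asmith", "2013-05-01T09:00:00Z")
    print(read_core_properties(sample))
```

The same function can be pointed at a company's own published files before release, as a quick check on what a document would reveal to an outside reader.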

"A persistent attacker can create a piece of malware for a specific target and use information taken from documents to create a more targeted attack," says Chema Alonso, CEO of Eleven Paths. "By looking at metadata, they can identify people and figure out what internal servers they need to infect."

Yet companies that become more aware of metadata can collect data on and analyze their employees' activities to gain more visibility into their networks and detect anomalous activity. Frequently referred to as big data analytics, such monitoring and analysis projects can help companies identify what activities may need more scrutiny.

They are, however, not easy, says General Dynamics' Irace.

"Big data analytics brings to mind a magical black box that takes in all this raw data and produces a diamond of actionable knowledge, but it is much messier than that," he says. "It is much more human-driven."

Companies should dip their toe into collecting and analyzing metadata to gain experience and a grasp of what kinds of information should be collected and how the company should process it correctly, he says.
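As one possible first experiment of this kind, the sketch below compares each user's activity count for the current day against that user's own history and flags large upward deviations. It is an illustration under stated assumptions, not a method described by Irace: the input format (per-user lists of daily event counts taken from access logs), the function name `flag_anomalies`, and the three-standard-deviation threshold are all hypothetical choices.

```python
from statistics import mean, pstdev


def flag_anomalies(baseline, current, threshold=3.0):
    """Flag users whose current count is far above their own baseline.

    baseline: dict mapping user -> list of past daily event counts
    current:  dict mapping user -> today's event count
    Returns a list of (user, count, score) tuples, where score is the number
    of standard deviations above the user's baseline mean.
    """
    flagged = []
    for user, count in current.items():
        history = baseline.get(user, [])
        if len(history) < 2:
            # Too little history to judge; this sketch skips such users.
            continue
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Perfectly flat history: any increase counts as an extreme deviation.
            score = float("inf") if count > mu else 0.0
        else:
            score = (count - mu) / sigma
        if score > threshold:
            flagged.append((user, count, score))
    return flagged


if __name__ == "__main__":
    # Hypothetical daily counts of file-server accesses per user.
    baseline = {
        "alice": [10, 12, 11, 9, 10],
        "bob": [5, 6, 5, 4, 5],
    }
    current = {"alice": 11, "bob": 40}
    for user, count, score in flag_anomalies(baseline, current):
        print(f"{user}: {count} events today, {score:.1f} std devs above baseline")
```

A flag here is only a prompt for a person to look closer, which fits Irace's point that the work is human-driven rather than a black box.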

