Tech Insight: How Attackers Use Your Metadata Against You

Using easily accessible data about your files, bad guys can wreak havoc on your sensitive information

February 13, 2009

A Special Analysis For Dark Reading
First of Two Articles

To steal your identity, a cybercriminal doesn't have to have direct access to your bank account or other personal information. Often, he collects information about you from a variety of seemingly innocuous sources, then uses that data to map out a strategy to crack your online defenses and drain your accounts.

Such methods are well known to security professionals. What those same professionals often overlook is that this approach can also be used to crack the defenses of sensitive business files. Rather than trying to gain access to your data itself, the bad guys analyze the supposedly harmless information about your files -- collectively known as metadata -- and use it to develop attacks that can drain your business of its most sensitive information.

Metadata is a powerful feature of many document and file types, including Microsoft Office documents, PDFs, JPGs, ZIP files, and multimedia formats. Depending on the application and the file, metadata might contain information such as author names, user names, version of the software used to create the file, the user's operating system, and sometimes even the computer's MAC address. Armed with this data, an attacker can develop exploits that might work not only on a specific file, but on all similar file types in an enterprise.

The same data lets an attacker target individual users, as well as the computing environment within their enterprises. Several instances of metadata mishaps have been in the news in recent years. In one case, attackers used data they collected from the "track changes" feature in Microsoft Word. In another case, they took advantage of failed attempts to black out data in PDF files.

These cases make it clear: Once your documents leave the internal network -- either through email or Web publishing -- those files and the metadata they contain are fair game for attackers.

Many security professionals know about metadata, but they don't really know how it can be used against their organizations. The first stage of leveraging metadata for an attack is gathering it. Both attackers and pen testers have a bevy of tools available solely for this purpose.

The simplest way to gather the data is by using the native tool that created the document. For example, Word document metadata can be viewed through the Properties menu option in Microsoft Word, and previous edits can be revealed by enabling the "Track Changes" option. Similarly, Adobe Acrobat can display PDF metadata.

While manual extraction of metadata using native tools is definitely effective, it is possible to miss some of the hidden metadata. Plus, the process is slow and monotonous. Two readily available hacking tools -- MetaGooFil and CeWL -- were created to expedite the collection process by automating the search, download, and extraction of metadata from documents available on the Internet.
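To make the extraction step concrete, here is a minimal sketch of pulling metadata out of a single file without the native application. It assumes the newer Office Open XML formats (.docx, .xlsx, .pptx), which are ZIP packages that usually carry a docProps/core.xml part; older binary .doc files need a different parser. The function name and the choice of fields are illustrative assumptions, not how MetaGooFil or CeWL work internally.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical selection of fields; the part can hold more than these.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(path):
    """Return selected core properties from an OOXML file as a dict.

    `path` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return {}  # the core-properties part is optional

    root = ET.fromstring(xml_bytes)
    result = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            result[name] = element.text
    return result
```

Running the same function over your own published documents is a quick way to see which names leave the building with them.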

MetaGooFil was the first tool on the scene, and it uses Google to search for files of a specific type. Once it finds and downloads the files, it extracts the metadata and displays it in an HTML report that shows the information found in each file. The end of the report includes a summary of authors and file paths -- information that can be important later on, during other attack phases.
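The summary at the end of such a report is essentially a tally across files. The sketch below shows one way that roll-up could be done; the record layout (dicts with "authors" and "paths" lists) and the function name are assumptions made for illustration, not MetaGooFil's actual data model.

```python
from collections import Counter


def summarize(records):
    """Collapse per-file metadata records into report-style totals.

    Each record is assumed to be a dict with optional "authors" and
    "paths" lists of strings (a hypothetical layout for this sketch).
    """
    authors = Counter()
    paths = Counter()
    for record in records:
        for author in record.get("authors", []):
            if author.strip():
                authors[author.strip()] += 1
        for path in record.get("paths", []):
            if path.strip():
                paths[path.strip()] += 1
    # most_common() sorts by how many files mention each value.
    return {"authors": authors.most_common(), "paths": paths.most_common()}
```

Names that recur across many files tend to be the ones worth a closer look, whether you are attacking the organization or auditing it.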

CeWL takes a different approach, spidering a Website to create a word list that can be used for password brute-forcing. It can also collect email addresses, authors, and user names from metadata found in Microsoft Office documents. Included with CeWL is a "Files Already Bagged" (FAB) tool that processes files already acquired.
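The word-list idea is easy to picture for a single page that has already been downloaded. The following is a rough sketch using only the Python standard library; it is not CeWL's implementation, and the function name and the minimum word length of five are arbitrary assumptions.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def build_word_list(html, min_length=5):
    """Return the page's distinct words, most frequent first."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    words = re.findall(r"[A-Za-z]+", " ".join(collector.chunks))
    counts = Counter(w.lower() for w in words if len(w) >= min_length)
    return [word for word, _ in counts.most_common()]
```

The point of the exercise is that a site's own vocabulary (product names, project names, locations) often shows up in its users' passwords.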

Once collected, metadata can be used in many different attack techniques. Password brute-forcing is one of the most common. An attacker takes the word list created by CeWL and tries it against account names found in metadata. Those account names can be recovered from the author field, email addresses, and file paths (e.g., C:\Documents and Settings\User007).
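Pulling account names out of those fields is mostly pattern matching. Here is a minimal sketch that an auditor could run against metadata values already extracted from the organization's own files; the two regular expressions and the function name are illustrative assumptions and will miss less common path layouts.

```python
import re

# User-profile directories on Windows, Linux, and Mac OS X; the folder
# that follows is usually the login name.
_PROFILE_PATH = re.compile(
    r"(?:\\Documents and Settings\\|\\Users\\|/home/|/Users/)([^\\/]+)",
    re.IGNORECASE,
)

# Local part of an email address, which often doubles as the account name.
_EMAIL = re.compile(r"\b([A-Za-z0-9._%+-]+)@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def candidate_account_names(values):
    """Return sorted, de-duplicated account-name candidates."""
    found = set()
    for value in values:
        for match in _PROFILE_PATH.finditer(value):
            found.add(match.group(1))
        for match in _EMAIL.finditer(value):
            found.add(match.group(1))
    return sorted(found)
```

Given the path from the example above, the function returns User007, which is exactly the kind of detail a published document should not reveal.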

Metadata is also helpful in social engineering attacks. Knowing the five different authors of a document, an attacker can "drop names" via the phone to make his scheme seem more credible. Similarly, location information contained in photos could be mentioned, making the calls seem more legit.

Spear-phishing email could target all of the authors who worked on one particular document. Knowing which version of software was used to create the file, an attacker could also email client-side exploits to individuals who use particularly vulnerable versions of Microsoft Word or PowerPoint.

Metadata can also help with physical theft. For example, users may post images to Flickr or Twitter from a phone that has geotagging enabled. This information can give attackers the location of a target's home or business, and show where he might be on a daily basis. Similarly, the MAC address of the system can indicate the type of hardware used, making it easier to identify mobile workers who are likely to have laptops that are kept in places where they might be easy to steal.
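Both details are simple to interpret once extracted. EXIF stores GPS coordinates as degrees, minutes, and seconds plus a hemisphere reference, and the first three bytes of a MAC address form the vendor prefix (OUI) assigned to the hardware maker. The sketch below, with hypothetical function names, converts one coordinate to decimal degrees and isolates the OUI; looking the prefix up against the IEEE registry is left out.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert an EXIF-style coordinate to signed decimal degrees.

    `ref` is the hemisphere letter: "N", "S", "E", or "W".
    """
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -value if ref in ("S", "W") else value


def mac_oui(mac):
    """Return the 24-bit vendor prefix of a MAC address as 6 hex digits."""
    digits = "".join(ch for ch in mac if ch.isalnum()).upper()
    if len(digits) != 12 or any(ch not in "0123456789ABCDEF" for ch in digits):
        raise ValueError("not a 48-bit MAC address")
    return digits[:6]
```

A coordinate pair in decimal form can be pasted straight into any mapping service, which is why a single geotagged photo is enough to place a target.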

Metadata is commonly overlooked in corporate security defenses, but it can lead to disastrous results if used by a knowledgeable attacker. For a deeper treatment of the dangers, read Larry Pesce's excellent GCIH certification paper, "Document Metadata, The Silent Killer."

In our next Tech Insight, we'll look at how you can build defenses that limit an attacker's ability to collect and use metadata.

About the Author(s)

Dark Reading Staff

Dark Reading

Dark Reading is a leading cybersecurity media site.
