Risk

10/15/2015
11:00 AM
Joshua Goldfarb
Joshua Goldfarb
Commentary
Connect Directly
Twitter
RSS
E-Mail vvv
50%
50%

An Atypical Approach To DNS

It's now possible to architect network instrumentation to collect fewer data sources of higher value to security operations. Here's how -- and why -- you should care.

The idea for this column started when a colleague asked me a few questions about analyzing metadata generated from network traffic. We discussed a few possibilities, including various different types of metadata. A few minutes after our call, I recalled an email thread I had seen discussing ways to collect metadata about DNS transactions. Most of the solutions offered on that thread involved deploying new equipment to collect additional telemetry data. 

Suffice it to say, based on the discussions I have with customers, yet another piece of equipment to deploy, manage, and maintain is not high on their wish list. On the other hand, DNS data is extremely valuable for security operations and incident response. So how can we rectify this discrepancy? Perhaps taking an atypical approach to DNS is the answer.

Almost five years ago, I gave a talk at the GFIRST conference entitled "Uber Data Source: Holy Grail or Final Fantasy?" In this talk, I proposed that given the volume and complexity that a larger number of highly specialized data sources brings to security operations, it makes sense to think about moving towards a smaller number of more generalized data sources.  

One could also imagine taking this concept further, ultimately resulting in an "uber data source." Or, how about a more detailed discussion working within the context of network traffic data sources?  I consider host (AV logs), system (Linux syslogs), and/or application level (web server logs) data sources beyond the context of what I would consider generalizing into an uber data source, at least for network traffic data and meta-data.

Begin at the beginning

Let's start by looking at the current state of security operations in most organizations, specifically as it relates to network traffic data log collection. In most organizations, a large number of highly specialized network traffic data sources are collected. This creates a complex ecosystem of logs that clouds the operational workflow.  

In my experience, the first question asked by an analyst when developing new alerting content or performing incident response is "To which data source or data sources do I go to find the data I need?" I would suggest that this wastes precious resources and time. Rather, the analyst's first question should be "What questions do I need to ask of the data in order to accomplish what I have set out to do?" This necessitates a "go to" data source -- the "uber data source."

Additionally, it is helpful to highlight the difference between data value and data volume. Each data source that an organization collects will have a certain value, relevance, and usefulness to security operations. Similarly, each data source will also produce a certain volume of data when collected and warehoused. Data value and data volume do not necessarily correlate. For example, firewall logs often consume 80% of an organization's log storage resources, but actually prove quite difficult to work with when developing alerting content or performing incident response. Conversely, as an illustrative example, DHCP logs provide valuable insight to security operations, but are relatively low volume.

There is also another angle to the data value vs. data volume point. As you can imagine, collecting a large volume of less valuable logs creates two issues, among others:

  • Storage is consumed more quickly, thus reducing the retention period. (This can have a detrimental effect on security operations when performing incident response, particularly around intrusions that have been present on the network for quite some time.)
  • Queries return more slowly due to the larger volume of data. (This can have a detrimental effect on security operations when performing incident response, since answers to important questions come more slowly.)

Those who disagree with me will argue: "I can't articulate what it is, but I know that when it comes time to perform incident response, I will need something from those other data sources."  To those people, I would ask: If you're collecting so much data, irrespective of its value to security operations, that your retention period is cut to less than 30 days and your queries take hours or days to run, are you really able to use that data you've collected for incident response?  I think not.

If we take a step back, we see that the current "give me everything" approach to log collection involves a large number of highly specialized data sources. This is the case for a variety of reasons, but historical reasons and a lack of understanding regarding each data source's value to security operations are among them.  

A better way

If we think about what these data sources are conceptually, we see that they are essentially metadata from layer 4 of the OSI model (the transport layer) enriched with specific data from layer 7 of the OSI model (the application later) suiting the purpose of that particular data source. For example, DNS logs are essentially metadata from layer 4 of the OSI model enriched with additional contextual information regarding DNS queries and responses found in layer 7 of the OSI model. I would assert that there is a better way to operate without adversely affecting network visibility.

The question I asked back in 2011 was "Why not generalize this?" Why collect DNS logs as a specialized data source when the same visibility can be provided as part of a more generalized data source of higher value to security operations? It is now possible to architect network instrumentation to collect fewer data sources of higher value to security operations. This has several benefits:

  • Less redundancy and wastefulness across data sources
  • Less confusion surrounding where to go to get the required data
  • Reduced storage cost or increased retention period at the same storage cost
  • Improved query performance

There is no doubt in my mind that DNS is an incredibly valuable data source for security operations and incident response. Unfortunately, many organizations do not collect DNS data, only to find out later that not doing so has caused them great harm. Based on feedback I receive, one of the main reasons more organizations do not collect DNS data is because of the complexity, load, and cost associated with doing so. This is where the concept of the "uber data source" can help. Providing organizations a simpler, lighter-weight and cost-effective way to collect, retain, and analyze valuable layer 7 enriched metadata is a way to encourage them to collect DNS data, along with other valuable metadata. In this case, I think less is more.

Josh (Twitter: @ananalytical) is an experienced information security leader with broad experience building and running Security Operations Centers (SOCs). Josh is currently co-founder and chief product officer at IDRRA and also serves as security advisor to ExtraHop. Prior to ... View Full Bio
Comment  | 
Print  | 
More Insights
Comments
Newest First  |  Oldest First  |  Threaded View
High Stress Levels Impacting CISOs Physically, Mentally
Jai Vijayan, Freelance writer,  2/14/2019
Valentine's Emails Laced with Gandcrab Ransomware
Kelly Sheridan, Staff Editor, Dark Reading,  2/14/2019
Making the Case for a Cybersecurity Moon Shot
Adam Shostack, Consultant, Entrepreneur, Technologist, Game Designer,  2/19/2019
Register for Dark Reading Newsletters
White Papers
Video
Cartoon
Current Issue
5 Emerging Cyber Threats to Watch for in 2019
Online attackers are constantly developing new, innovative ways to break into the enterprise. This Dark Reading Tech Digest gives an in-depth look at five emerging attack trends and exploits your security team should look out for, along with helpful recommendations on how you can prevent your organization from falling victim.
Flash Poll
How Enterprises Are Attacking the Cybersecurity Problem
How Enterprises Are Attacking the Cybersecurity Problem
Data breach fears and the need to comply with regulations such as GDPR are two major drivers increased spending on security products and technologies. But other factors are contributing to the trend as well. Find out more about how enterprises are attacking the cybersecurity problem by reading our report today.
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
CVE-2019-8980
PUBLISHED: 2019-02-21
A memory leak in the kernel_read_file function in fs/exec.c in the Linux kernel through 4.20.11 allows attackers to cause a denial of service (memory consumption) by triggering vfs_read failures.
CVE-2019-8979
PUBLISHED: 2019-02-21
Koseven through 3.3.9, and Kohana through 3.3.6, has SQL Injection when the order_by() parameter can be controlled.
CVE-2013-7469
PUBLISHED: 2019-02-21
Seafile through 6.2.11 always uses the same Initialization Vector (IV) with Cipher Block Chaining (CBC) Mode to encrypt private data, making it easier to conduct chosen-plaintext attacks or dictionary attacks.
CVE-2018-20146
PUBLISHED: 2019-02-21
An issue was discovered in Liquidware ProfileUnity before 6.8.0 with Liquidware FlexApp before 6.8.0. A local user could obtain administrator rights, as demonstrated by use of PowerShell.
CVE-2019-5727
PUBLISHED: 2019-02-21
Splunk Web in Splunk Enterprise 6.5.x before 6.5.5, 6.4.x before 6.4.9, 6.3.x before 6.3.12, 6.2.x before 6.2.14, 6.1.x before 6.1.14, and 6.0.x before 6.0.15 and Splunk Light before 6.6.0 has Persistent XSS, aka SPL-138827.