Log Management Spurs Data Collection Debate

You have to know what to collect before you can analyze the data you gather

May 3, 2011

As log management and security information and event management (SIEM) experts pore over the latest results from the annual SANS survey on log management, debate lingers over whether organizations really have mastered the art of useful data collection, or whether they need to adjust their log collection behaviors to better enable more analysis down the road.

At first blush, consensus from the SANS report seems to be that most organizations have mastered log data collection, so now it is time to worry about such things as log data search, categorization, and correlation.

"We've got the collection down, and we've got the securing the logs and the chain of custody and those things that make the compliance auditors happy, but actually turning this information into something that is meaningful and actionable is the challenge," says Michael Maloof, CTO at TriGeo Network Security.

But when data arrives in such an avalanche that the tools at hand still cannot give organizations a consistent way to sift through it, how much collection is too much?

Some might argue that the better organizations get at collection without improving their ability to categorize and search the data, the more likely meaningless information is to drown out the important data. That point revives a long-raging debate about how much information organizations should really be collecting. Many experts believe organizations need to temper and focus their collection efforts for a long while, until their analysis capabilities catch up with the data they gather.

"First of all, ask yourself, can your event collection be more focused?" says Scott Crawford, a research director with Enterprise Management Associates. "Do you necessarily have to pick up data from everywhere, or are there key points where you really do need insight or where insight would be more valuable, rather than collecting all of it?"

According to Andrew Hay, senior security analyst with The 451 Group, the issue of deciding which data to collect is a balancing act. "There are two schools of thought. One is that some organizations say, 'I'm going to log absolutely everything and anything,' and then that becomes a management nightmare. Logging everything for any sort of real-time analytics or security operations is going to be very difficult," Hay says. "You really need to understand what those logs are before you log them. So the other camp says, 'Only log what you need.' But the challenge is, how many organizations really understand what they need?"

It is that question that makes Dr. Anton Chuvakin of Security Warrior Consulting lean toward amassing as much log data as possible at first, and then worrying more about how that data is reviewed.

"If you're in doubt, just collect it," he says. "The filter you apply is what you actually review and what you take action on. I would prefer to err on the side of too much data all of the time. Essentially you want to collect more data, but review less of it. That's the magic trick."

And, Chuvakin says, the only way to review more effectively is to practice.

"I would say if you can get daily, maybe weekly, log reviews in a consistent manner, then you can know better what to do with the data. You know when to scream and when to relax," he says. "If you have a repeatable, consistent process for log review, then you will detect your intrusions and you'd save more time and eventually understand where you could automate in correlations and with real-time tracking. Log review processes help to figure out what's normal, figure out what's not, and take action. To me that is more important than how to tune correlation rules; you learn that later."
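One way to picture the "collect more, review less" idea is a daily review that keeps every event but surfaces only the message patterns not seen during a baseline period. The Python sketch below is illustrative only and is not taken from Chuvakin or the SANS report: the function names, the normalization rules, and the sample log lines are all assumptions.

```python
import re
from collections import Counter


def normalize(message: str) -> str:
    """Collapse variable parts (IP addresses, numbers) so similar events share one pattern."""
    message = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<ip>", message)
    return re.sub(r"\d+", "<n>", message)


def review_queue(baseline_messages, todays_messages):
    """Return patterns seen today that never appeared in the baseline, with their counts."""
    known = {normalize(m) for m in baseline_messages}
    todays = Counter(normalize(m) for m in todays_messages)
    return {pattern: count for pattern, count in todays.items() if pattern not in known}


# Hypothetical sample data: everything is collected, but only new patterns are reviewed.
baseline = [
    "Accepted password for alice from 10.0.0.5 port 50122",
    "Accepted password for bob from 10.0.0.9 port 50999",
    "session closed for user alice",
]
today = [
    "Accepted password for alice from 10.0.0.7 port 41000",
    "Failed password for root from 203.0.113.4 port 2222",
    "Failed password for root from 203.0.113.4 port 2223",
    "session closed for user alice",
]

queue = review_queue(baseline, today)
for pattern, count in queue.items():
    print(f"{count:>4}  {pattern}")
```

In this toy data set, only the failed root logins reach the reviewer. The routine logins and session closes stay in storage but are left out of the daily review.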

Regardless of how many data feeds your organization depends on, the sheer volume of logs can actually be put to good use in and of itself, Crawford suggests.

"There are ways to take a different look at log data that might be indicative of an issue. Rather than looking at every single event and correlating individual events for possibility of high-risk activities, [look for] changes in log volume," he says. "These are things I would consider 'second-order' indicators. Sometimes an attack might itself create a volume of log data, so you see spikes and changes in the average amount of data. Conversely, if log data really dried up from a given source, it would suggest someone is either covering their tracks, has interfered with a service, or created some other disruption we should be aware of."
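The "second-order" indicator Crawford describes can be sketched as a comparison between each source's current log volume and its own recent history. The Python example below is a minimal illustration under stated assumptions, not a description of any vendor's product: the function name, the three-standard-deviation threshold, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev


def volume_alerts(history, current, threshold=3.0):
    """Flag sources whose current event count departs sharply from their history.

    history: {source: [event counts for past intervals]}
    current: {source: event count for the latest interval}
    """
    alerts = {}
    for source, counts in history.items():
        if len(counts) < 2:
            continue  # not enough history to estimate normal variation
        avg = mean(counts)
        spread = stdev(counts) or 1.0  # avoid dividing by zero on flat history
        observed = current.get(source, 0)
        score = (observed - avg) / spread
        if observed == 0 and avg > 0:
            alerts[source] = "silent"  # a normally active source has dried up
        elif score >= threshold:
            alerts[source] = "spike"
        elif score <= -threshold:
            alerts[source] = "drop"
    return alerts


# Hypothetical hourly event counts per log source.
history = {
    "firewall": [1000, 1100, 950, 1050],
    "web": [200, 220, 210, 190],
    "db": [50, 55, 45, 52],
}
current = {"firewall": 5200, "web": 205}  # "db" sent nothing this hour

alerts = volume_alerts(history, current)
print(alerts)
```

With these sample numbers, the firewall is flagged as a spike and the database as silent, while the web server's volume is treated as normal. A real deployment would also need to account for daily and weekly cycles in log volume.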

About the Author(s)

Dark Reading Staff

Dark Reading

Dark Reading is a leading cybersecurity media site.
