The end goal is having an intelligent understanding of the events within the organization. To reach this goal, one of the things we must do is extract meaningful information from logs while reducing the false positives and redundant information. Searching online leads to good -- and not-so-good -- examples of how to perform incident alerting using commercial or open-source tools.
Let’s assume we work for a standard, run-of-the-mill corporation and have been tasked with implementing log monitoring, correlation, and alerting, and with performing some basic tuning. Budget is limited, so we must achieve all of this on the cheap. Sound familiar?
The first step is defining the types of events we want to capture. These events should relate to the risks requiring response. In the beginning, we’ll focus on data that is easily logged, correlated, and acted on. After those event types are well-implemented, we move to event types that are high volume, less accurate, or that provide less value by themselves and require more correlation to be actionable.
Gathering data from systems isn’t too hard these days. Most systems support logging to a remote syslog server where logs can be centrally stored and processed. Applications logging to databases or flat files are easily indexed using Perl and Sys::Syslog. Windows doesn’t support syslog by default, but plenty of tools solve this problem as well. If you will need to conduct file integrity checking at some point, then OSSEC might be a good tool. If you are only interested in the task at hand, then you could choose Snare.
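To illustrate the flat-file case, here is a minimal sketch in Python rather than Perl, using only the standard library's logging.handlers.SysLogHandler. The function names, the logger name, the LOCAL0 facility, and the host and path in the usage comment are all assumptions made for illustration, not part of any particular product.

```python
import logging
import logging.handlers


def make_syslog_logger(host, port=514, name="app-forwarder"):
    """Return a logger that sends each record to a remote syslog server over UDP.

    The logger name and the LOCAL0 facility are arbitrary choices for this sketch.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
    )
    # Prefix each message with a tag so the central server can tell sources apart.
    handler.setFormatter(logging.Formatter(f"{name}: %(message)s"))
    logger.addHandler(handler)
    return logger


def forward_lines(lines, logger):
    """Send every non-blank line to the logger and return how many were sent."""
    sent = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines so they do not become empty events
        logger.info(line)
        sent += 1
    return sent


# Hypothetical usage (host name and path are placeholders):
# logger = make_syslog_logger("loghost.example.com")
# with open("/var/log/myapp.log") as fh:
#     forward_lines(fh, logger)
```

This reads the file once from start to finish. A real forwarder would also need to follow the file as it grows and cope with log rotation, which is left out here.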
Now that we know how to gather our data for processing, it's time to determine how we’ll actually process the data into actionable alerts. Doing this with free tools requires more work but is doable: Just be ready to roll up your sleeves a bit.
First off, we must decide what to use to collect our logs. Splunk and OSSIM both have free versions that package event collection, correlation, alerting, and a Web interface. Each has its limitations but gets you up and running quickly. If you’re more of the do-it-yourself type, Syslog-ng provides collection, event classification, and correlation, but lacks a pretty front-end. If you’re in the mood to get your hands dirty or have other requirements, try SEC.
So, logs are flowing from our sources into a central place, and we are processing some test alerts. Now we're ready to correlate. Where do we start? That depends on what you want to respond to. There are a few key event classes within any organization: authentication, object change, threat, and state change events are common, actionable, and critical to intelligently defending the organization.
Let’s start with authentication events since these are common. When a user logs into a server via SSH, several events could be logged:
Oct 19 20:08:05 hackfoo sshd: pam_unix(sshd:session): session opened for user adamely by (uid=0)
Oct 19 20:08:04 hackfoo sshd: Accepted publickey for adamely from 192.168.1.4 port 52311 ssh2
Oct 19 20:08:04 hackfoo sshd: [ID 800047 auth.info] Failed none for adamely from 192.168.1.4 port 37624 ssh2
Notice we have a failed login event, followed by a successful login via publickey, and finally an entry recording the opening of the user’s session.
The first event was actually not a failed login. OpenSSH tests for empty password authentication by trying to authenticate without a password by default, thus giving us our first false positive to work with. This entry will be present only if notices are being logged. It is not logged at higher logging levels.
To correlate these entries into a single event, we must group by commonalities between the entries. These entries all share the same application and process ID. In our chosen correlation solution, we’d correlate sshd events that have the same process ID: 20731 in our example. Now we know every sshd event related to the user’s session.
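The grouping step might look something like the following Python sketch. It assumes the common "app[pid]:" syslog layout. The sample lines are illustrative: the first three are modeled on the entries above with the process ID written in brackets, and the fourth (a second sshd process) is made up to show that separate sessions stay separate. The function and variable names are hypothetical.

```python
import re
from collections import defaultdict

# Classic syslog layout assumed here: "Mon DD HH:MM:SS host app[pid]: message"
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<app>[\w\-/.]+)\[(?P<pid>\d+)\]: "
    r"(?P<msg>.*)$"
)


def correlate_by_pid(lines, app="sshd"):
    """Group messages from one application by (host, pid).

    Returns a dict mapping (host, pid) to the list of messages, in input order.
    Lines that do not parse, or that belong to another application, are skipped.
    """
    sessions = defaultdict(list)
    for line in lines:
        m = SYSLOG_RE.match(line.strip())
        if m is None or m.group("app") != app:
            continue
        sessions[(m.group("host"), m.group("pid"))].append(m.group("msg"))
    return dict(sessions)


# Illustrative input only; the last line is invented for the example.
SAMPLE = [
    "Oct 19 20:08:04 hackfoo sshd[20731]: Failed none for adamely from 192.168.1.4 port 37624 ssh2",
    "Oct 19 20:08:04 hackfoo sshd[20731]: Accepted publickey for adamely from 192.168.1.4 port 52311 ssh2",
    "Oct 19 20:08:05 hackfoo sshd[20731]: pam_unix(sshd:session): session opened for user adamely by (uid=0)",
    "Oct 19 20:15:40 hackfoo sshd[20802]: Accepted publickey for adamely from 192.168.1.4 port 52340 ssh2",
]

if __name__ == "__main__":
    for (host, pid), messages in correlate_by_pid(SAMPLE).items():
        print(host, pid, len(messages), "event(s)")
```

The key includes the host name as well as the process ID, because once logs from many machines land in one place the same process ID will show up on more than one host.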
This is great, but what we really need is actionable information. We want to report on failed sshd login attempts so we can detect those pesky hackers. We start with the same rule we entered previously, but extend it to look only for entries with application sshd, the same process ID, and containing the word Failed. Splunk has a decent how-to on this that provides good information relevant to any solution. Just remember that neither Splunk's example nor ours accounts for all entry combinations due to logging levels and authentication types.
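As a rough sketch of such a rule in Python: match sshd entries containing "Failed", pull out the process ID, user, and source address, and skip the "Failed none" probe described earlier so it does not raise a false alarm. The regular expression covers only the message shapes shown here ("Failed <method> for [invalid user] <user> from <ip> port <n>"), and the function name and the ignore_methods parameter are assumptions for illustration.

```python
import re

# Matches sshd "Failed ..." entries, with or without the "[ID n facility.level]"
# tag that appears in the third sample entry earlier in the article.
FAILED_RE = re.compile(
    r"sshd\[(?P<pid>\d+)\]: (?:\[ID \d+ \S+\] )?"
    r"Failed (?P<method>\S+) for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port (?P<port>\d+)"
)


def failed_logins(lines, ignore_methods=("none",)):
    """Return one dict per failed sshd login, skipping ignored auth methods.

    "none" is ignored by default because that entry is a probe for
    empty-password authentication, not a real failed login.
    """
    hits = []
    for line in lines:
        m = FAILED_RE.search(line)
        if m is None:
            continue
        if m.group("method") in ignore_methods:
            continue
        hits.append(m.groupdict())
    return hits
```

Passing ignore_methods=() returns every "Failed" entry, including the probe, which can be handy for checking how much noise the exclusion removes. As noted above, a pattern like this will not cover every combination of logging level and authentication type.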
Data intelligence is critical to protecting the organization. We can have intelligent information only once we have intelligent log management through collection, classification, and correlation. So ensure that proper logging is enabled on systems and within applications, and that logs are backhauled to a central place for processing. Then begin walking through the threats and associated log entries. Add correlation, reporting, alerting, and you’re well on your way to responding to security events the smart way.