Big Data Security Or SIEM Buzzword Parity?If you attended the 2012 RSA Security Conference, BSides San Francisco, or the America’s Growth Capital Summit, you no doubt noticed claims of SIEM vendors jumping on the 'big data security' bandwagon
I doubt I could find anyone who would argue that there wasn't a wealth of security-pertinent data made available by the various deployed technical controls and corresponding user actions in an enterprise environment.
An argument that many would likely join, however, is the question of what data is relevant in a security context. Some might say that only network-level logs (such as firewall or IPS logs) and user-access-related logs are required, whereas others might include endpoint security logs, proxy-related logs, and maybe even deep packet inspection data. Something that we can likely all agree on, however, is that having access to information that might be required is likely better than lamenting not having access to it in the midst of a security incident.
The fact is, security has become a "big data" problem. If organizations want to collect all data (and we do mean ALL data) on the off chance that it might contain information pertinent to the success of the security program, then organizations need to start thinking less about security as a tangible defensive control and more as an abstraction layer atop enterprise data.
Just like they did for the security log management problem in the late 90s, SIEM vendors are now positioning themselves as the solution to the big-data security problem. After all, they already collect, correlate, and normalize disparate logs from various security controls and provide a window (via dashboards, search mechanisms, and report generation) into what's going on from a security standpoint.
Unfortunately, SIEM solutions were first invented to handle large volumes of data (usually from firewall, IDS, and router logs) with little variety (for example, standard syslog parsing) and a fairly consistent and predictable velocity. On the last point, sure there was the expectation of data bursts, but nothing of the magnitude of big data requirements. Also, the concepts of totality and exploration of the data have only been buoyed in the past few years with more organizations looking to extend SIEM monitoring beyond traditional security-centric (and often canned) constructs.
With the opening of the SIEM data repositories via APIs, third-party integration partners are pushing the frequency and dependency aspects beyond what the systems were ever intended to openly share past their respective borders -- resulting in never-before imagined bottlenecks and battles for critical system resources.
So why can't traditional SIEM products keep up with requirements? Well, there has been very little innovation in a technology with its roots in the late 90s and early 2000s. Unfortunately, the old adage of "if it ain't broke, don't fix it" applies in this sector. When we talk about big data as a big amorphous blob of data that may or may not have relevance to our security program, we find it hard to assign a sense of scale.
As an example of big data scale, we'll use a project at The Hospital for Sick Children in Toronto. That organization is leveraging IBM's InfoSphere Streams software to up to 1,000 readings per second from instrumented neonatal intensive-care beds in order to monitor the vital signs of premature babies and alert staff to the early signs of potentially life-threatening conditions. Although not a traditional enterprise security example, a security concern does come into play as increasingly more systems within the healthcare industry become interconnected and remotely managed.
You should also be able to easily see how monitoring of this nature could easily translate into the monitoring of SCADA or Industrial Control System environments, the financial trading floor, or any other industry where equipment requires constant real-time process monitoring in addition to a technological security requirement.
Security is a constantly evolving problem and, as we move forward, we'll need access to additional disparate data sources above and beyond security controls if we hope to grasp what is happening in our organizations. Try to think five moves ahead like chess, and see if you can identify the problematic pain points for data and user protection in your enterprise beyond the scope of your current control.
If I could ingrain anything into your minds from this blog post, it's that a SIEM solution that was incubated 10 years ago will likely be unable to claim true big-data-security support without embracing big data technology and concepts. There simply isn't a big-data "easy button," regardless of what you might be told.
To learn more about big data security, why not join me, Forrester’s John Kindervag, and Splunk’s Mark Seward in Austin this coming weekend for South by Southwest Interactive and our panel called the "Big Data Smackdown on Cybersecurity."
Andrew Hay is senior analyst with 451 Research's Enterprise Security Practice (ESP) and is an author of three network security books. Follow him on Twitter: http://twitter.com/andrewsmhay