You might say, "What are you talking about? There’s plenty! There are events per second, transactions per second, [mega|giga|tera|peta|exa]bytes of data, millions of malware samples, millions of botnet victims, number of false positives …"
But that’s not all that goes into marketing these days. Tell me, just how big do you have to get before you get to call yourself "Big Data"? What’s the number, and can we tell everyone who doesn’t meet that number, "Thanks for playing"? Or is the race just about "Whatever number they have, ours is bigger" ad infinitum (and ad nauseam)? Maybe everyone should just claim to be storing LOTTABYTES and be done with it.
Almost as soon as "Big Data" came along, there was someone to explain that it wasn’t the size that mattered; it was how you used it. (Maybe they were feeling inadequate and were compensating.) One of the first "new" metrics was based around speed: either throughput, or how close to real-time the processing occurred. Vendors touted their "line speed" or their ability to do all their analysis in-memory (since writing to disk tends to slow down the pipe a lot).
But what does speed matter, if the coverage is spotty? We’ve known for a long time that stateful firewalls, IDS/IPS and web application firewalls magically get a lot faster if you turn enough high-level checks off. Or, if you must have everything turned on – if you’ve gotta catch ‘em all – offloading them to specialized processors can keep the main traffic flowing unimpeded. But then you could argue that that’s cheating, and it’s not as close to real-time anymore.
Vendors also tout the number of inputs that go into their offerings: how many other security technologies they integrate with (where "integrate" may just mean "we consume syslog, CSV and XML"). If you want to get fancier than just saying what data formats you accept, you can say you have an API, regardless of how many other tools actually use it. (When it comes to integration claims, I think API is the new XML, but someone may want to dispute that.)
Now that we’ve put size, speed and coverage to bed, someone’s going to bring up "intelligence." How do you rate the analytics offered today with most monitoring systems? Do you measure it by the number of patents held by each vendor for their algorithms? Is it whether their data scientists went to Stanford or MIT? How about the number of factors they use to calculate their risk and severity scores? What’s the IQ of a SIEM?
After the analytics skirmishes, the other kind of "intelligence" came up, namely the number and variety of additional inputs to the algorithms: reputation, geolocation, indicators of compromise, or possibly the number of former government intelligence analysts in the research team (and/or on the board of directors).
It’s extremely hard to measure and compare intelligence in this context, so some vendors resort to counting false positives. I’m dubious about how well that works, since a false positive can be in the eye of the beholder. If an alert has to travel through only two levels of analyst instead of three before it gets discounted, is it "less false"?
And then it’s back to numbers: the number of external intelligence feeds that are used to enrich the data that the monitoring system processes. (Still with me? Stay with the group; don’t get lost.) But are ten feeds necessarily better than one? Are a hundred feeds better than ten? How much more confidence are you getting, and after which number of feeds does the confidence level plateau?
Finally, the latest attempt at differentiation uses the word "actionable." Again, how do you measure that? The word connotes a binary condition: either you can do something with it, or you can’t. Can one system produce data that is "more actionable" than another one, and if so, how do you prove it?
I expect that the next salvo fired in the Monitoring Metrics Wars will be the originality or uniqueness of the data. Perhaps the freshness, too. Not only will the data be processed "live" (which is supposed to be better than "real-time," I understand – or maybe it’s the other way around), but it’ll be newer than anyone else’s data, still dewy from the data fields. It’ll be organic, locally sourced, internally generated, and home-made. Just like Mother used to analyze.
One thing’s for sure: buyers will still be wading through the marketing morass, trying to search out bits of dry land that will hold up to a purchasing decision. Not only will they have trouble differentiating vendors and their offerings; they’ll also struggle to find metrics that tell them when their monitoring is good enough. There are few comparisons out there that are both objective and complete. But I personally would pay good money to see an Actionability Bakeoff.
Wendy Nather is Research Director of the Enterprise Security Practice at the independent analyst firm 451 Research. You can find her on Twitter as @451wendy.