Dark Reading is part of the Informa Tech Division of Informa PLC



10:30 AM
Jay Jacobs

How To Put Data At The Heart Of Your Security Practice

First step: A good set of questions that seek out objective, measurable answers.

“When you can measure what you are speaking about, and express it in numbers, you know something about it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts advanced to the stage of science.” -- Lord Kelvin, 1883

Lord Kelvin wrote those words over 130 years ago. He was addressing a group of civil engineers on the topic of practical applications of electricity. While electricity as a physical science may (or may not) seem like a far jump from information security, how we improve our knowledge and understanding has remained relatively constant: we learn by observing and measuring our environment. Security programs are only now beginning to absorb this lesson, and many people are asking how to build a security practice that has data analysis at the heart of the decision-making process.

However, before we talk about how to approach the integration of data and measurement into the security decision-making process, let’s talk about how you should not begin. This is important because many data-driven security programs are doomed at their first step because people don’t ask the right questions. Instead, organizations look at the data they have and try to pull out things that are “interesting” or that they think will help drive their program. This only leads to metrics that are convenient and not necessarily useful, and it ends up wasting a lot of time and energy for everyone involved.

Anything worth doing is worth asking questions about
To build a data-driven security practice, start by defining a list of questions that, if answered, would not only help drive decisions but also help you evaluate how good those decisions were down the road. Defining such questions is tricky because they can’t be just any old questions; they must seek out objective answers. As Bill James, who spent his life studying and reporting on the statistics of baseball, once said, “My job was to find questions about baseball that have objective answers, that’s all that I do, that’s all that I’ve done.” So rather than ask, “How secure am I?” perhaps a better question is “How many security events did we have last quarter?” Or maybe dig even deeper with, “What types of security events do we spend the most time on?” Through this approach, you can objectively answer, “How secure am I?” with multiple points that are grounded in data.
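As a rough illustration of how the two example questions above turn into measurable answers, here is a minimal Python sketch. The event log, its field layout, and the function names are all hypothetical, invented for this example; a real program would pull these records from a ticketing or incident-tracking system.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical event log: (date closed, event type, analyst hours spent).
# Every value here is invented for illustration only.
EVENTS = [
    (date(2015, 4, 3), "phishing", 1.5),
    (date(2015, 4, 20), "malware", 4.0),
    (date(2015, 5, 11), "phishing", 2.0),
    (date(2015, 6, 2), "lost device", 3.0),
    (date(2015, 6, 17), "malware", 6.5),
    (date(2015, 7, 8), "phishing", 1.0),
]


def quarter(d):
    """Map a date to a (year, quarter) pair, e.g. (2015, 2)."""
    return (d.year, (d.month - 1) // 3 + 1)


def events_per_quarter(events):
    """Answer: how many security events did we have in each quarter?"""
    return Counter(quarter(d) for d, _, _ in events)


def hours_by_type(events):
    """Answer: which event types do we spend the most time on?

    Returns (event type, total hours) pairs, largest total first.
    """
    totals = defaultdict(float)
    for _, kind, hours in events:
        totals[kind] += hours
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (year, q), count in sorted(events_per_quarter(EVENTS).items()):
        print(f"{year} Q{q}: {count} events")
    for kind, hours in hours_by_type(EVENTS):
        print(f"{kind}: {hours:.1f} hours")
```

Neither function says anything about how secure the organization is on its own, but each gives an objective data point that can be tracked over time.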

Another approach is to pause before you sign off on the next security purchase and ask what observable actions in the environment the team would expect to influence with this purchase. This is not an easy challenge; sometimes decisions are made to stop something that hasn’t happened yet. In that case, the questions may look externally: “How many breaches were disclosed by other organizations like us?” Taking an outside-in approach will broaden the sources of data and help answer some of those tough questions.

Once you have a list of questions that you’d like answered, look for data sources to answer them. Chances are extremely high that you aren’t collecting all the data you’ll need. The good news is that many organizations are asking the same questions and vendors are beginning to respond with data-driven solutions. For example, perhaps you don’t have to measure all of your industry peers: there are vendors and industry reports offering up answers that you can draw from.

Simple answers are still answers
Start trying to answer your questions with simple counts. Counting things is the first big step in being data-driven, and the number of questions that can be answered with a simple count may surprise you. Simple counts paint with large brush strokes and may answer many of your initial questions. But sooner or later, you will want to compare two different counts. Perhaps the comparison will be as simple as one month against the next, but the need to compare is inevitable. Be forewarned: this comparison is the second big step towards being data-driven. Someone will ask, “Is the difference significant?” and that new question will set you traveling down a path towards statistical thinking. Don’t panic: statistics has already helped shape the evolution of many other fields, and resistance is futile.
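One way to approach the “Is the difference significant?” question for two counts is sketched below in Python. This is only one possible technique, not a method prescribed by the article: it assumes the two periods are the same length and that events arrive roughly independently, so that under the "no real change" hypothesis each event is equally likely to fall in either period (an exact two-sided binomial test). The function name and the sample counts are hypothetical.

```python
from math import comb


def two_count_p_value(count_a, count_b):
    """Exact two-sided test that two counts share the same underlying rate.

    Assumes equal-length observation periods and independent events.
    Under that assumption, given the combined total n, count_a follows
    a Binomial(n, 0.5) distribution if nothing has really changed.
    Intended for modest counts (up to several hundred in total).
    """
    n = count_a + count_b
    if n == 0:
        return 1.0  # no events at all, so no evidence of a difference
    # Probability of every possible split of the n events between periods.
    probs = [comb(n, k) * 0.5 ** n for k in range(n + 1)]
    observed = probs[count_a]
    # Sum the probability of all splits at least as unlikely as the one seen.
    # The small tolerance guards against floating-point ties.
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical monthly event counts, invented for illustration.
    for last_month, this_month in [(22, 18), (30, 10)]:
        p = two_count_p_value(last_month, this_month)
        print(f"{last_month} vs {this_month}: p-value = {p:.4f}")
```

A small p-value (a common but arbitrary cutoff is 0.05) suggests the change is unlikely to be chance alone; a large one suggests the two months may simply reflect normal variation. If the periods differ in length or the events cluster together, the assumptions above no longer hold and a different test would be needed.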

There is good news and bad news at this point. The good news is that we can get a lot of answers (and therefore benefit) with some relatively entry-level calculations.

The bad news is that the list of questions will grow exponentially as previous ones get answered. Soon you will be asking, “How do I compare to my peers?” and “How can I measure my vendors and third parties?” Congratulations, you are well on your way to a security practice that has data analysis at the heart of the decision-making process!


Jay Jacobs has over 15 years of experience within IT and information security with a focus on cryptography, risk, and data analysis. Most recently, he has joined BitSight Technologies, the Standard in Security Ratings, as their Senior Data Scientist. Previously, he was a Data ...

User Rank: Apprentice
8/6/2015 | 12:12:56 PM
The lego block analogy

Good article, Jay. Always a useful reminder to break down macro objectives into building-block-level questions and answers first.

User Rank: Ninja
7/28/2015 | 1:19:07 PM
Occam's Razor
To understand something you need to break it down to its most basic components. This is an idea prevalent throughout physics. Relating to "Simple answers are still answers," using Occam's Razor can definitely help by simplifying procedures such as ones that deal with the filtering of data. Using the fewest assumptions to develop security principles will provide a stronger framework than the opposite ideology, which could over-complicate things.