6 Steps for Applying Data Science to Security

Two experts share their data science know-how in a tutorial focusing on internal DNS query analysis.
1. Define the business problem
2. Decide what data sources would be best to solve the problem
3. Take an inventory of the data
4. Experiment with many data science techniques
5. Test for a real-world perspective
6. Follow-up and continuous improvement

Security practitioners are being told that they have to get smarter about how they use data. The problem is that many data scientists are lost in their world of math and algorithms and don’t always explain the value they bring from a business perspective.

Dr. Kenneth Sanford, analytics architect and sales engineering lead at Dataiku, says security pros have to work more closely with data scientists to understand what the business is trying to accomplish. For example, is compliance the goal? Or is the company looking to determine what a ransomware attack might cost it?

"It’s really important to define the business problem," Sanford says. "Something like what downtime would cost the business, or what the monetary fine would be if the company were out of compliance."

Bob Rudis, chief data scientist at Rapid7, adds that companies need to take a step back, look at their processes, and decide what could be done better with data science.

"Companies need to ask themselves how the security problem is associated with the business problem," Rudis says.

Sanford and Rudis created a six-step process for building a model to analyze internal DNS queries, with the goal of reducing or eliminating malicious code in those queries.
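
To make the idea of analyzing internal DNS queries a little more concrete, here is a minimal Python sketch of one common starting point: scoring each queried name by its length and by the character entropy of its longest label, since machine-generated names often look more random than human-chosen ones. This is not the model Sanford and Rudis describe. The function names, the feature set, and the thresholds (3.5 bits and 40 characters) are assumptions chosen purely for illustration, and the sample names are made up.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def score_query(qname: str) -> dict:
    """Compute a few simple features for one queried name.

    The feature set is hypothetical; a real model would likely use more.
    """
    name = qname.rstrip(".").lower()
    labels = name.split(".")
    longest = max(labels, key=len)
    return {
        "qname": name,
        "length": len(name),
        "label_count": len(labels),
        "longest_label_entropy": shannon_entropy(longest),
    }


def flag_suspicious(qnames, entropy_threshold=3.5, length_threshold=40):
    """Return names whose features meet or exceed the illustrative thresholds."""
    flagged = []
    for qname in qnames:
        features = score_query(qname)
        if (
            features["longest_label_entropy"] >= entropy_threshold
            or features["length"] >= length_threshold
        ):
            flagged.append(features["qname"])
    return flagged


if __name__ == "__main__":
    sample = [
        "www.example.com",
        "x7f3k9q2zv8b1m4n6p0wjd5ht.example.com",  # made-up random-looking name
    ]
    print(flag_suspicious(sample))
```

A fixed threshold like this will produce false positives (content delivery networks, for example, often use random-looking hostnames), which is one reason the later steps on experimenting with techniques and testing against real-world traffic matter.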
