Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

6/28/2013
11:37 PM

Machine-Learning Project Sifts Through Big Security Data

As the volume of data created by security and network devices multiplies, researchers look for ways to teach computers to better highlight attack patterns

As an information-security consultant, Alexandre Pinto spent 12 years helping companies set up difficult-to-configure systems to cull security intelligence from logs and security events.

Yet configuring the systems took months of work, and even then they needed constant maintenance to detect the latest threats and pinpoint likely malicious traffic. He realized that while companies may want to monitor their networks for threats, they typically have too few security people to work through data from far too many logs -- a problem that will only get worse as companies seek to sift through more operational data to detect threats. Big data could be the downfall of security if companies don't find better ways of dealing with the growing volumes, he says.

"What chance do we have: We can't find the needle in the haystack as it is now, and now the haystack is 100,000 times larger," he says. "We are going to need help."

To help solve the problem, Pinto has worked during the past six months on a machine-learning system that can take logs and identify traffic that originates from suspicious neighborhoods of the Internet. Dubbed MLSec, the project uses supervised learning algorithms to identify networks that are home to malicious actors. Pinto plans to demonstrate the tool at the Black Hat Security Briefings in July.

The independent researcher started with data from the SANS Institute's DShield project, which gathers firewall logs from participating community members. Pinto trained the system on 1.2 million events from 30 million log entries as well as other data submitted by volunteers. When he compared its results with known blacklists, the machine-learning algorithm appeared to be accurate in 92 to 95 percent of cases. Unlike blacklists, however, the system does not need to be told which networks are malicious; it creates its own representation of the Internet.
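
To make the approach concrete, here is a minimal sketch of supervised learning over firewall events. It is not Pinto's MLSec code. It assumes each event is a (source IP, destination port) pair, groups sources into /24 networks, and trains a small logistic-regression model on two illustrative features. The function names, the features, the labels, and the toy private-range addresses are all hypothetical.

```python
import math
from collections import defaultdict
from ipaddress import ip_network


def network_of(ip, prefix=24):
    """Map an IPv4 address to its enclosing network (a /24 by default)."""
    return str(ip_network(f"{ip}/{prefix}", strict=False))


def features_by_network(events):
    """Aggregate (source_ip, dest_port) events into per-network features."""
    counts = defaultdict(int)
    ports = defaultdict(set)
    for src_ip, dst_port in events:
        net = network_of(src_ip)
        counts[net] += 1
        ports[net].add(dst_port)
    # Hypothetical features: log-scaled event count and distinct ports probed.
    return {n: [math.log1p(counts[n]), float(len(ports[n]))] for n in counts}


def predict(weights, bias, x):
    """Return the modeled probability that a network is malicious."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, epochs=2000, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Toy labeled history: two port-scanning networks, two quiet ones (assumed).
training_events = (
    [("10.0.1.5", p) for p in (21, 22, 23, 80, 445)]
    + [("10.0.2.9", p) for p in (22, 23, 3389, 5900)]
    + [("10.0.3.7", 443), ("10.0.3.8", 443)]
    + [("10.0.4.2", 443)]
)
labels_by_network = {
    "10.0.1.0/24": 1, "10.0.2.0/24": 1,  # 1 = labeled malicious
    "10.0.3.0/24": 0, "10.0.4.0/24": 0,  # 0 = labeled benign
}

train_features = features_by_network(training_events)
nets = sorted(train_features)
weights, bias = train_logistic(
    [train_features[n] for n in nets],
    [labels_by_network[n] for n in nets],
)

# Score networks the model has not seen before.
new_events = [("10.0.9.1", p) for p in (21, 22, 23, 25, 80, 445)]
new_events += [("10.0.8.1", 443)]
scores = {
    net: predict(weights, bias, x)
    for net, x in features_by_network(new_events).items()
}
for net, score in sorted(scores.items()):
    print(net, round(score, 3))
```

Checking scores like these against a known blacklist, as Pinto did, is one way to estimate how often the model agrees with existing intelligence.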

Such a system can help information security managers by more accurately flagging traffic coming from questionable areas of the Internet, says Johannes Ullrich, dean of research for the SANS Technology Institute. In addition, the system can give administrators the best guess of the maliciousness of incoming traffic based on incomplete information, he says.

"It really helps to direct the attention of the security administrator," Ullrich says. "The big-data approach filters the data down to a subset, so you know what is worth looking at."

For companies with overworked staff, the ability to separate run-of-the-mill data from the interesting -- potentially malicious -- traffic can be a great benefit. In addition, a machine-learning system can be built to adapt far faster than a human can as attackers change their tactics, Pinto argues.

"The model will outperform the expert because the model does not forget the data," he says. "It selectively diminishes the weight of what happened before, as time goes by, but it does not forget it."
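
One common way to get the behavior Pinto describes, where older events count for less but never drop to zero, is exponential time decay. The sketch below illustrates that general technique and is not a description of how MLSec weights its data; the function names and the 30-day half-life are assumptions chosen for the example.

```python
def decayed_weight(age_days, half_life_days=30.0):
    """Weight of an event that is age_days old.

    The weight halves every half_life_days (an assumed value) but stays
    above zero, so old events fade without being discarded.
    """
    return 0.5 ** (age_days / half_life_days)


def decayed_score(event_ages_days, half_life_days=30.0):
    """Sum the decayed weights of all events seen from one source."""
    return sum(decayed_weight(age, half_life_days) for age in event_ages_days)


# Three events from this week versus three events from about a year ago.
recent_source = decayed_score([1, 2, 3])
stale_source = decayed_score([360, 365, 370])
print(recent_source > stale_source)
```

A shorter half-life makes such a model react faster to changing tactics, while a longer one keeps more of the historical record in play.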

Pinto plans to make the system available as a service so that anyone can upload firewall logs. In exchange, users will get a report that summarizes the system's findings.

In the end, the more people who use the system, the better the results should be, Pinto says.

"In machine learning, of course, the algorithm is important, but the more data that you throw at it, the better," he says. "This is the perfect fit with data security. The more you are attacked, the better your defenses should get."
