Analytics // Security Monitoring
3/9/2012
12:53 PM

A Case Study In Security Big Data Analysis

At the RSA Conference, Zions Bancorporation showed how Hadoop and BI analytics can power better security intelligence


Many RSA attendees had a hard time figuring out what vendors meant when they referred to "big data" at the show, and perhaps even the vendors themselves were a bit fuzzy on the definitions, but talk about big data in security wasn't purely hype. In fact, the show served as a proving ground for practitioners at one financial institution to show how they’ve been able to use Hadoop-driven clusters and business intelligence (BI) tools to parse more data far more quickly than with traditional SIEM tools.

The result has given that institution, Salt Lake City-based Zions Bancorporation, the ability to come closer to tasting that elusive fruit of the security monitoring world: achieving actionable intelligence on a real-time basis.

According to Preston Wood, CSO at Zions and the moderator of a panel of his Zions team members, the institution has spent the past several years trying to move to a more data-driven approach to its security practice. But it kept running into the limitations of its traditional SIEM tools.

To drive deeper forensics and to train statistical machine-learning models, Zions found it needed months or even years of data before the analysis became useful. That quantity of data, and the frequency analysis of events across it, was too much for SIEM to handle alone.

“We [knew] we’d be bumping our heads against the ceiling with SIEM fairly early on,” Wood said. “The underlying data technology just couldn’t handle it.”

What’s more, the analysis itself was watery. The team was swimming in data but had a hard time turning that into action.

“The SIEM is good for telling the data what to do,” Wood said. “But who is telling us what to do?”

The pivotal point came with Hadoop, which allowed the company to use its data in a new, more effective way. Open-source Hadoop, with its implementation of the MapReduce programming model popularized by Google, has made life much different for Zions.

“The crux of the system is the distributed file system,” said Mike Fowkes, director of fraud prevention and analytics for Zions. The file system makes it practical for administrators to write Java-based queries that run against data spread across multiple systems. This allows more timely analysis of a greater volume of data than was previously possible.
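The pattern Fowkes describes, one query fanned out over data held on many machines, can be sketched roughly as follows. This is a toy, single-process illustration of the MapReduce model written in Python for brevity; it is not Zions' code and does not use the Hadoop Java API. The shard contents, the log format, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical log shards, standing in for file blocks stored on different nodes.
SHARDS = [
    ["10.0.0.5 LOGIN_FAIL", "10.0.0.7 LOGIN_OK", "10.0.0.5 LOGIN_FAIL"],
    ["10.0.0.5 LOGIN_FAIL", "10.0.0.9 LOGIN_FAIL"],
]


def map_phase(line):
    """Emit (source_ip, 1) for each failed login; emit nothing otherwise."""
    ip, event = line.split()
    if event == "LOGIN_FAIL":
        yield ip, 1


def shuffle(pairs):
    """Group mapped values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def run_job(shards):
    # In a real cluster each shard would be mapped on the node that stores it.
    mapped = [pair for shard in shards for line in shard for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


failed_logins = run_job(SHARDS)
```

Because each shard is mapped independently, adding nodes lets more of the data be scanned in parallel, which is the property the panel credited for the faster analysis.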

Zions’ results have been dramatic. In an environment where security systems generate 3 terabytes of data a week, just loading the previous day’s logs into the system can be a challenge. It used to take a full day, Fowkes said.

“With MapReduce, HIVE, and Hadoop, we’re doing it in near-real-time fashion,” he said. “We’re pulling in data every five minutes, hourly, every two minutes -- it just depends on the frequency of how fresh our data needs to be.”

Searches themselves have sped up even more dramatically. Previously, searching a month’s worth of logs could take anywhere from 20 minutes to an hour, depending on how busy the server was, he said.

“In our environment within HIVE, it has been more like a minute to get the same deal,” Fowkes said.

Aside from a boost in data-mining firepower, Hadoop’s distributed file system, HDFS, brings a robust level of availability to the data warehouse environment, too.

“If you’re running a job and something fails on a system, it will dynamically readjust,” said Fowkes, explaining that a failure of a node or a hard drive isn’t the show-stopper it used to be. Instead, the system is able to reapportion the data based on the number of remaining nodes.
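The readjustment Fowkes mentions rests on block replication: each block of a file is stored on more than one node, and when a node drops out the missing copies are rebuilt on the survivors. The sketch below is a simplified model of that idea in Python, not HDFS itself. The replication factor of 2, the round-robin placement, the least-loaded repair rule, and the node and block names are all assumptions chosen to keep the example small.

```python
REPLICATION = 2  # assumed for this sketch; real clusters configure their own value


def place_blocks(blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = {nodes[(i + r) % len(nodes)] for r in range(replication)}
    return placement


def handle_node_failure(placement, failed, live_nodes, replication=REPLICATION):
    """Drop the failed node and re-replicate any block that lost a copy."""

    def load(node):
        # Number of blocks currently stored on this node.
        return sum(node in holders for holders in placement.values())

    for holders in placement.values():
        holders.discard(failed)
        while len(holders) < replication:
            candidates = [n for n in live_nodes if n not in holders]
            if not candidates:
                break  # not enough live nodes to restore full replication
            holders.add(min(candidates, key=load))  # prefer the least-loaded node
    return placement


nodes = ["n1", "n2", "n3"]
placement = place_blocks(["b0", "b1", "b2", "b3"], nodes)
placement = handle_node_failure(placement, failed="n2", live_nodes=["n1", "n3"])
```

After the simulated failure of `n2`, every block is again held by two live nodes, so a job reading those blocks can continue without the lost machine.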

With a fast and effective infrastructure up and running, Zions uses the data for dozens of purposes. Database, firewall, antivirus, and IDS logs, plus industry-specific logs like wire ACS deposit applications and credit data, are all pulled together into a centralized syslog server.
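Centralizing logs from many sources usually means normalizing each line into a common record before analysis. The article does not say which syslog format Zions uses, so the sketch below assumes classic BSD-style lines (a `<PRI>` prefix, timestamp, host, and tag); the sample line and the field names in the returned record are hypothetical.

```python
import re

# Assumed BSD-style syslog layout: <PRI>Mmm dd hh:mm:ss host tag[pid]: message
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?:\s?"
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Return a normalized record for one syslog line, or None if it does not match."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        "facility": pri // 8,  # the priority value encodes facility * 8 + severity
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "source": match.group("tag"),
        "message": match.group("message"),
    }


record = parse_syslog("<34>Oct 11 22:14:15 fw01 kernel: DROP src=10.0.0.5 dst=10.0.0.9")
```

Once firewall, antivirus, and application logs share one record shape, the same downstream queries can run across all of them.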

While queries are written in Java, it takes more than an off-the-shelf Java programmer to put together meaningful queries and make sense of what they return. That’s where Aaron Caldiero comes in. As senior data scientist at Zions, he describes his role as “part computer scientist, part statistician, and part graphic designer.”

Caldiero's job is to collect and centralize the data, design methods of synthesizing it (ranging from basic logic to machine-learning algorithms), and then present it in a coherent way.

His approach has achieved impressive results for his organization, but it may be foreign to security professionals.

“It’s a bottom-up process where you’re putting the data first,” Caldiero said.

Compiling huge amounts of data allows analysts to spot trends, patterns, or correlations that they might never have found had they put the questions first and then sorted through terabytes of data for the answers.
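One simple form of this data-first approach is frequency analysis: build a baseline of how often each entity normally generates an event, then let the data surface whatever departs from it. The sketch below uses a plain z-score, which is an assumption for illustration and not a method the panel described. The account names, counts, threshold, and the floor of 1.0 on the standard deviation are all hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(history, today, threshold=3.0):
    """Return {key: z_score} for keys whose count today is far above their baseline.

    history: {key: [past daily counts]}, today: {key: today's count}.
    """
    flagged = {}
    for key, past in history.items():
        if len(past) < 2:
            continue  # too little history to estimate a spread
        baseline = mean(past)
        spread = max(stdev(past), 1.0)  # floor keeps very quiet keys from over-triggering
        z_score = (today.get(key, 0) - baseline) / spread
        if z_score > threshold:
            flagged[key] = z_score
    return flagged


# Hypothetical daily counts of failed logins per account.
history = {"alice": [3, 4, 5, 4], "bob": [10, 12, 11, 9]}
today = {"alice": 4, "bob": 40}
flagged = flag_outliers(history, today)
```

No one had to ask "is bob's account under attack?" in advance; the deviation from bob's own history is what raises the question. Longer histories give steadier baselines, which fits the panel's point that months or years of data were needed.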

It’s an approach that has worked well for Zions, and Wood and his team believe it could be applied well elsewhere. Wood stressed that the power of big data analytics isn’t just for big companies, either.

“You can start with a single box in your environment,” he said, adding that the technology is well-suited for security, but that the expectation needs to be set that “big data strategy is a journey, not a destination. It’s not a product you’re going to buy; it’s not something you’re going to stand up there and be done with.”

Have a comment on this story? Please click "Add Your Comment" below. If you'd like to contact Dark Reading's editors directly, send us a message.

Comments
JCharles,
User Rank: Apprentice
1/21/2013 | 4:30:58 PM
re: A Case Study In Security Big Data Analysis
Most organizations would like to do Big Data Mining & SIEM but they can't afford lengthy & costly Hadoop developments. But there are working solutions out there like Secnology.