Dark Reading is part of the Informa Tech Division of Informa PLC


Cloud

9/10/2019
10:00 AM
Howie Xu
Commentary

AI Is Everywhere, but Don't Ignore the Basics

Artificial intelligence is no substitute for common sense, and it works best in combination with conventional cybersecurity technology. Here are the basic requirements and best practices you need to know.

The fourth industrial revolution is here, and experts anticipate organizations will continue to embrace artificial intelligence (AI) and machine learning (ML) technologies. A forecast by IDC indicates spending on AI/ML will reach $35.8 billion this year and hit $79.2 billion by 2022. Though the principles of the technology have been around for decades, the more recent mass adoption of cloud computing and the flood of big data have made the concept a reality.

The result? Companies based around software-as-a-service are best positioned to take advantage of AI/ML because cloud and data are second nature to them. 

In the past five years alone, AI/ML went from a technology that showed lots of promise to one that delivers on that promise, thanks to the convergence of easy access to inexpensive cloud computing and the integration of large data sets. AI and ML adoption has already begun to accelerate in cybersecurity. With mountains of data that only continue to grow, machines that analyze data bring immense value to security teams: They can operate 24/7, and humans can't.

For your cybersecurity team to effectively launch AI/ML, be sure these three requirements are in place:

1. Data: If AI/ML is a rocket, data is the fuel. AI/ML requires massive amounts of data to help it train models that can do classifications and predictions with high accuracy. Generally, the more data that goes through the AI/ML system, the better the outcome.

2. Data science and data engineering: Data scientists and data engineers must be able to understand the data, sanitize it, extract it, transform it, load it, choose the right models and right features, engineer the features appropriately, measure the model appropriately, and update the model whenever needed.

3. Domain experts: They play an essential role in constructing an organization's dataset, identifying what is good and what is bad and providing insights into how this determination was made. This is often the aspect that gets overlooked when it comes to AI/ML.

Once you have these three requirements, the engineering and analytics teams can move to solving very specific problems. Here are three categories, for example:

1. Security user risk analysis: Just like a credit score, you can derive a risk score from a user's behavior, and with AI/ML you can now scale it to a very large number of users.

2. Data exfiltration: With AI/ML, you'll be able to identify patterns more readily — what's normal, what's abnormal. 

3. Content classification: Variants on web pages, ransomware strains, destination, and more. 
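The first two categories can be sketched together in a few lines. The following is a minimal illustration, not a production method: it assumes a single hypothetical signal (a user's daily outbound megabytes), treats the user's own history as "normal," and maps the deviation to a 0-100 risk score. The function name, the sample numbers, and the 5-sigma cap are all assumptions made for this example.

```python
from statistics import mean, stdev

def risk_score(history, today, cap=100.0):
    """Score today's value from 0 to cap by how far it sits above the user's own baseline."""
    if len(history) < 2:
        return 0.0  # not enough history to define "normal"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase at all is treated as abnormal.
        return cap if today > mu else 0.0
    z = (today - mu) / sigma
    # Only unusually HIGH volume counts as risky; 5 sigma or more maps to the cap (assumed scale).
    return max(0.0, min(cap, z / 5.0 * cap))

# Hypothetical daily outbound megabytes for one user.
baseline_mb = [120, 135, 110, 128, 140, 125, 130]
typical_day = risk_score(baseline_mb, 132)   # close to the baseline, so a low score
bulk_upload = risk_score(baseline_mb, 900)   # far above the baseline, so the score hits the cap
```

A high score here only says the behavior is unusual for that user. Whether it is malicious still requires context from domain experts and other controls.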

Adopting AI/ML in your cybersecurity measures requires you to think differently and to plan and pace the project differently, but it doesn't replace common sense or conventional best practices. AI/ML is not a substitute for having a layered security defense, either. In fact, we've seen that AI/ML performs far better when combined with traditional cybersecurity technology.

Here are three tenets to execute an AI/ML project:

1. "Not all data can be treated equal." Enterprise data has custom privacy and access control requirements; the data often is spread around different departments and encoded with a long history of "tribal knowledge."

2. "Wars have been won or lost primarily because of logistics," as noted by General Eisenhower. In the context of the AI/ML battleground, the logistics are the data and model pipeline. Without an automated and flexible data and model pipeline, you may win one battle here and there but will likely lose the war.

3. "It takes a village" to raise a successful AI/ML project. Data scientists need to have tight alignment with domain experts, data engineers, and businesspeople.

In the past, there have been two main criticisms of AI/ML: 1) AI is a black box, so it's hard for security practitioners to explain the results, and 2) AI/ML has too many false positives (that is, false alarms). But by combining AI/ML and tried-and-true conventional cybersecurity technology, AI/ML is more explainable, and you get fewer false positives than with conventional technology alone.
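One way to picture that combination is a triage step in which a model's anomaly score is checked against conventional signals before an alert fires. The sketch below is illustrative only: the allowlist, the `on_blocklist` field, the 0.8 threshold, and the `triage` function are hypothetical names chosen for this example, and the anomaly score is assumed to come from some upstream model.

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    alert: bool
    reasons: list = field(default_factory=list)

# Hypothetical allowlist maintained by domain experts.
KNOWN_GOOD_DESTINATIONS = {"backup.example.com"}

def triage(event, anomaly_score, threshold=0.8):
    """Combine a model score with conventional rules and record why a decision was made."""
    reasons = []
    if anomaly_score >= threshold:
        reasons.append(f"anomaly score {anomaly_score:.2f} >= {threshold}")
    # A conventional allowlist suppresses a likely false positive.
    if event["destination"] in KNOWN_GOOD_DESTINATIONS:
        reasons.append("destination is on the allowlist")
        return Verdict(False, reasons)
    # A conventional blocklist alerts even when the model sees nothing unusual.
    if event.get("on_blocklist"):
        reasons.append("destination is on the blocklist")
        return Verdict(True, reasons)
    # With no rule hit, the model score decides on its own.
    return Verdict(anomaly_score >= threshold, reasons)

nightly_backup = triage({"destination": "backup.example.com"}, 0.95)
unknown_upload = triage({"destination": "unknown.example.net"}, 0.95)
```

In this sketch the nightly backup is abnormal in volume but is not alerted, and every verdict carries a list of plain-language reasons that an analyst can read.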

AI/ML has already proved it can help businesses in a number of ways, but it still lacks context, common sense, and human awareness. That's the next step toward perfecting the technology. In the meantime, cybersecurity defense still requires domain experts, but now these experts are helping shape the future with a new paradigm shift for AI/ML methodology.


Howie Xu is Vice President of AI and Machine Learning at Zscaler. He was the CEO and Co-Founder of TrustPath, which was acquired by Zscaler in 2018. Howie was formerly an EIR with Greylock Partners and the founder and head of the VMware networking unit.
Comments
tdsan,
User Rank: Ninja
9/25/2019 | 9:30:44 PM
Re: adversarial attacks


 

It will be interesting to see whether, once the market is flooded with companies, we see continued improvement or whether the learning process tapers off after the algorithms improve or because of tainted data. I am optimistic, but only time will tell whether companies like Sophos (Intercept X) and Carbon-Black (with BluVector) will be able to look at data streams and determine whether an actor is just making a mistake or whether an elaborate attack is taking place over time, with the actor using AI to find weaknesses.

Now that will be interesting.

T

 
howie.xu,
User Rank: Apprentice
9/25/2019 | 7:59:11 PM
adversarial attacks

Well aware of this paper. Adversarial attacks will come to the cybersecurity world more and more. For instance, an attacker may modify a malicious activity so that its badness is preserved but it fools an ML model into thinking it's legitimate/benign. Zscaler, being the leader in cloud security, is leading the R&D in this area, and we welcome top researchers and engineers out there to join our extremely interesting and rewarding journey! :)

 

-Howie Xu

 

tdsan,
User Rank: Ninja
9/25/2019 | 7:08:32 PM
Re: Key points that were left out
When you get a chance, check out this article, it elaborates on the discussions we had about AI/ML.

They cover the examples you and I brought up in the discussions. It seems it takes just a small adjustment and the data is tainted, so to me that is not real AI but ML. Once AI becomes self-aware, these problems will be a thing of the past, but there could be other things we need to address.

Todd

 
howie.xu,
User Rank: Apprentice
9/25/2019 | 7:00:37 PM
Re: Key points that were left out
Hi Todd, very true.  I didn't elaborate in this article but your point about data is very valid.  That's why my top best practice is about "not all data is created equal".  :)

Data has privacy issues, and then data quantity (volume/processing capacity) and data quality (for instance, what data can be used for what use case) issues too. The list goes on. :)

 

cheers,

 

-Howie
tdsan,
User Rank: Ninja
9/25/2019 | 6:49:04 PM
Re: Key points that were left out
Yes, there is no silver bullet. It is still a work in progress, but we have to continue to move forward because the future (or the outcomes, I should say) seems to be getting brighter and brighter.

Of course, in the security realm, layering solutions to make it harder for the assailant to penetrate your defenses is common sense (the onion or layered approach).

And yes, I do agree that it is going to take time for AI to make decisions that match our expected outcomes. But I am curious about the validity of the data: if that data is tainted in any way (biases), the results of AI could be skewed and affect the personal lives of people where it has been trained (like going into neighborhoods and opening fire on people of color, as a possibility). I would think we need to be able to filter data that falls way outside the normal parameters, though that is up for discussion. There will be one-offs.

T

 

 
howie.xu,
User Rank: Apprentice
9/25/2019 | 3:15:23 PM
Re: Key points that were left out
Hi Todd, I appreciate your detailed feedback, compliments, comments, and questions.

 

AI/ML can help identify "what's normal, what's abnormal" faster, but the truth is that "abnormal" does not equal "malicious," as you probably meant to express too.

 

There is no silver bullet yet. AI/ML helps solve a large-scale problem, but one machine learning model often is not enough. Oftentimes you need multiple models ensembled together, and sometimes you need heuristics to help too.

It is naive to think one machine learning model can detect anomalies and hence bad/malicious behavior, but it is reasonable to think one machine learning model can be one of the critical pieces of the puzzle.

 

Hope it helps,

 

-howie
tdsan,
User Rank: Ninja
9/12/2019 | 1:49:48 PM
Key points that were left out

1. Data: If AI/ML is a rocket, data is the fuel. AI/ML requires massive amounts of data to help it train models that can do classifications and predictions with high accuracy. Generally, the more data that goes through the AI/ML system, the better the outcome.

I like the fact that you prefaced the statement with "generally," and in section 3 you addressed it quite nicely.

3. Domain experts: They play an essential role in constructing an organization's dataset, identifying what is good and what is bad and providing insights into how this determination was made. This is often the aspect that gets overlooked when it comes to AI/ML.

I do like the fact that you mentioned "what's normal, what's abnormal." Of this statement, though, I am not so sure, because if we consider what falls outside the various thresholds, in the human world we have to take timing and one-offs into consideration. What if someone forgot to do something and ran a task in the middle of the day, and that task was to run a report and provide it to the management staff? That is not part of the norm from a business process standpoint, but it is within the norm of normal business operations. The AI/ML could identify this task as a threat.


However, I do like this statement you wrote, very perceptive:

2. "Wars have been won or lost primarily because of logistics," as noted by General Eisenhower. In the context of the AI/ML battleground, the logistics is the data and model pipeline. Without an automated and flexible data and model pipeline, you may win one battle here and there but will likely lose the war.


I would think it is the processes and planning that create the data (the logistics), and the pipeline is how the data is transferred, executed, and delivered to the right people at the right time. This is truly how wars are won.

"The more you sweat in peace, the less you bleed in war." - General Schwarzkopf


The details (data), planning (process), and execution (pipeline) are the key elements used to effectively address the issues that we see every day. We will only come close to winning this war on cyber-terror when we start looking at people as human beings and provide a roadmap to respect even the menial garbage worker, because no criminal (there are outliers) wants to remain in the same position in which they started.


Todd