Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Threat Intelligence

6/29/2018
01:20 PM

Natural Language Processing Fights Social Engineers

Instead of trying to detect social engineering attacks based on a subject line or URL, a new tool conducts semantic analysis of text to determine malicious intent.

Social engineering is a common problem with few solutions. Now, two researchers are trying to bring down attackers' success rate with a new tool designed to leverage natural language processing (NLP) to detect questions and commands and determine whether they are malicious.

Ian Harris, professor at the University of California, Irvine, and Marcel Carlsson, principal consultant at Lootcore, decided to combat social engineering attacks after many years of friendship and discussions about how effective, yet poorly researched, such attacks are.

"The reason why social engineering has always been an interest … it's sort of the weakest link in any infosec conflict," Carlsson says. "Humans are nice people. They'll usually help you. You can, of course, exploit that or manipulate them into giving you information."

Aside from the detection of email phishing, little progress has been made in stopping the rapid rise and success of social engineering attacks. And it's getting harder for defenders: Adversaries are getting better at learning about their targets, sending emails that seem legitimate, and integrating outside technologies to make their campaigns more powerful.

Many companies believe new technology is the answer, Carlsson says, and there's often a disproportionate focus on preventing attacks but not detecting and responding to them. Much of the research on social engineering detection has relied on analysis of metadata related to email as an attack vector, including header information and embedded links.

Carlsson and Harris decided to take a different approach and focus on the natural language text within messages. Instead of trying to detect social engineering attacks based on a subject line or URL, they built a tool to conduct semantic analysis of text to determine malicious intent.

Harris, whose research has also focused on hardware design and testing, was using NLP to design hardware components when he recognized its applicability to social engineering defense. "It occurred to me after a while that the best way to understand social engineering attacks was to understand the sentences," he explains.

By focusing on the text itself, this tactic can be used to detect social engineering attacks on non-email attack vectors, including texting applications and chat platforms. With a speech-to-text tool, it also can be used to scan for attacks conducted over the phone or in person.

How It Works
For a social engineering attack to succeed, the actor has to either ask a question whose answer is private or command a target to perform an illicit operation. The researchers' approach detects questions or commands in an email. It flags questions that request private data and commands that request performance of a secure operation.
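
To make the first step concrete, here is a minimal Python sketch of sorting a sentence into "question," "command," or "other." It is not the researchers' tool: the word lists and the `sentence_type` function are hypothetical, and the surface cues it relies on (a trailing question mark, a leading question word or imperative verb) are a crude stand-in for real syntactic parsing.

```python
import re

# Hypothetical word lists for illustration only; a real system would rely
# on a syntactic parser rather than fixed vocabularies.
QUESTION_WORDS = {"what", "who", "where", "when", "why", "how", "which",
                  "can", "could", "would", "will", "do", "does", "is", "are"}
COMMAND_VERBS = {"send", "give", "tell", "click", "open", "enter",
                 "provide", "confirm", "verify", "download", "transfer"}
POLITE_OPENERS = {"please", "kindly"}


def sentence_type(sentence: str) -> str:
    """Return 'question', 'command', or 'other' using crude surface cues."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not words:
        return "other"
    # Skip a leading politeness marker such as "please".
    if words[0] in POLITE_OPENERS and len(words) > 1:
        words = words[1:]
    # A trailing question mark or a leading question word marks a question.
    if sentence.strip().endswith("?") or words[0] in QUESTION_WORDS:
        return "question"
    # A sentence that opens with a bare verb is treated as a command.
    if words[0] in COMMAND_VERBS:
        return "command"
    return "other"
```

Under these assumptions, "What is your password?" would be treated as a question and "Please send money to this account." as a command, while a plain statement such as "The meeting is at noon." would be left alone.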

Their tool doesn't need to know the answer to the question in order to classify it as private, Harris explains. It evaluates statements by using the main verb and object of that verb to summarize their meaning. For example, the command "Send money" would be summed up in the verb-object pair "send, money."
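
The verb-object summary can be illustrated with a small Python sketch. It is an assumption-laden toy rather than the researchers' method: `KNOWN_VERBS`, `SKIP_WORDS`, and `verb_object_pair` are hypothetical names, and the word-scanning heuristic stands in for the grammatical parsing that a real tool would need to find the main verb and its object.

```python
import re
from typing import Optional, Tuple

# Hypothetical vocabularies for illustration; a real tool would use a
# parser to identify the main verb and its direct object.
KNOWN_VERBS = {"send", "give", "tell", "open", "click", "enter", "provide"}
SKIP_WORDS = {"me", "us", "him", "her", "them", "the", "a", "an", "your",
              "my", "our", "this", "that", "please", "to"}


def verb_object_pair(sentence: str) -> Optional[Tuple[str, str]]:
    """Return (verb, object) for the first known verb, or None if not found."""
    words = re.findall(r"[a-z']+", sentence.lower())
    for i, word in enumerate(words):
        if word in KNOWN_VERBS:
            # Treat the next word that is not a pronoun or determiner
            # as the object of the verb.
            for candidate in words[i + 1:]:
                if candidate not in SKIP_WORDS:
                    return (word, candidate)
            return None
    return None


print(verb_object_pair("Send money"))
print(verb_object_pair("Please give me your password."))
```

The heuristic breaks easily (an adjective before the noun would be mistaken for the object), which is one reason a parser-based approach is the more plausible design for a production tool.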

Verb-object pairs are compared with a blacklist of verb-object pairs known to describe forbidden actions. Harris and Carlsson scoured randomly selected phishing emails to identify private questions and commands, taking into consideration synonyms of each word so attacks were not incorrectly classified.
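
The blacklist comparison, including the handling of synonyms, might look something like the following Python sketch. The entries in `BLACKLIST` and `SYNONYMS` are invented for illustration and are not taken from the researchers' list; the idea shown is only that each word is mapped to a canonical form before the pair is looked up.

```python
# Hypothetical blacklist of forbidden verb-object pairs (illustrative only).
BLACKLIST = {
    ("send", "money"),
    ("give", "password"),
}

# Hypothetical synonym table mapping variants to one canonical word.
SYNONYMS = {
    "wire": "send",
    "transfer": "send",
    "provide": "give",
    "funds": "money",
    "cash": "money",
    "passcode": "password",
}


def canonical(word: str) -> str:
    """Lowercase a word and map it to its canonical synonym, if any."""
    lowered = word.lower()
    return SYNONYMS.get(lowered, lowered)


def is_forbidden(verb: str, obj: str) -> bool:
    """Return True if the normalized verb-object pair is on the blacklist."""
    return (canonical(verb), canonical(obj)) in BLACKLIST


print(is_forbidden("send", "money"))      # True
print(is_forbidden("Wire", "funds"))      # True
print(is_forbidden("send", "greetings"))  # False
```

Normalizing before the lookup keeps the blacklist short: one entry such as "send, money" also covers "wire, funds" without listing every combination.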

"Part of the difficulty of publishing this type of work is getting example attacks," says Harris, explaining why the pair chose to use phishing emails to inform the blacklist. They have tested their approach with more than 187,000 phishing and non-phishing emails.

Going forward, the team plans to bring their desktop tool to both email and chat clients to scan for social engineering attacks. They also hope to expand their technique to improve on detection for highly individualized attacks, Carlsson adds.

"Phishing emails are generally scattershot – you've gotten these, they're generic for everybody," he explains. "The really personalized and painful attacks are the ones where someone is talking on the phone and they know something about you, so they adjust according to the conversation."

The duo will present their approach to detecting social engineering attacks, and release the tool so attendees can test it, at Black Hat 2018 in a panel entitled "Catch me, Yes we can! Pwning Social Engineers Using Natural Language Processing Techniques in Real-Time."


Kelly Sheridan is the Staff Editor at Dark Reading, where she focuses on cybersecurity news and analysis. She is a business technology journalist who previously reported for InformationWeek, where she covered Microsoft, and Insurance & Technology, where she covered financial ...
