Natural Language Processing Fights Social Engineers

Threat Intelligence

Researchers used natural language processing to detect malicious content in more than 187,000 phishing and non-phishing emails.

Kelly Sheridan, Former Senior Editor, Dark Reading

June 29, 2018

4 Min Read

Social engineering is a common problem with few solutions. Now, two researchers are trying to bring down attackers' success rate with a new tool designed to leverage natural language processing (NLP) to detect questions and commands and determine whether they are malicious.

Ian Harris, professor at the University of California, Irvine, and Marcel Carlsson, principal consultant at Lootcore, decided to combat social engineering attacks after many years of friendship and discussions around how effective but poorly researched they were.

"The reason why social engineering has always been an interest … it's sort of the weakest link in any infosec conflict," Carlsson says. "Humans are nice people. They'll usually help you. You can, of course, exploit that or manipulate them into giving you information."

Aside from the detection of email phishing, little progress has been made in stopping the rapid rise and success of social engineering attacks. And it's getting harder for defenders: Adversaries are increasingly better at learning their targets, sending emails that seem legitimate, and integrating outside technologies to make their campaigns more powerful.

Many companies believe new technology is the answer, Carlsson says, and there's often a disproportionate focus on preventing attacks but not detecting and responding to them. Much of the research on social engineering detection has relied on analysis of metadata related to email as an attack vector, including header information and embedded links.

Carlsson and Harris decided to take a different approach and focus on the natural language text within messages. Instead of trying to detect social engineering attacks based on a subject line or URL, they built a tool to conduct semantic analysis of text to determine malicious intent.

Harris, whose research has also focused on hardware design and testing, was using NLP to design hardware components when he recognized its applicability to social engineering defense. "It occurred to me after a while that the best way to understand social engineering attacks was to understand the sentences," he explains.

By focusing on the text itself, this tactic can be used to detect social engineering attacks on non-email attack vectors, including texting applications and chat platforms. With a speech-to-text tool, it also can be used to scan for attacks conducted over the phone or in person.

How It Works
For a social engineering attack to succeed, the actor has to either ask a question whose answer is private or command a target to perform an illicit operation. The researchers' approach detects questions or commands in an email. It flags questions requesting private data and private commands requesting performance of a secure operation.

Their tool doesn't need to know the answer to the question in order to classify it as private, Harris explains. It evaluates statements by using the main verb and object of that verb to summarize their meaning. For example, the command "Send money" would be summed up in the verb-object pair "send, money."

Verb-object pairs are compared with a blacklist of verb-object pairs known to describe forbidden actions. Harris and Carlsson scoured randomly selected phishing emails to identify private questions and commands, taking into consideration synonyms of each word so attacks were not incorrectly classified.

"Part of the difficulty of publishing this type of work is getting example attacks," says Harris, explaining why the pair chose to use phishing emails to inform the blacklist. They have tested their approach with more than 187,000 phishing and non-phishing emails.

Going forward, the team plans to bring their desktop tool to both email and chat clients to scan for social engineering attacks. They also hope to expand their technique to improve on detection for highly individualized attacks, Carlsson adds.

"Phishing emails are generally scattershot – you've gotten these, they're generic for everybody," he explains. "The really personalized and painful attacks are the ones where someone is talking on the phone and they now something about you, so they adjust according to the conversation."

The duo will present their approach to detecting social engineering attacks, and release the tool so attendees can test it, at Black Hat 2018 in a panel entitled "Catch me, Yes we can! Pwning Social Engineers Using Natural Language Processing Techniques in Real-Time."

About the Author(s)

Kelly Sheridan

Former Senior Editor, Dark Reading

Kelly Sheridan was formerly a Staff Editor at Dark Reading, where she focused on cybersecurity news and analysis. She is a business technology journalist who previously reported for InformationWeek, where she covered Microsoft, and Insurance & Technology, where she covered financial services. Sheridan earned her BA in English at Villanova University. You can follow her on Twitter @kellymsheridan.

See more from Kelly Sheridan

Related Topics

Related Topics

Related Topics

Related Topics

About the Author(s)

Editor's Choice