Dark Reading is part of the Informa Tech Division of Informa PLC



04:29 PM

Researchers Create New Approach to Detect Brand Impersonation

A team of Microsoft researchers developed and trained a Siamese Neural Network to detect brand impersonation attacks.

Security researchers have designed a new way to detect brand impersonation using Siamese Neural Networks, which can learn and make predictions based on smaller amounts of data.

These attacks, in which adversaries craft content to mimic known brands and trick victims into sharing information, have grown harder to detect as technology and techniques improve, says Justin Grana, applied researcher at Microsoft. While business-related applications are most often spoofed in these types of attacks, criminals can forge brand logos for any organization.

"Brand impersonation has increased in its fidelity, in the sense that, at least from a visual [perspective], something that is malicious brand impersonation can look identical to the actual, legitimate content," Grana explains. "There's no more copy-and-paste, or jagged logos." In today's attacks, visual components of brand impersonation almost exactly mimic true content.

This presents a clear security hurdle, he continues, because people and technology can no longer look for artifacts that previously distinguished fake content from the real thing. "Those visual cues are not there anymore," says Grana of a key challenge the research team faced.

Most people are familiar with the concept of image recognition. What makes detecting brand impersonation different is twofold: For one, a victim may receive different types of content that aim to imitate the same brand. An impersonation attack spoofing Microsoft, for example, might send one malicious email that mimics Excel, and another designed to look like Word.

"Those are two very different pieces of content, even though they both represent Microsoft," Grana says.

While too many types of content can present a detection challenge, too few can do the same. Many brands, such as regional banks and other small organizations, aren't often seen in brand impersonation, so there might only be a handful of training examples for a system to learn from.

"The standard deep learning that requires tons and tons of examples per class – class is the brand in this case – really wouldn't work in our situation," he notes.

To address the issue of detecting brand impersonation attacks, Grana teamed up with software engineer Yuchao Dai, software architect Nitin Kumar Goel, and senior applied researcher Jugal Parikh. Together, they developed and trained a Siamese Neural Network on labeled images to detect these types of attacks. Unlike standard deep learning, which is trained on many examples, Siamese Neural Networks are designed to generate better predictions using a smaller number of samples.

[The researchers will discuss their approach, further applications, and planned improvements in their upcoming Black Hat briefing, "Siamese Neural Networks for Detecting Brand Impersonation" on Wednesday, Aug. 4]


The team's dataset consists of more than 50,000 screenshots of malicious login pages spanning more than 1,000 brand impersonations. Each image is a collection of numbers, Grana says, and the team translated those numbers into what he describes as a "point" in an N-dimensional coordinate space. Instead of remaining an image, whose pixels are arranged in three dimensions, each screenshot becomes a list of numbers. The team sought a way to make those numbers meaningful and, in doing so, distinguish fake from real brand images.
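To make the idea of an image as a "point" concrete, here is a minimal sketch in Python. It only shows the raw starting representation, a flattened list of pixel values; the function name and the tiny 2x2 image are hypothetical, and the article does not describe how the team actually preprocessed its screenshots.

```python
def image_to_point(image):
    """Flatten a height x width x channels nested list into one flat vector."""
    return [float(value) for row in image for pixel in row for value in pixel]


# A tiny, made-up 2x2 RGB "screenshot" (illustrative only).
tiny_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

point = image_to_point(tiny_image)
print(len(point))  # number of coordinates: height * width * channels
```

A raw vector like this carries no notion of brand. The network's job, as Grana describes next, is to map it to new numbers that do.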

"Our algorithm that we used, we rewarded it for … translating content of the same brand to similar numbers, and contents of different brands to different numbers, so that way, when we look at these new numbers that are now meaningful because we trained our network to do so, any numbers that were close together were likely from the same brand," he explains.
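One common way to express this kind of reward is a contrastive loss, sketched below. This is an assumption for illustration: the article does not say which loss function the team used, and the function names and the `margin` parameter are hypothetical. The loss is small when two embeddings of the same brand are close together, and small when two embeddings of different brands are at least `margin` apart.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_brand, margin=1.0):
    """Penalize distant same-brand pairs and close different-brand pairs."""
    distance = euclidean_distance(emb_a, emb_b)
    if same_brand:
        # Same brand: any distance is penalized, so embeddings are pulled together.
        return distance ** 2
    # Different brands: penalized only while closer than the margin.
    return max(0.0, margin - distance) ** 2


close_pair = ([0.1, 0.2], [0.1, 0.3])
far_pair = ([0.0, 0.0], [3.0, 4.0])

print(contrastive_loss(*close_pair, same_brand=True))   # small: correct behavior
print(contrastive_loss(*close_pair, same_brand=False))  # larger: brands too close
print(contrastive_loss(*far_pair, same_brand=False))    # zero: already far apart
```

Training would adjust the network's weights to reduce this value across many pairs; that optimization step is omitted here.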

Their Siamese Neural Network learns to embed images of the same brand relatively close together in a low-dimensional space, while images of different brands are embedded farther apart. The researchers then perform a "nearest neighbor classification" in the embedded space.
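A nearest neighbor lookup in an embedded space can be sketched in a few lines of Python. The two-dimensional embeddings and the brand labels below are made up for illustration; in practice the embeddings would come from the trained network, and the article does not say how many dimensions or neighbors the team used.

```python
import math


def nearest_brand(query, reference_embeddings):
    """Return the label and distance of the closest labeled embedding."""
    best_brand, best_distance = None, math.inf
    for brand, embedding in reference_embeddings:
        distance = math.dist(query, embedding)  # Euclidean distance
        if distance < best_distance:
            best_brand, best_distance = brand, distance
    return best_brand, best_distance


# Hypothetical labeled embeddings of known brand screenshots.
reference_embeddings = [
    ("brand_a", [0.0, 0.0]),
    ("brand_a", [0.1, 0.1]),
    ("brand_b", [5.0, 5.0]),
]

brand, distance = nearest_brand([4.0, 4.0], reference_embeddings)
print(brand, round(distance, 3))
```

A practical system would likely also apply a distance threshold, so that content far from every known brand is not forced into the nearest label; that threshold is left out of this sketch.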

Training Models, Learning Lessons
Grana says the team faced quite a few challenges and learned some lessons along the way.

"Dealing with skewed data is a large issue," he notes. "When you have a dataset that only has a couple observations per brand or per class, it really does require special techniques. We did some testing with the normal neural network, and it just wasn't sufficient for our purposes."
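One reason pair-based training can cope with only a few observations per brand is that even a handful of images yields many training pairs. The sketch below illustrates that general property; it is not a description of the team's "special techniques," which the article does not detail, and the brand names and image IDs are hypothetical.

```python
from itertools import combinations


def build_pairs(images_by_brand):
    """Split all image pairs into same-brand and different-brand lists."""
    labeled = [
        (brand, image)
        for brand, images in images_by_brand.items()
        for image in images
    ]
    positives, negatives = [], []
    for (brand_a, image_a), (brand_b, image_b) in combinations(labeled, 2):
        if brand_a == brand_b:
            positives.append((image_a, image_b))
        else:
            negatives.append((image_a, image_b))
    return positives, negatives


# A skewed toy dataset: one brand has only two examples.
images_by_brand = {
    "big_brand": ["a1", "a2", "a3"],
    "small_bank": ["b1", "b2"],
}

positives, negatives = build_pairs(images_by_brand)
print(len(positives), len(negatives))
```

Even the two-image brand contributes one same-brand pair and six different-brand pairs here, so it still takes part in training.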

Determining the specific techniques that will work requires a lot of trial and error, Grana says of the research process. Which method will best suit the data you have? "There's the science behind machine learning, but there is also the art of it, to say, 'which optimization algorithm should we try; which network architecture should we try,'" he explains.

The researchers' work is still ongoing, he adds. Their next goal is to examine how this approach might work with a smart and adaptive adversary, as a means of improving the technology and response to attackers' evolving techniques. The screenshots they used in this research won't be the same ones used in future attacks, and security tech needs to keep pace.

Kelly Sheridan is the Staff Editor at Dark Reading, where she focuses on cybersecurity news and analysis. She is a business technology journalist who previously reported for InformationWeek, where she covered Microsoft, and Insurance & Technology, where she covered financial ...
