Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Threat Intelligence

8/6/2018
10:20 AM

Spot the Bot: Researchers Open-Source Tools to Hunt Twitter Bots

Their goal? To create a means of differentiating legitimate from automated accounts and detail the process so other researchers can replicate it.

What makes Twitter bots tick? Two researchers from Duo Security wanted to find out, so they designed bot-chasing tools and techniques to separate automated accounts from real ones.

Automated Twitter profiles have made headlines for spreading malware and influencing online opinion. Earlier research has dug into the process of creating Twitter datasets and finding potential bots, but none has discussed how researchers can find automated accounts on their own.

Duo's Olabode Anise, data scientist, and Jordan Wright, principal R&D engineer, began their project to learn about how they could pinpoint characteristics of Twitter bots regardless of whether they were harmful. Hackers of all intentions can build bots and use them on Twitter.

The goal was to create a means of differentiating legitimate from automated accounts and detail the process so other researchers can replicate it. They'll present their tactics and findings this week at Black Hat in a session entitled "Don't @ Me: Hunting Twitter Bots at Scale."

Anise and Wright began by compiling and analyzing 88 million Twitter accounts and their usernames, tweet count, followers/following counts, avatar, and description, all of which would serve as a massive dataset in which they could hunt for bots. The data dates from May to July 2018 and was pulled via the Twitter API used to access public data, Wright explains.

"We wanted to make sure we were playing by the rules," Wright notes, since doing otherwise would compromise other researchers' ability to build on their work using the same method. "We're not trying to go around the API and go around limits and tools in place to get more data."

Once they obtained a dataset, the researchers created a "classifier," which detected bots in their massive pool of information by hunting for traits specific to bot accounts. But first they had to determine the details and behaviors that set bots apart.

What Makes Bots Bots?
Indeed, one of the researchers' goals was to learn the key traits of bot accounts, how they are controlled, and how they connect. "The thing about bot accounts is they can come up with identifying characteristics," Anise explains. Traits may change depending on the operator.

Bot accounts are hyperactive: Their likes and retweets are constant throughout the day and into the night. They reply to tweets quickly, Wright says. If a tweet has more than 30 replies within a few seconds, they can deduce bot activity is to blame. An account's number of followers and following can also indicate bot activity depending on when the account was created. If a profile is fairly new and has tens of thousands of followers, it's another suspicious sign.
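The two signals described above, a burst of replies and a follower count that is out of proportion to account age, can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names, the five-second window standing in for "a few seconds," and the input shapes are all assumptions.

```python
from datetime import datetime, timedelta


def has_reply_burst(reply_times, min_replies=30, window=timedelta(seconds=5)):
    """Return True if more than `min_replies` replies fall inside any `window`.

    The 5-second window is an assumed stand-in for "a few seconds".
    """
    times = sorted(reply_times)
    start = 0
    for end in range(len(times)):
        # Advance the left edge until the span fits inside the window.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 > min_replies:
            return True
    return False


def followers_per_day(followers, created_at, now):
    """Average followers gained per day of account age (hypothetical metric)."""
    age_days = max((now - created_at).days, 1)  # avoid division by zero
    return followers / age_days


if __name__ == "__main__":
    base = datetime(2018, 6, 1, 12, 0, 0)
    replies = [base + timedelta(milliseconds=100 * i) for i in range(31)]
    print(has_reply_burst(replies))
    print(followers_per_day(50_000, base - timedelta(days=10), base))
```

A high followers-per-day value on a young account is only a hint; what counts as suspicious would have to be tuned against real data.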

In their research, Anise and Wright came up with 20 of these defining traits, which also included the number of unique accounts being retweeted, number of tweets with the same content per day, number of daily tweets relative to account age, percentage of retweets with URLs, ratio of tweets with photos vs. text only, number of hashtags per tweet, and distance between geolocated tweets.
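Three of the traits listed above lend themselves to a short worked example: daily tweets relative to account age, hashtags per tweet, and distance between geolocated tweets. The Python below is a sketch under stated assumptions. The function names and input formats are hypothetical, hashtags are counted with a simple whitespace split, and distance uses the standard haversine formula, which the article does not specify.

```python
import math


def tweets_per_day(tweet_count, account_age_days):
    """Daily tweet rate relative to account age."""
    return tweet_count / max(account_age_days, 1)


def avg_hashtags_per_tweet(tweets):
    """Mean number of '#'-prefixed tokens per tweet text (naive tokenizer)."""
    if not tweets:
        return 0.0
    total = sum(
        1
        for text in tweets
        for word in text.split()
        if word.startswith("#") and len(word) > 1
    )
    return total / len(tweets)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(a))


if __name__ == "__main__":
    print(tweets_per_day(1000, 10))
    print(avg_hashtags_per_tweet(["#free #crypto now", "hello world"]))
    print(round(haversine_km(51.5, -0.12, 40.7, -74.0)))
```

Large distances between consecutive geolocated tweets posted minutes apart would be one way to use the last feature, though how the researchers applied it is not described here.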

Hunting Bots on the Web
The researchers' classifier tool dug through the data and leveraged these filters to detect automated accounts. Once they found initial sets of bots, they took further steps to determine whether the bots were isolated or part of a larger botnet controlled by a single operator.
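One way to picture the step from individual bots to botnets is to group flagged accounts that are linked to one another. The sketch below does this with connected components over a hypothetical list of links (for example, follow relationships between accounts). The article does not say which relationships the researchers used or how they grouped accounts, so both `group_bots` and the link data are assumptions for illustration.

```python
from collections import defaultdict, deque


def group_bots(bot_ids, links):
    """Group flagged accounts into connected components.

    `links` is an iterable of (account_a, account_b) pairs; links that touch
    an account outside `bot_ids` are ignored.
    """
    bots = set(bot_ids)
    graph = defaultdict(set)
    for a, b in links:
        if a in bots and b in bots:
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    groups = []
    for bot in sorted(bots):
        if bot in seen:
            continue
        # Breadth-first search collects every bot reachable from this one.
        queue = deque([bot])
        seen.add(bot)
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(graph[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(component))
    return groups


if __name__ == "__main__":
    flagged = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("b", "c"), ("x", "d")]
    print(group_bots(flagged, edges))
```

A component with a single member corresponds to an isolated bot; larger components are candidates for a shared operator and would still need manual review.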

"We could still use very straightforward characteristics to accurately find new bots," Wright says. "Bots at a larger scale, in general, are using many of the same techniques they have in the past few years." Some bots evolve more quickly than others depending on the operator's goals.

Their tool may have been accurate for this dataset, but Anise says many bot accounts are subtly disguised. Oftentimes accounts appeared to be normal but displayed botlike attributes.

In May, for example, the pair found a cryptocurrency botnet made up of automated accounts, which spoofed legitimate Twitter accounts to spread a giveaway scam. Spoofed accounts had randomly generated usernames and copied legitimate users' photos. They spread spam by replying to real tweets posted by real users, inviting them to join a cryptocurrency giveaway.

The botnet, like many of its kind, used several methods to evade detection. Oftentimes, malicious bots spoof celebrities and high-profile accounts as well as cryptocurrency accounts, edit profile photos to avoid image detection, and use screen names that are typos of real ones. This one went on to impersonate Elon Musk and news organizations such as CNN and Wired.
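Screen names that are typos of real ones can be caught with a plain edit-distance check. The following Python is a minimal sketch, assuming a hypothetical list of known account names and an arbitrary threshold of two edits; it is not the detection method used by Twitter or by the researchers.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def looks_like_spoof(screen_name, known_names, max_distance=2):
    """Return the known name that `screen_name` nearly matches, else None.

    An exact match (distance 0) is treated as the real account, not a spoof.
    """
    name = screen_name.lower()
    for known in known_names:
        distance = edit_distance(name, known.lower())
        if 0 < distance <= max_distance:
            return known
    return None


if __name__ == "__main__":
    known = ["elonmusk", "CNN", "WIRED"]  # illustrative list only
    print(looks_like_spoof("elonmuskk", known))
    print(looks_like_spoof("elonmusk", known))
```

Short names produce many false matches at a fixed threshold, so a real check would likely scale the allowed distance with name length.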

Joining the Bot Hunters
Anise and Wright are open-sourcing the tools and techniques they used to conduct their research in an effort to help other researchers build on their work and create new methodologies to identify malicious Twitter bots.

"It's a really complex problem," Anise adds. They want to map out their strategy and show how other people can use their work to continue mapping bots and botnet structures.


Kelly Sheridan is the Staff Editor at Dark Reading, where she focuses on cybersecurity news and analysis. She is a business technology journalist who previously reported for InformationWeek, where she covered Microsoft, and Insurance & Technology, where she covered financial ...
 
