Dark Reading is part of the Informa Tech Division of Informa PLC

Threat Intelligence

4/21/2020
05:35 PM

Automated Bots Are Increasingly Scraping Data & Attempting Logins

The share of bot traffic to online sites declines, but businesses are seeing an overall increase in automated scraping of data, login attempts, and other detrimental activity.

The share of Internet traffic generated by automated software, or bots, has declined to its lowest point in at least six years, but the share due to unwanted automated activity, or "bad" bots, has risen to its highest level over the same period, according to cybersecurity firm Imperva in a report published on April 21.

In 2019, bad bots accounted for 24% of all Internet traffic seen by Imperva's customers, 5.5 points higher than the low point in 2015, the company stated in its "Bad Bot Report 2020." Bad bots are automated software programs that perform unwanted activities, such as scraping price data or availability information from websites, or conduct outright malicious activities, such as account-takeover attempts or credit card fraud.

Acceptable bot activity has fallen by nearly two-thirds to 13% of all traffic in 2019, down from 36% in 2014, the report states. The move to a data-driven economy has created an incentive for more bots while at the same time making their activities less acceptable, says Kunal Anand, chief technology officer for Imperva. 

"The digital transformation and the movement of information to the Web is a major driver that makes running bots more lucrative," he says. "There is also increased awareness, and companies are controlling what bots they allow through whitelisting or allow what are seen as good bots."
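The percentages quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures cited from the report. The human share is derived on the assumption that all traffic is either bot or human, which the article does not state explicitly.

```python
# Figures quoted from the report, as percentages of all traffic.
good_2014, good_2019 = 36.0, 13.0
bad_2019 = 24.0
bad_rise_points = 5.5  # rise in bad-bot share since the 2015 low

# Relative decline in good-bot share ("nearly two-thirds").
relative_drop = (good_2014 - good_2019) / good_2014

# Combined bot share in 2019, and the remainder, which is assumed to be human.
total_bot_2019 = good_2019 + bad_2019
human_2019 = 100.0 - total_bot_2019

# Implied bad-bot share at the 2015 low.
bad_2015 = bad_2019 - bad_rise_points

print(f"Good-bot share fell by {relative_drop:.1%}")    # 63.9%
print(f"All bots in 2019: {total_bot_2019:.1f}%")       # 37.0%
print(f"Implied human share: {human_2019:.1f}%")        # 63.0%
print(f"Implied bad-bot share in 2015: {bad_2015:.1f}%")  # 18.5%
```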

Bots are a natural evolution of connecting computers and software to the Internet, but they are problematic for companies that have to expose their intellectual property online as part of their business. Airlines, for example, need to give flight information and pricing to customers, but a bot-using competitor can scrape that same data and gain valuable intelligence.

Businesses that see their online metrics deteriorating, for example through poor conversion rates, content appearing on other sites, or increased failed logins, have likely been targeted by bots, according to Imperva's report.
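One of those signals, a rise in failed logins, is straightforward to watch for in authentication logs. The following is a minimal sketch, not a method described in the report: the function name `failed_login_spikes`, the hourly bucket size, and the threshold of three times the median are all illustrative assumptions.

```python
import statistics
from collections import Counter

def failed_login_spikes(events, bucket_seconds=3600, factor=3.0):
    """Return start times of buckets with unusually many failed logins.

    events: iterable of (unix_timestamp, succeeded) pairs.
    A bucket is flagged when its failure count exceeds `factor` times the
    median failure count. Only buckets with at least one failure are
    considered (a simplifying assumption).
    """
    failures = Counter()
    for timestamp, succeeded in events:
        if not succeeded:
            bucket_start = int(timestamp) // bucket_seconds * bucket_seconds
            failures[bucket_start] += 1
    if not failures:
        return []
    baseline = statistics.median(failures.values())
    return sorted(start for start, n in failures.items() if n > factor * baseline)

# Hypothetical log: three quiet hours followed by a burst of 20 failures.
sample = (
    [(0 + i, False) for i in range(2)]
    + [(3600 + i, False) for i in range(3)]
    + [(7200 + i, False) for i in range(2)]
    + [(10800 + i, False) for i in range(20)]
    + [(10800 + 100, True)]
)
spikes = failed_login_spikes(sample)
print(spikes)  # [10800]
```

A spike flagged this way is only a prompt for investigation. A password-reset campaign or an outage in a login dependency could produce the same pattern.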

"The two biggest problems from bad bots are credential stuffing to attack account logins and scraping of data, [such as] pricing and/or content," Anand says. "Almost every website suffers from both of these."

About a quarter of bots are considered simple, with traffic that comes from a single IP address and does not use a browser user-agent header to pretend that its traffic is legitimate. More-complex bots use browser automation software, such as Selenium, to masquerade as a legitimate visitor. Selenium is an open source project that is commonly used to automate browsers for website testing. The most sophisticated bots move the mouse to mimic human behavior, Imperva states in the report.
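The simplest tier described above can be approximated with a basic log filter. The sketch below is an illustration only and does not reproduce Imperva's classification: the token list `BROWSER_TOKENS`, the function names, and the `min_requests` threshold are hypothetical.

```python
from collections import defaultdict

# Hypothetical markers of a browser-like User-Agent string (illustrative only).
BROWSER_TOKENS = ("Mozilla/", "Chrome/", "Safari/", "Firefox/")

def looks_like_browser(user_agent):
    """Return True if the User-Agent string contains a common browser token."""
    return any(token in user_agent for token in BROWSER_TOKENS)

def find_simple_bots(requests, min_requests=3):
    """Return the IPs that send repeated requests without a browser-like User-Agent.

    requests: iterable of (ip, user_agent) pairs; user_agent may be None or empty.
    """
    counts = defaultdict(int)
    claimed_browser = set()
    for ip, user_agent in requests:
        counts[ip] += 1
        if looks_like_browser(user_agent or ""):
            claimed_browser.add(ip)
    return {
        ip for ip, n in counts.items()
        if n >= min_requests and ip not in claimed_browser
    }

# Hypothetical access log using documentation-reserved IP addresses.
log = (
    [("203.0.113.5", "python-requests/2.23.0")] * 3
    + [("198.51.100.7", "Mozilla/5.0 (X11; Linux x86_64) Chrome/80.0 Safari/537.36")] * 3
    + [("192.0.2.1", "")]
)
simple_bots = find_simple_bots(log)
print(simple_bots)  # {'203.0.113.5'}
```

A filter like this catches only bots that make no effort to disguise themselves. Bots that send a Chrome user-agent string, or that drive a real browser through an automation tool, pass it unchanged, which is why the more sophisticated tiers call for behavioral signals.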

More than 55% of bots impersonated Google's Chrome browser, the highest percentage yet, the company found.

Different industries see different levels of bad bot activity. The financial industry encountered very little "good" bot traffic, with a little less than 48% of traffic due to bad bots and a little more than 51% of traffic from humans. Similarly, the education and IT services sectors are seeing around 45% of their traffic accounted for by bad bots.

Online data firms and business service firms encountered the largest share of traffic from good bots, which accounted for 51% and 54% of their traffic, respectively.

Nearly 46% of unwanted bot traffic came from the United States in 2019, and in many cases, the bot activity is likely legal. In September 2019, a US appeals court upheld the ability of HiQ Labs, a provider of intelligence on employees, to scrape LinkedIn and other services to compile profiles of professionals.

"Our definition of good bot is typically a tool that the business is willing to allow to be on its site — search engines and SEO tools fall into this list," Anand says. "Companies typically also whitelist other tools that they use themselves, like a vulnerability scanner that they control when it is being deployed. Bad bots are classified as those requests that don't come from a recognized browser and are there for another reason that wasn't authorized by the company."

Veteran technology journalist of more than 20 years. Former research engineer. Written for more than two dozen publications, including CNET News.com, Dark Reading, MIT's Technology Review, Popular Science, and Wired News. Five awards for journalism, including Best Deadline ...
 
