Vulnerabilities / Threats
11/22/2019 10:00 AM
Avidan Avraham
Commentary
The 5-Step Methodology for Spotting Malicious Bot Activity on Your Network

Bot detection over IP networks isn't easy, but it's becoming a fundamental part of network security practice.

With the rise of security breaches using malware, ransomware, and other remote-access hacking tools, identifying malicious bots operating on your network has become an essential component of protecting your organization. Bots are often the delivery channel for malware, which makes identifying and removing them critical.

But that's easier said than done. Every operating environment has its share of "good" bots, such as software updaters, that are important for good operation. Distinguishing between malicious bots and good bots is challenging. No one variable provides for easy bot classification. Open source feeds and community rules purporting to identify bots are of little help; they contain far too many false positives. In the end, security analysts wind up fighting alert fatigue from analyzing and chasing down all of the irrelevant security alerts triggered by good bots.

At Cato, we faced a similar problem in protecting our customers' networks. To solve the problem, we developed a new, multidimensional approach that identifies 72% more malicious incidents than would have been possible using open source feeds or community rules alone. Best of all, you can implement a similar strategy on your network.

Your tools are the stock-in-trade of any network engineer: access to your network, a way to capture traffic (such as a network tap or sensor), and enough disk space to store a week's worth of packets. The idea is to gradually narrow the field from sessions generated by people to those sessions likely to indicate a risk to your network. You'll need to:

  • Separate bots from people
  • Distinguish between browsers and other clients
  • Distinguish between bots within browsers
  • Analyze the payload
  • Determine a target's risk

Let's dive into each of those steps.

Separate Bots from People
The first step is to distinguish between bots (good and bad) and humans. We do this by identifying those machines repeatedly communicating with a target. Statistically, the more uniform these communications, the greater the chance that they are generated by a bot.
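The uniformity test described above can be sketched as a coefficient-of-variation check on the gaps between sessions. This is a minimal illustration, not the vendor's actual method; the timestamps and thresholds below are hypothetical:

```python
from statistics import mean, stdev

def beaconing_score(timestamps, min_sessions=10):
    """Return the coefficient of variation of inter-arrival times.

    Values near 0 mean highly uniform, machine-like timing; human
    browsing produces irregular gaps and a much higher score.
    """
    if len(timestamps) < min_sessions:
        return None  # too few sessions to judge
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    mu = mean(gaps)
    if mu == 0:
        return 0.0
    return stdev(gaps) / mu

# A bot polling every 60s (with tiny jitter) vs. a person clicking around:
bot = [i * 60 + j for i, j in zip(range(20), [0.1, -0.2, 0.0, 0.3] * 5)]
human = [0, 4, 9, 70, 71, 75, 300, 310, 900, 905, 1800, 1803,
         2500, 2504, 3000, 3100, 3200, 3900, 4000, 4500]
print(beaconing_score(bot) < 0.1)    # True: near-perfect periodicity
print(beaconing_score(human) > 0.5)  # True: bursty, human-like timing
```

In practice you would compute this per (source, destination) pair over your week of captured packets and rank pairs by how close their score is to zero.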

Distinguish Between Browsers and Other Clients
Having isolated the bots, you then need to look at the initiating client. Typically, "good" bots run inside browsers, while "bad" bots operate outside of the browser.

Operating systems have different types of clients and libraries generating traffic. For example, "Chrome," "WinInet," and "Java Runtime Environment" are all different client types. At first, client traffic may look the same, but there are some ways to distinguish between clients and enrich our context.

Start by looking at application-layer headers. Because most firewall configurations allow HTTP and TLS to any address, many bots use these protocols to communicate with their targets. You can identify bots operating outside of browsers by grouping sessions according to the HTTP and TLS features each client configures.

Every HTTP session has a set of request headers defining the request and how the server should handle it. These headers, their order, and their values are set when composing the HTTP request. Similarly, TLS session attributes, such as cipher suites, extensions list, ALPN (Application-Layer Protocol Negotiation), and elliptic curves, are established in the initial TLS packet, the "client hello" packet, which is unencrypted. Clustering the different sequences of HTTP and TLS attributes will likely indicate different bots. 

Doing so, for example, will allow you to spot TLS traffic with different cipher suites. It's a good indicator that the traffic is being generated outside of the browser — a very non-humanlike approach and hence a good indicator of bot traffic.
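Hashing the ordered client-hello attributes into a stable fingerprint is the idea behind the well-known JA3 technique. The sketch below assumes the TLS fields have already been parsed out of the captured traffic; the attribute values shown are hypothetical:

```python
import hashlib
from collections import Counter

def client_fingerprint(tls_version, ciphers, extensions, curves, point_formats):
    """Hash the ordered client-hello attributes into a stable fingerprint.

    Same idea as JA3: identical client software produces identical
    fingerprints, so clustering fingerprints clusters clients.
    """
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Two hypothetical client hellos: a browser-like client and something else.
chrome_like = client_fingerprint(771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0])
odd_client  = client_fingerprint(771, [4866], [0], [29], [0])

# Counting fingerprints per host surfaces clients that never vary:
seen = Counter([chrome_like, chrome_like, odd_client])
print(chrome_like != odd_client)  # True: different cipher lists, different clusters
```

A fingerprint that appears on one host only, never matches a known browser, and beacons on a fixed interval is exactly the kind of outside-the-browser client this step is meant to surface.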

Distinguish Between Bots within Browsers
Another method for identifying malicious bots is to look at specific information contained in HTTP headers. Internet browsers usually present a clear, standard set of headers. In a normal browsing session, clicking a link in a browser generates a "Referer" header that is included in the next request for that URL. Bot traffic usually lacks a "Referer" header or, worse, carries a forged one. A bot whose requests look identical in every traffic flow is a likely sign of maliciousness.

User-agent is the best-known string representing the program initiating a request. Various sources, such as fingerbank.org, match user-agent values with known program versions. Using this information can help identify abnormal bots. For example, most recent browsers use the "Mozilla 5.0" string in the user-agent field. Seeing a lower version of Mozilla or its complete absence indicates an abnormal bot user-agent string. No trustworthy browser will create traffic without a user-agent value.
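A sanity check along those lines can be sketched in a few lines. This is an illustrative heuristic only, not a complete classifier; real traffic calls for richer sources such as the fingerbank.org data mentioned above:

```python
import re

def suspicious_user_agent(ua):
    """Flag user-agent strings that no mainstream browser would send:
    missing entirely, lacking a Mozilla token, or claiming a Mozilla
    version below 5.0."""
    if not ua or not ua.strip():
        return True  # no trustworthy browser omits the header
    m = re.match(r"Mozilla/(\d+)\.", ua)
    if m is None:
        return True  # no Mozilla token at all: not a mainstream browser
    if int(m.group(1)) < 5:
        return True  # e.g. "Mozilla/4.0" is a legacy/tool signature
    return False

print(suspicious_user_agent(""))                                   # True
print(suspicious_user_agent("Mozilla/4.0 (compatible; MSIE 6.0)")) # True
print(suspicious_user_agent("curl/7.68.0"))                        # True
print(suspicious_user_agent("Mozilla/5.0 (Windows NT 10.0)"))      # False
```

Note that a "clean" user-agent proves nothing on its own, since bots routinely copy browser strings; this check only prunes the obviously abnormal cases.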

Analyze the Payload
That said, don't limit the search for "bad" bots to the HTTP and TLS protocols. We have also observed known malware samples using unknown, proprietary protocols over well-known ports; such traffic can be flagged using application identification.

In addition, traffic direction (inbound or outbound) carries significant weight here. Devices connected directly to the Internet are constantly exposed to scanning, so scanning traffic arriving from outside should be treated as inbound scanners. Outbound scanning activity, by contrast, indicates a device infected with a scanning bot. That is harmful to the targets being scanned and puts the organization's IP address reputation at risk.
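Outbound scan detection can be approximated by counting how many distinct hosts and ports each internal source touches in a capture window. This is a rough sketch; the thresholds, addresses, and flow records are all made up for illustration:

```python
from collections import defaultdict

def find_scanners(flows, pair_threshold=100, host_threshold=50):
    """Given outbound flow records (src, dst_ip, dst_port), flag internal
    sources touching an unusually wide spread of destinations, which is
    the signature of an infected machine scanning outward."""
    pairs = defaultdict(set)
    hosts = defaultdict(set)
    for src, dst_ip, dst_port in flows:
        pairs[src].add((dst_ip, dst_port))
        hosts[src].add(dst_ip)
    return {
        src for src in pairs
        if len(pairs[src]) >= pair_threshold or len(hosts[src]) >= host_threshold
    }

# A host sweeping port 445 across a /24 stands out immediately:
flows = [("10.0.0.5", f"192.168.1.{i}", 445) for i in range(1, 255)]
flows += [("10.0.0.9", "93.184.216.34", 443)] * 20  # normal web traffic
print(find_scanners(flows))  # {'10.0.0.5'}
```

The same counting logic run on inbound flows separates Internet background scanning from genuinely compromised internal machines.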

Determine a Target's Risk
Until now, we've looked for bot indicators in the frequency of client-server communications and in the type of clients. Now, let's pull in another dimension — the destination or target. To determine malicious targets, consider two factors: target reputation and target popularity.

Target reputation estimates the likelihood of a domain being malicious based on experience gathered from many flows. Reputation is determined either by third-party services or through self-calculation, by noting whenever users report a target as malicious.

All too often, though, simple sources for determining a target's reputation, such as URL reputation feeds, are insufficient on their own. Every month, millions of new domains are registered. With so many new domains, reputation mechanisms lack sufficient context to categorize them properly and deliver a high rate of false positives. That is where target popularity helps: a domain contacted by many clients across many networks is rarely malicious, while a target that only a handful of machines ever communicate with deserves extra scrutiny.
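Combining the two factors into a single score might look like the following sketch. The point weights, feed contents, and domain names are illustrative assumptions, not values from the article:

```python
def target_risk(domain, reputation_hits, popularity, domain_age_days):
    """Toy risk score combining target reputation and target popularity.

    reputation_hits: set of domains flagged by reputation feeds.
    popularity: map of domain -> number of distinct clients contacting it.
    Thresholds and weights here are illustrative only.
    """
    score = 0
    if domain in reputation_hits:
        score += 3  # any feed hit is a strong signal
    if popularity.get(domain, 0) <= 2:
        score += 1  # almost nobody else talks to this target
    if domain_age_days < 30:
        score += 1  # newly registered domains lack track record
    return score

feeds = {"evil-updates.example"}
popularity = {"evil-updates.example": 1, "cdn.example": 9500}
print(target_risk("evil-updates.example", feeds, popularity, 7))  # 5
print(target_risk("cdn.example", feeds, popularity, 4000))        # 0
```

Scoring rather than binary blocking lets the earlier signals (beaconing, client fingerprint, direction) be weighed together with the target before an alert fires, which is how the false-positive problem described above is kept in check.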

Bot detection over IP networks is not an easy task, but it's becoming a fundamental part of network security practice and malware hunting specifically. By combining the five techniques we've presented here, you can detect malicious bots more efficiently.


Avidan Avraham is a security researcher at Cato Networks. Avidan has a strong interest in cybersecurity, from OS internals and reverse engineering to network protocol analysis and malicious traffic detection. Avidan is also a big data and machine learning enthusiast.