
Vulnerabilities / Threats
11/22/2019 10:00 AM
Avidan Avraham
Commentary

The 5-Step Methodology for Spotting Malicious Bot Activity on Your Network

Bot detection over IP networks isn't easy, but it's becoming a fundamental part of network security practice.

With the rise of security breaches using malware, ransomware, and other remote access hacking tools, identifying malicious bots operating on your network has become an essential component of protecting your organization. Bots are often the source of malware, which makes identifying and removing them critical.

But that's easier said than done. Every operating environment has its share of "good" bots, such as software updaters, that are important for normal operation. Distinguishing between malicious bots and good bots is challenging; no single variable allows for easy bot classification. Open source feeds and community rules purporting to identify bots are of little help; they contain far too many false positives. In the end, security analysts wind up fighting alert fatigue from analyzing and chasing down all of the irrelevant security alerts triggered by good bots.

At Cato, we faced a similar problem in protecting our customers' networks. To solve the problem, we developed a new, multidimensional approach that identifies 72% more malicious incidents than would have been possible using open source feeds or community rules alone. Best of all, you can implement a similar strategy on your network.

Your tools will be the stock-in-trade of any network engineer: access to your network, a way to capture traffic, such as a tap sensor, and enough disk space to store a week's worth of packets. The idea is to gradually narrow the field from sessions generated by people to those sessions likely to indicate a risk to your network. You'll need to:

  • Separate bots from people
  • Distinguish between browsers and other clients
  • Distinguish between bots within browsers
  • Analyze the payload
  • Determine a target's risk

Let's dive into each of those steps.

Separate Bots from People
The first step is to distinguish between bots (good and bad) and humans. We do this by identifying those machines repeatedly communicating with a target. Statistically, the more uniform these communications, the greater the chance that they are generated by a bot.
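As a minimal sketch of this uniformity test, assuming session start timestamps for each client-target pair have already been extracted from the packet capture, one could compute the coefficient of variation of the gaps between sessions. The threshold and minimum-session count below are illustrative choices, not values from the article:

```python
from statistics import mean, stdev

def looks_like_bot(timestamps, cv_threshold=0.2, min_sessions=10):
    """Flag a client-target pair whose session start times are suspiciously
    uniform. A low coefficient of variation in the inter-session gaps
    suggests a machine-driven schedule rather than a human."""
    if len(timestamps) < min_sessions:
        return False  # too few sessions to judge uniformity
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if mean(gaps) == 0:
        return False
    cv = stdev(gaps) / mean(gaps)  # dimensionless spread of the gaps
    return cv < cv_threshold

# A client polling roughly every 60 seconds (small jitter) looks bot-like:
polling = [i * 60 + j for i, j in enumerate([0, 1, -1, 2, 0, 1, -2, 0, 1, 0, -1, 2])]
print(looks_like_bot(polling))  # True
```

In practice you would run this per (source, destination) pair over the stored week of traffic, so that a human's bursty browsing and a bot's steady beaconing to the same site are scored separately.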

Distinguish Between Browsers and Other Clients
Having isolated the bots, you then need to look at the initiating client. Typically, "good" bots exist within browsers, while "bad" bots operate outside of the browser.

Operating systems have different types of clients and libraries generating traffic. For example, "Chrome," "WinInet," and "Java Runtime Environment" are all different client types. At first, client traffic may look the same, but there are some ways to distinguish between clients and enrich our context.

Start by looking at application-layer headers. Because most firewall configurations allow HTTP and TLS to any address, many bots use these protocols to communicate with their targets. You can identify bots operating outside of browsers by identifying groups of client-configured HTTP and TLS features.

Every HTTP session has a set of request headers defining the request and how the server should handle it. These headers, their order, and their values are set when composing the HTTP request. Similarly, TLS session attributes, such as cipher suites, extensions list, ALPN (Application-Layer Protocol Negotiation), and elliptic curves, are established in the initial TLS packet, the "client hello" packet, which is unencrypted. Clustering the different sequences of HTTP and TLS attributes will likely indicate different bots. 

Doing so will, for example, allow you to spot TLS traffic with cipher suites that differ from those of any browser. That is a very non-humanlike pattern and hence a good indicator of bot traffic.
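One way to cluster these ClientHello attributes is to hash them into a stable fingerprint, in the spirit of the JA3 technique. The hello values below are hypothetical; a production system would parse them out of captured "client hello" packets:

```python
import hashlib
from collections import Counter

def tls_fingerprint(version, ciphers, extensions, curves, point_formats):
    """Hash the unencrypted ClientHello attributes into a stable fingerprint.
    Clients built from the same TLS library produce the same hash, so
    clusters fall out of a simple frequency count."""
    fields = [str(version),
              "-".join(map(str, ciphers)),
              "-".join(map(str, extensions)),
              "-".join(map(str, curves)),
              "-".join(map(str, point_formats))]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Hypothetical hellos: two from the same non-browser client, one from a browser.
hellos = [
    (771, [49195, 49199], [0, 11, 10], [29, 23], [0]),
    (771, [49195, 49199], [0, 11, 10], [29, 23], [0]),
    (771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0]),
]
clusters = Counter(tls_fingerprint(*h) for h in hellos)
# The repeated fingerprint marks the candidate non-browser client.
```

The same idea applies to HTTP: concatenate the request-header names in the order sent and hash that sequence, since header order is fixed by the client implementation.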

Distinguish Between Bots within Browsers
Another method for identifying malicious bots is to look at specific information contained in HTTP headers. Internet browsers usually present a clear, standard set of headers. In a normal browsing session, clicking on a link within a browser generates a "Referer" header that is included in the next request for that URL. Bot traffic will usually lack a "Referer" header, or worse, carry a forged one. Bots that look the same in every traffic flow are likely malicious.

The user-agent is the best-known string representing the program initiating a request. Various sources, such as fingerbank.org, match user-agent values with known program versions, and that information can help identify abnormal bots. For example, most modern browsers use the "Mozilla/5.0" token in the user-agent field. Seeing a lower Mozilla version, or its complete absence, indicates an abnormal bot user-agent string. No trustworthy browser will create traffic without a user-agent value.
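The referrer and user-agent heuristics above can be sketched as a simple per-request header check. The header dict and flag strings here are illustrative:

```python
import re

def http_header_flags(headers):
    """Apply the Referer and User-Agent heuristics to one request's headers
    (a dict keyed by lowercase header name). Returns a list of reasons the
    request looks bot-like; an empty list means no flag was raised."""
    flags = []
    if "referer" not in headers:
        flags.append("missing Referer")
    ua = headers.get("user-agent", "")
    if not ua:
        flags.append("empty User-Agent")
    else:
        m = re.match(r"Mozilla/(\d+)\.(\d+)", ua)
        if not m:
            flags.append("non-Mozilla User-Agent")
        elif int(m.group(1)) < 5:
            flags.append("outdated Mozilla token")
    return flags

print(http_header_flags({"user-agent": "Mozilla/4.0 (compatible; MSIE 6.0)"}))
# ['missing Referer', 'outdated Mozilla token']
```

A missing Referer alone is weak evidence (a user typing a URL also omits it); the signal comes from the same flags recurring across every flow from one client.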

Analyze the Payload
That said, we don't want to limit our search for "bad" bots to the HTTP and TLS protocols. We have also observed known malware samples using proprietary, unknown protocols over well-known ports; such traffic can be flagged using application identification.
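As a minimal sketch of such application identification, assuming you can read the first payload bytes of a flow seen on port 443, one could check whether those bytes actually begin a TLS handshake record:

```python
def is_tls_client_hello(first_bytes):
    """A TLS handshake record starts with content type 0x16 followed by a
    0x03 major-version byte. Traffic on port 443 that fails this check is a
    candidate proprietary protocol hiding on a well-known port."""
    return (len(first_bytes) >= 3
            and first_bytes[0] == 0x16
            and first_bytes[1] == 0x03)

print(is_tls_client_hello(b"\x16\x03\x01\x02\x00"))  # True: a real handshake
print(is_tls_client_hello(b"HELO backdoor"))         # False: flag for review
```

Real application-identification engines inspect far more than three bytes, but even this crude test separates genuine TLS from something merely borrowing its port.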

In addition, the traffic direction (inbound or outbound) carries significant weight here. Devices connected directly to the Internet are constantly exposed to scanning, so these inbound bots should be treated as external scanners. Outbound scanning activity, on the other hand, indicates a device infected with a scanning bot. That can be harmful for the target being scanned and puts the organization's IP address reputation at risk.
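A rough way to separate the two directions, assuming flow records have already been extracted and the internal address prefix below is adjusted to your own addressing plan, is to measure each internal host's outbound fan-out. The threshold is an illustrative assumption:

```python
from collections import defaultdict

def outbound_scanners(flows, internal_prefix="10.", dest_threshold=100):
    """Count distinct destinations contacted by each internal host. An
    unusually wide outbound fan-out is the signature of an infected
    scanning bot, whereas inbound fan-in is mostly Internet background
    noise. `flows` is an iterable of (src_ip, dst_ip, dst_port) tuples."""
    fanout = defaultdict(set)
    for src, dst, port in flows:
        if src.startswith(internal_prefix) and not dst.startswith(internal_prefix):
            fanout[src].add((dst, port))
    return {src: len(dests) for src, dests in fanout.items()
            if len(dests) >= dest_threshold}

# e.g. a host fanning out to 150 distinct (ip, port) pairs gets flagged,
# while a host talking to a handful of services does not.
```

A prefix-string check stands in here for a proper internal/external address test; the `ipaddress` module would be the robust choice in production.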

Determine a Target's Risk
Until now, we've looked for bot indicators in the frequency of client-server communications and in the type of clients. Now, let's pull in another dimension — the destination or target. To determine malicious targets, consider two factors: target reputation and target popularity.

Target reputation estimates the likelihood of a domain being malicious based on experience gathered from many flows. Reputation is determined either by third-party services or through self-calculation, noting whenever users report a target as malicious.

All too often, though, simple sources for determining target reputation, such as URL reputation feeds, are insufficient on their own. Every month, millions of new domains are registered. With so many new domains, reputation mechanisms lack sufficient context to categorize them properly, delivering a high rate of false positives.
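To illustrate how reputation and popularity might be blended into a single target-risk score, here is one possible weighting. The 0.7/0.3 split and the saturation point are assumptions for the sketch, not values from the article:

```python
def target_risk(target, client_count, reputation_hits, total_clients):
    """Blend the two target-side signals: reputation (how often feeds or
    users have flagged the domain) and popularity (what share of clients
    on the network talk to it). A low-popularity target with any
    reputation hits scores highest."""
    popularity = client_count / max(total_clients, 1)
    reputation = min(reputation_hits / 10.0, 1.0)  # saturate at 10 reports
    return round(0.7 * reputation + 0.3 * (1.0 - popularity), 3)

# A rarely-visited domain with three past reports outranks a popular,
# clean one (domain names here are hypothetical):
print(target_risk("rare-updates.example", 2, 3, 1000))  # higher score
print(target_risk("cdn.example", 900, 0, 1000))         # lower score
```

The popularity term captures the intuition that a domain contacted by only one or two machines on the network deserves more scrutiny than one everybody uses.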

Bot detection over IP networks is not an easy task, but it's becoming a fundamental part of network security practice and malware hunting specifically. By combining the five techniques we've presented here, you can detect malicious bots more efficiently.

For detailed graphics and practical examples on applying this methodology, go here.


Avidan Avraham is a Security Researcher at Cato Networks. Avidan has a strong interest in cybersecurity, from OS internals and reverse engineering to network protocol analysis and malicious traffic detection. Avidan is also a big data and machine learning enthusiast who ...