Mike talked about the first part of what most people see as a timeline for dealing with malware: ideally, you should be detecting it early so that you can stop it before it reaches a vulnerable target. The follow-on to that, of course, is not to have any vulnerable targets to hit. That's the Sisyphean task that makes being a CISO such an unpopular career choice: "Wanted: one martyr with 10 years of experience, to create and maintain perfect 24/7 defenses, despite the best efforts of all other stakeholders, without pissing them off. If you fail big enough, we'll send you packing."
So it's not much of a stretch to say, as Richard Bejtlich has for years, that Prevention Eventually Fails. And it may be FUD to say that everyone is probably already compromised to some extent, but sometimes the only difference between FUD and reality is in how you present it. Either way, you need detection just as much as prevention, and believe it or not, this is the hard part.
As Mike pointed out, detecting evidence of malware that has already landed means looking for automated activity on the network. This activity can take the form of phoning home to a command and control unit, performing further reconnaissance through scanning and discovery, or exfiltrating data. It can also involve looking for configuration changes and the presence of artifacts on disks and in memory, so you should not assume that you're covered if all you do is network-based monitoring.
This may seem so basic as to be obvious, but when you're doing network-based malware detection, you're looking for current activity. This is not as easy as you'd think, because attackers have gotten very good at hiding it: disguising their traffic, obfuscating it, hitching a ride on legitimate traffic, using routes and protocols that you'd never suspect, and spreading it out over time so that you're less likely to piece it together. Malware has gotten so clever, for example, that it can find out where to phone home by executing a Google search on a specific term that will bring up a sponsored listing or ad on the sidebar of the results page that links to the instructions. Got all that? Yes, the attackers are fiendishly clever.
Because of this, after-the-fact malware detection on the network has to cover a wide range of data points. It needs to have historical information on what happened previously so that it can catch those dormant infections when they finally send out a beacon ("Give me a ping, Vasily – one ping only, please"). It also needs to understand baseline network traffic to spot anomalies, and crack open encrypted SSL traffic where possible to read the contents. Just figuring out what counts as anomalous traffic is a PhD dissertation in and of itself.
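The baseline idea can be sketched in miniature: keep per-host history, then flag hosts whose current behavior deviates sharply from their own past. The sketch below uses a simple z-score on outbound byte counts; the hosts, numbers, and threshold are all invented for illustration, and real anomaly detection is (as noted above) dissertation-grade harder than this.

```python
from statistics import mean, stdev

def find_anomalies(baseline_bytes, current_bytes, z_threshold=3.0):
    """Flag hosts whose current outbound byte count deviates sharply
    from their historical baseline, using a simple z-score check.

    baseline_bytes: dict mapping host -> list of past daily byte counts
    current_bytes:  dict mapping host -> today's byte count
    """
    anomalies = []
    for host, history in baseline_bytes.items():
        if len(history) < 2:
            continue  # not enough history to establish a baseline
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # perfectly flat history; z-score is undefined
        z = (current_bytes.get(host, 0) - mu) / sigma
        if abs(z) > z_threshold:
            anomalies.append((host, round(z, 1)))
    return anomalies

# Hypothetical hosts: one behaves normally, one suddenly sends far more data.
history = {"10.0.0.5": [1000, 1100, 950, 1050], "10.0.0.9": [500, 520, 480, 505]}
today = {"10.0.0.5": 1020, "10.0.0.9": 50000}
print(find_anomalies(today and history, today))  # only 10.0.0.9 is flagged
```

Note what this toy version ignores: seasonality, slow-and-low exfiltration spread over weeks, and attackers who deliberately stay inside the baseline, which is exactly why the real problem is so hard.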
Of course, it also has to be able to look for what we already know is probably bad: signs and symptoms that have been seen before. And if you've been following along up to this point, you may think – as I always start thinking – "signature." But wait! Signatures are bad, right? They're useless for malware detection! Poor signatures have gotten beaten on so much that nobody wants to say the S-word any more. People would rather say "rules," "blacklists," "heuristics," "algorithms," or even "indicators of compromise" – and yes, all of these have differences from the definition of "signature," but when you break it down, you're still looking for something based on characteristics that you already know about.
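Strip away the terminology and the mechanism is the same: compare observed characteristics against a set of known-bad ones. A minimal sketch, with all indicators invented for illustration:

```python
# Whatever you call it -- "signature," "rule," "blacklist," "IoC" --
# it reduces to matching what you see against what you already know is bad.
# Every indicator value below is fabricated for this example.
KNOWN_BAD = {
    "domain": {"evil-c2.example", "updates-cdn.example"},
    "md5": {"d41d8cd98f00b204e9800998ecf8427e"},
    "user_agent": {"Mozilla/4.0 (compatible; MSIE 5.01)"},
}

def match_indicators(event):
    """Return the indicator types in `event` that hit the known-bad set."""
    return [field for field, values in KNOWN_BAD.items()
            if event.get(field) in values]

event = {"domain": "evil-c2.example", "user_agent": "curl/8.0"}
print(match_indicators(event))  # ['domain']
```

The differences between these approaches lie in how the known-bad set is expressed and generalized (exact values, patterns, behaviors), not in the underlying look-it-up logic.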
Threat intelligence has blossomed as a market, and it's built into just about everything. It's the result of hard work by carbon-based life forms who disassemble malware samples, profile the attackers who write and use the malware, and listen to Internet chatter. Turning intelligence into something that can tell you what to look for is what post-attack detection is all about. Sharing this data is vital, and collaborative malware platforms and schemas are out there for just this purpose. You can submit a malware sample, or a packet capture file, and get back information on whether someone has seen this before.
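The "has anyone seen this before" lookup is typically keyed on a cryptographic hash of the sample. A minimal sketch of that fingerprinting step, with the shared database reduced to a local set for illustration:

```python
import hashlib

def sample_fingerprint(data: bytes) -> str:
    """Compute the SHA-256 digest commonly used as a sample's identity
    when querying a shared malware database."""
    return hashlib.sha256(data).hexdigest()

# Stand-in for a collaborative platform's corpus of previously seen samples.
seen_before = {sample_fingerprint(b"known malicious payload")}

candidate = b"known malicious payload"
if sample_fingerprint(candidate) in seen_before:
    print("someone has seen this sample before")
```

Exact-hash matching is deliberately brittle (a one-byte change yields a new digest), which is one reason real platforms layer on fuzzy hashes and behavioral metadata as well.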
You'd guess that this intelligence involves mountains of data, and you're right. It goes beyond "big data." It's moby data. To do all this analysis of live network traffic, compare it with historical traffic, and analyze it using a huge store of intelligence is more than a mortal server can handle. Doing it at line speed requires specialized hardware. Refreshing that data so that it's as current as possible also requires more firepower than you can buy off the shelf. Doing all this while ignoring a denial-of-service attack that's trying to distract you from it – that's the holy grail.
Blazingly fast plus moby data equals "not in my datacenter." For the most part, nobody is going to build their own infrastructure to do this; that's why cloud-based monitoring and malware detection are on the rise. If you have a minimal presence to capture the network traffic, you can save all the frantic data crunching for the cloud back-end.
Speaking of horses already having left the barn, there's also another way to skin the malware detection cat.* Remember that a lot of your network traffic goes out over the Internet, where it can be logged and monitored. If some of your infrastructure is talking to known command-and-control systems, or to other known compromised systems, that's a pretty good sign that you've got malware. There are security vendors out there that do this kind of listening to the whispers and echoes on the Internet at large, and they can tell you whether you've been compromised, no software or hardware installation required. You can't get lighter-touch than that.
That's the upside. The downside is that there are people out there who already know you've been 0wn3d. If you do nothing else, it might be a good idea to go find them and ask.
*Yes, this is Cliche Menagerie as a Service.
Wendy Nather is Research Director of the Enterprise Security Practice at the independent analyst firm 451 Research. You can find her on Twitter as @451wendy.