Executable compression, a.k.a. "packing," is a means of compressing an executable file and combining the compressed data with decompression code into a single executable.
Throughout the years, anti-malware vendors have educated their users about polymorphic malware that uses packing applications to repackage itself frequently (ideally every time it gets distributed to a victim) so that anti-malware solutions based on static signatures become useless.
Fundamentally, when packed, an encoded version of the malware is stored in a variable, possibly encoded with a key. At execution time, the program generates the key (if necessary), and then decodes the malware. The malware is then loaded into memory and the unpacker program jumps to the address and executes the malicious payload.
This process can be repeated by extracting additional portions of packed code during the lifetime of a process, sometimes with nested packing (i.e., unpacked code that unpacks more code).
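To make the mechanics concrete, here is a minimal Python sketch of the pack/unpack cycle described above, using a harmless byte string in place of a real program. The `MAGIC` marker, the toy key generator, and every function name are assumptions made for illustration only. Real packers work on native executables and transfer control to the decoded code instead of returning bytes.

```python
import zlib

MAGIC = b"PK1"  # hypothetical marker that lets the stub recognize a packed layer


def derive_key(seed: int, length: int = 16) -> bytes:
    """Regenerate the key at run time from a small seed (toy generator, not cryptographic)."""
    state = seed & 0xFFFFFFFF
    key = bytearray()
    for _ in range(length):
        state = (1103515245 * state + 12345) & 0x7FFFFFFF
        key.append((state >> 16) & 0xFF)
    return bytes(key)


def xor(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def pack(payload: bytes, seed: int) -> bytes:
    """Compress the payload, encode it with the derived key, and prepend a small header."""
    body = xor(zlib.compress(payload), derive_key(seed))
    return MAGIC + (seed & 0xFFFFFFFF).to_bytes(4, "big") + body


def unpack_once(blob: bytes) -> bytes:
    """Undo one layer: read the seed, regenerate the key, decode, decompress."""
    if not blob.startswith(MAGIC):
        raise ValueError("not a packed blob")
    seed = int.from_bytes(blob[3:7], "big")
    return zlib.decompress(xor(blob[7:], derive_key(seed)))


def unpack_all(blob: bytes) -> tuple[bytes, int]:
    """Peel nested layers until the data no longer carries the marker."""
    layers = 0
    while blob.startswith(MAGIC):
        blob = unpack_once(blob)
        layers += 1
    return blob, layers


original = b"harmless stand-in for program code"
nested = pack(pack(original, seed=7), seed=99)  # unpacked code that unpacks more code
recovered, depth = unpack_all(nested)
```

Changing the seed on each distribution yields a different byte sequence for the same payload, which is why a static signature computed over the packed file stops matching.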
This type of behavior has been very common in malware for a number of years. For this reason, unpacking emulators were introduced by anti-virus vendors. These emulators perform the initial operations required to unpack the actual program code and then perform their static analysis of the unpacked code.
Cyber criminals soon took notice of unpacking emulators and started introducing anti-emulator mechanisms. These mechanisms made full-blown sandboxes necessary for the analysis: Only by running the actual program in a realistic environment was it possible to extract the actual behavior of the code. So, in the next step in the never-ending battle between good and evil, cyber criminals started introducing anti-sandbox mechanisms into their packers.
The use of increasingly sophisticated anti-analysis techniques in packers suggests a logical question: Why not detect malware by detecting packers? One could decide to simply block executables that appear to be packed, forcing the malware writers to resort to more subtle (and expensive) mechanisms to avoid detection.
Well, the problem is that a substantial portion of benign software is packed as well. We ran an experiment over a dataset of recently observed binaries, and we found that 37% of malware used some form of packing, as did 6% of benign software. Note that the packing behavior was observed during execution, and therefore is independent of specific packers or other techniques.
This shows that rejecting a program just because it’s packed is not an effective malware defense strategy.
So what next?
We considered several options for how security teams might be able to use packing behavior to detect malware.
Digital signatures
Even though an invalid or missing signature combined with unpacking behavior seems promising given that 97% of our malicious samples shared this characteristic, there are many benign samples (40%) that also have this characteristic. Therefore, using this as the only signal would result in a large number of false positives.
| |Benign Executables|Malicious Executables|
|Valid Digital Signatures|90%|11%|
|Valid Signature and Unpacking Behavior|60%|3%|
How executables are packed
Many packers (usually ad hoc programs) use a number of techniques to prevent reverse engineering. For example, they use multiple levels of packing -- that is, the unpacked executable is actually another packed program -- or they employ sophisticated anti-debugging techniques.
Compressing packers and encrypting packers
Compressing packers try to reduce the size of the original program using compression techniques. As a result, the compressed data can still retain some of the statistical properties of the original program. Encrypting packers, instead, perform full encryption of the program, and consequently, the encrypted data tends to be more “random” (more formally, it has a higher entropy).
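The notion of entropy mentioned here can be measured directly. The following Python sketch computes the Shannon entropy of a byte string in bits per byte (0 for constant data, 8 for uniformly distributed bytes). The sample inputs are assumptions for illustration: `os.urandom` output stands in for encrypted data, since well-encrypted bytes are designed to look random, and the repeated text stands in for an unpacked program.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical sample inputs, chosen only to show the range of values.
plain = b"The quick brown fox jumps over the lazy dog. " * 2000
compressed = zlib.compress(plain)
random_like = os.urandom(65536)  # stand-in for encrypted data

for name, blob in [("plain", plain), ("compressed", compressed), ("random", random_like)]:
    print(f"{name:<10} {len(blob):>7} bytes  entropy = {shannon_entropy(blob):.3f} bits/byte")
```

High entropy tells you only that data is compressed or encrypted, not why, so a measurement like this says nothing about whether the program is malicious.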
In all cases, however, one cannot use the information to detect if a packed executable is malicious or not as these techniques also are used by developers of benign applications on a regular basis.
While information about packing is not a suitable basis for effective malware detection, a critical question remains: Is the industry nevertheless using packing as a signal?
A study I helped conduct in 2013 at the University of California in Santa Barbara took almost 8,000 system files from various versions of the Windows operating system and uploaded them to VirusTotal, obtaining an unsurprising “all OK” from all of the anti-malware tools.
Then, we packed the same files using four packers (UPX, Upack, NsPack and BEP), resulting in 16,000 verified samples (some of the packed files did not appear to be functional and had to be eliminated from the data set). These samples were then submitted to VirusTotal again, and the results, this time, were surprising: While the samples packed with UPX were not flagged as malicious, 96.7% of the samples packed with the remaining three packers were labeled as malicious by more than ten anti-virus products.
The results clearly show that many anti-virus tools use the identification of packing behavior as a signal for classification as malware, but this was four years ago.
To verify the state of the art today, we reproduced the 2013 experiment on a smaller scale. We took ten benign samples, packed them with Obsidium, a commercial packer tool, and then submitted the samples to VirusTotal.
First of all, an important disclaimer: The engines on VirusTotal are not configured in the most effective way, and therefore, the results must be taken with a grain of salt. For this reason, we do not single out any specific vendor, and instead we show only the aggregate results.
Our findings were that packing is still used as a signal, as many vendors, including top players in the AV industry, identified benign programs as malicious only because they were packed. Of the 64 AV tools used, an average of 25% identified each benign sample as malicious.
| |# of AV Tools that Analyzed the Sample|# of AV Tools that Categorized the Sample as Malicious|
|Benign Sample 1|64|19|
|Benign Sample 2|64|20|
|Benign Sample 3|62|6|
|Benign Sample 4|64|18|
|Benign Sample 5|64|20|
|Benign Sample 6|64|19|
|Benign Sample 7|64|18|
|Benign Sample 8|64|16|
|Benign Sample 9|64|16|
|Benign Sample 10|62|14|
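For readers who want to check the aggregate, the short Python sketch below recomputes the per-sample detection rate from the figures in the table. The variable names are ours; the numbers are copied from the table as-is. Averaging the ten per-sample rates gives about 26%, in line with the roughly one-quarter figure quoted above.

```python
# (tools that analyzed the sample, tools that flagged it as malicious),
# one pair per benign sample, in table order.
results = [
    (64, 19), (64, 20), (62, 6), (64, 18), (64, 20),
    (64, 19), (64, 18), (64, 16), (64, 16), (62, 14),
]

# Fraction of engines that flagged each benign sample.
rates = [flagged / analyzed for analyzed, flagged in results]

mean_rate = sum(rates) / len(rates)
lowest, highest = min(rates), max(rates)

print(f"mean false-positive rate per sample: {mean_rate:.1%}")
print(f"range: {lowest:.1%} to {highest:.1%}")
```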
The lesson learned is that the mere presence of a packer is not a reliable way to determine the nature of an executable. Instead, it is necessary to run the sample, trigger the unpacking, observe how the unpacking is performed, and combine this information with the actual behavior of the program.
Of course, this requires more resources than a simple static analysis, but, nowadays, it’s either that or inundating security teams with false positives.
— Dr. Giovanni Vigna has been researching and developing security technology for more than 20 years, working on malware analysis, web security, vulnerability assessment and intrusion detection. He is a professor in the department of computer science and the director of the Center for CyberSecurity at the University of California in Santa Barbara, and is co-founder and CTO at Lastline. You can contact him at [email protected].