So why the discrepancy in size estimates of the Flashback Trojan botnet, and does anyone really care? The wide range of counts for the game-changing Mac botnet was a case study in how gauging the size of a botnet is less a science than an art. Different research groups set up their own sinkholes to lure unsuspecting bots in order to get a handle on the size and activity of a botnet, but each basically sees just a snapshot of the overall botnet, and botnets are notoriously fast-moving targets as infections come and go. That's why Jose Nazario, senior security researcher for Arbor Networks, wants to come up with standard sinkholing methods.
"Some who are actively sinkholing [bots] are good at it, and some are not," Nazario says. "Some of us are working behind the scenes of how to come up with standardization for sinkholing methodologies."
Part of the problem, he says, is that sometimes marketing trumps science in botnet data. And when government officials quote botnet sizes, they rely on data generated by security researchers, many of whom work for security vendors, he says. "If we're going to inform [policy-makers], we need to come with numbers that we believe are legitimate," Nazario says.
The catch with bot-counting is that, for the most part, you can only measure a snapshot of infected machines or IP addresses during a specific period of time, and then that information is used to generate an estimate of the total number of infected machines making up the botnet. Botnet population data can help researchers prioritize which threats to focus on and create the appropriate defenses, as well as pinpoint the geographic areas most hit by the infection, for instance, according to Nazario.
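To see why two honest counts of the same botnet can disagree, consider a minimal sketch in Python. Everything in it is an assumption made for illustration: the log layout (timestamp, source IP, bot identifier), the sample records, and the `count_bots` helper are hypothetical and do not describe any particular researcher's sinkhole. It tallies the same check-ins two ways, by unique IP address and by unique bot identifier, per day and over the whole observation window.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sinkhole log records: (timestamp, source IP, bot identifier).
# The layout and every value here are invented for illustration only.
SAMPLE_LOG = [
    ("2012-04-10T01:00:00", "198.51.100.7", "bot-a"),
    ("2012-04-10T09:30:00", "198.51.100.9", "bot-a"),  # same bot, new address
    ("2012-04-10T10:00:00", "203.0.113.5", "bot-b"),
    ("2012-04-10T10:05:00", "203.0.113.5", "bot-c"),   # two bots behind one NAT
    ("2012-04-11T02:00:00", "198.51.100.7", "bot-d"),  # address reused by another bot
    ("2012-04-11T03:00:00", "203.0.113.5", "bot-b"),
]


def count_bots(records):
    """Return (per_day, total) unique counts by IP and by bot identifier."""
    ips_by_day = defaultdict(set)
    ids_by_day = defaultdict(set)
    for timestamp, ip, bot_id in records:
        # Bucket each check-in by calendar day (the "snapshot" window).
        day = datetime.fromisoformat(timestamp).date().isoformat()
        ips_by_day[day].add(ip)
        ids_by_day[day].add(bot_id)

    per_day = {
        day: {
            "unique_ips": len(ips_by_day[day]),
            "unique_ids": len(ids_by_day[day]),
        }
        for day in sorted(ips_by_day)
    }
    # Deduplicate across the whole window rather than summing daily counts.
    total = {
        "unique_ips": len(set().union(*ips_by_day.values())),
        "unique_ids": len(set().union(*ids_by_day.values())),
    }
    return per_day, total


if __name__ == "__main__":
    per_day, total = count_bots(SAMPLE_LOG)
    for day, counts in per_day.items():
        print(day, counts)
    print("whole window", total)
```

In this made-up sample, the two methods agree on each individual day but diverge over the full window, because addresses get reused and shared. Adding up the daily snapshots would give yet another, larger figure. Which number is "the" size of the botnet depends on the counting method and the window, and that is the kind of inconsistency a shared methodology would have to pin down.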
The Messaging Anti-Abuse Working Group (M3AAWG), under a new Federal Communications Commission project, hopes to offer up more accurate bot counts. It will begin publishing quarterly reports of the total number of bot infections out there, based on numbers provided by Internet service providers, which arguably have a more comprehensive view of the problem.
M3AAWG expects the project to provide a more comprehensive count of the number of machines that are owned by botnets, but participation is voluntary: ISPs are not required to provide the data. The project will count bots on residential networks using only aggregated, anonymous data.
"The key challenge in gathering the bot counts has been developing a set of metrics that many companies can consistently report on. As you can imagine, many companies have different reporting systems and different definitions of exactly what constitutes a bot," says Jerry Upton, executive director of M3AAWG. "The current bot numbers have been a little confusing because we've only had incomplete data. Our data won't be all-embracing, but it will be much broader and more comprehensive."
Member ISPs and others who want to contribute their data can take part in the bot metrics program, Upton says. "It's to the ISP's benefit to participate so that as an industry we can broaden our understanding of the problem. Network operators can contact us and we’ll gladly work with them to obtain their data," he says.
But even ISP numbers can be deceiving, Arbor's Nazario says. "The idea is that they are closer to infected devices, but ISPs are still doing network measurements," he says. "That's going to be incomplete versus a complete sinkhole or peer-to-peer spidering."
Nazario says the key is for researchers from different vendors and organizations to share how they measured bots or were able to reduce the size of a botnet, for instance, in a sort of lessons learned and best practices-sharing exercise.
In the case of the Flashback headcount discrepancies, Nazario says some of the players weren't used to working and collaborating with other researchers. "There's been some difficulty in coordinating efforts: Some were reluctant to work with outsiders," he says. "It's been a real challenge to coordinate that effort, and that's why the numbers are all over [the place]."
Microsoft's recent reporting on the remaining number of machines infected by Conficker was an eye-opener on the persistence of some botnet threats: Despite the wildly successful industry coalition formed to combat Conficker three years ago, the worm was still present on 1.7 million Windows machines worldwide at the end of last year.
The Conficker Working Group, headed by Microsoft, effectively shut down Conficker's underlying botnet infrastructure more than two years ago, severely wounding a botnet that had infected some 6.5 million machines. But Conficker, which was written to spread automatically via weak passwords and vulnerabilities that Microsoft later patched, lives on in its decapitated form in a shocking number of Windows machines in businesses, according to Microsoft's newest Security Intelligence Report (SIR) Version 12.
Arbor's Nazario says Microsoft has some of the best methods of counting bot-infected machines. "Microsoft is counting PCs versus network measurements, so they are 10- to 100-fold higher routinely," he says. "It's staggering the numbers of how big some of these botnets really are."
In order to tackle the botnet problem, you need good numbers that reflect the scope of the infections, experts say.
"You can’t solve a problem if you don’t know the scope. You need to define the scale of the problem so that going forward you know what is working and has been most effective in reducing bots," M3AAWG's Upton says. The bot metrics pilot program is currently under way, and Upton says the organization will compare notes with other countries with similar programs in place.
Consistency is key. "I am arguing for consistency in methodologies so we can accurately inform people of the problem -- policy-makers or technology advocates," says Arbor's Nazario, who recently gave a presentation on counting bots at the APCERT meeting in Bali, Indonesia. "Getting a handle on how big the problem is, then comes the ability to compare numbers and understand why some methods for remediation are working, and others are not."