Dark Reading is part of the Informa Tech Division of Informa PLC

10:35 PM

Researchers Hunt Sources Of Viruses, Memes

Swiss university researchers propose a method for tracking back biological infections using incomplete data that could work for digital viruses and informational memes

A mathematical model using sparse clues to locate the source of a biological epidemic could give data scientists tools to find the origin of rumors spread over social networks or the initial compromise of an online attack.

In a paper (PDF) published in this month's Physical Review Letters, Pedro Pinto and two colleagues at the Swiss Federal Institute of Technology in Lausanne (EPFL) found that a relatively small number of observers, or sensors, could gather enough information to determine the source of an infection with 90 percent probability. By understanding the diffusion process -- whether of viruses, chemicals, or information -- researchers could use the model to estimate how many sensors are needed to find the source of a variety of epidemic-like processes, says Pinto, a post-doctoral student at EPFL.

"Other papers look at knowing everything at every individual node, but we put on a practical constraint: that you only know information from a limited number of sensors," he says.

The researchers investigated four different types of network graphs, finding that under the best circumstances -- choosing highly connected nodes as observers -- only 4 percent of nodes need to be monitored to have a 90 percent chance of tracking back to the source. When observers are selected at random, the proportion of nodes that must be monitored is much higher -- up to 49 percent in the worst case.
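The two observer-placement strategies mentioned above can be illustrated with a short Python sketch. It only shows how "highly connected" and "random" observers might be picked on a toy graph; the graph, the function names, and the tie-breaking rule are assumptions made for illustration and are not taken from the EPFL paper.

```python
import random
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def pick_observers_by_degree(adj, k):
    """Return the k most highly connected nodes (ties broken by name)."""
    return sorted(adj, key=lambda n: (-len(adj[n]), str(n)))[:k]


def pick_observers_at_random(adj, k, seed=0):
    """Return k nodes chosen uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(sorted(adj, key=str), k)


# Hypothetical toy network: one hub connected to five leaves, plus two extra links.
edges = [("hub", x) for x in "abcde"] + [("a", "b"), ("e", "f")]
adj = build_graph(edges)

top_observers = pick_observers_by_degree(adj, 2)
random_observers = pick_observers_at_random(adj, 2)
print(top_observers, random_observers)
```

On this toy graph the degree-based choice always includes the hub, which sees traffic from most of the network, while a random choice may land on two poorly connected leaves.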

Other researchers called into question how applicable the research would be to the digital world. While the EPFL researchers verified their technique using real-world data on a cholera outbreak in South Africa -- finding they could get within four hops of the actual source -- the digital world is a different beast, says Stefan Savage, a professor of computer science at the University of California at San Diego.

"The basic idea here is reasonable -- trying to infer origins by reversing the dynamics of the spreading process -- although not totally new in a cybercontext," Savage says. "But this is one of those cases where the devil is in the details: What can you actually observe, where are there, in fact, strong topological dependencies, and how hard is it for the adversary to mask their origins?"

Even EPFL's Pinto acknowledges that the technique relies on how well the specific environment can be modeled. Each case has to be modeled as a tree or graph of nodes and -- while the researchers only depended on the timing of infection -- other information could be taken into account as well. Pinto is evaluating cases involving Internet security that could be the focus of further research.
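To make the idea of relying only on infection timing concrete, here is a minimal Python sketch. It is not the estimator from the EPFL paper: it assumes a fixed delay per hop, an unknown start time, and a small hypothetical graph, and it simply picks the candidate source whose hop distances best explain the arrival times recorded at a few observers.

```python
from collections import deque


def hop_distances(adj, start):
    """Breadth-first search: number of hops from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def estimate_source(adj, arrival_times, delay_per_hop=1.0):
    """Guess the source from observer arrival times (observer -> time).

    If a candidate is the true source, then (arrival time - travel time)
    equals the same unknown start time at every observer, so the spread
    of those residuals is zero. The candidate with the smallest spread wins.
    """
    best, best_err = None, float("inf")
    for candidate in sorted(adj):
        dist = hop_distances(adj, candidate)
        if any(obs not in dist for obs in arrival_times):
            continue  # candidate cannot reach every observer
        residuals = [t - delay_per_hop * dist[obs]
                     for obs, t in arrival_times.items()]
        mean = sum(residuals) / len(residuals)
        err = sum((r - mean) ** 2 for r in residuals)
        if err < best_err:
            best, best_err = candidate, err
    return best


# Hypothetical six-node tree; the outbreak starts at "d" at time 5.
adj = {
    "s": ["a", "d"],
    "a": ["s", "b", "c"],
    "b": ["a"],
    "c": ["a"],
    "d": ["s", "e"],
    "e": ["d"],
}
observed = {"b": 8.0, "e": 6.0, "s": 6.0}  # arrival times at three observers

estimated = estimate_source(adj, observed)
print(estimated)
```

Here three observers out of six nodes are enough to single out "d". With noisy delays the residuals would no longer match exactly, which is where a statistical model of the spreading process becomes necessary.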

"Each application requires us to tweak the model," he says. "There are always little details that are different for each case."

For example, in the case of the South African cholera outbreak, each node was a human community and associated water reservoir, while the edges of the graph were waterways and other means of spreading the outbreak.

Modeling computer networks as connected graphs is much easier than modeling a physical outbreak, and timing data is generally much more precise -- at least in local-area networks, says Richard Bejtlich, chief security officer for security services firm Mandiant.

"This is one of the few areas where the digital world has an advantage over the physical world," Bejtlich says. "For example, we put our software in an enterprise, and we sweep the network -- we can ask questions and get answers back. To do that in a human population, we would have to ask everyone to submit to blood tests."

Finally, the problem of tracking back the outbreak of digital worms and viruses to their sources is not a new subject of security research. In 2005, a group of researchers from Carnegie Mellon University found that retracing a worm's trail (PDF) could be done using random walks back to the source. Yet even that approach assumed the availability of complete data, which may not be the case.
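The general idea of walking backward through traffic records can be sketched in a few lines of Python. This is a simplified illustration, not the Carnegie Mellon algorithm: the flow format `(source, destination, time)`, the time window, and the host-counting heuristic are all assumptions chosen to keep the example small.

```python
import random
from collections import Counter


def backward_walk(flows, start, max_hops, window, rng):
    """Step backward in time from one flow to a flow that could have caused it."""
    path = [start]
    current = start
    for _ in range(max_hops):
        src, _, t = current
        # Predecessors: flows that reached `src` shortly before `current` left it.
        preds = [f for f in flows if f[1] == src and t - window <= f[2] < t]
        if not preds:
            break
        current = rng.choice(preds)
        path.append(current)
    return path


def likely_origin(flows, rounds=50, max_hops=10, window=5.0, seed=0):
    """Start walks from every flow and return the sending host visited most often."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(rounds):
        for start in flows:
            for src, _, _ in backward_walk(flows, start, max_hops, window, rng):
                visits[src] += 1
    return visits.most_common(1)[0][0]


# Hypothetical flow log: host A infects B and C, which go on to infect D and E.
flows = [
    ("A", "B", 1.0),
    ("A", "C", 2.0),
    ("B", "D", 3.0),
    ("C", "E", 4.0),
]

origin = likely_origin(flows)
print(origin)
```

Because every walk in this toy log eventually leads back to a flow sent by host A, that host accumulates the most visits. Like the approach described above, the sketch assumes the flow log is complete; missing records would break some walks early.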

"It is likely that traffic auditing will be deployed incrementally across different networks," stated the paper. "We are investigating the impact of missing data on performance."
