Dark Reading is part of the Informa Tech Division of Informa PLC

10/2/2020
05:15 PM

Researchers Adapt AI With Aim to Identify Anonymous Authors

At Black Hat Asia, artificial intelligence and cybersecurity researchers use neural networks to attempt to identify authors, but accuracy is still wanting.

With disinformation on social media a significant problem, the ability to identify authors of malicious articles and the originators of disinformation campaigns could help reduce the threat from such information attacks.  

At the Black Hat Asia 2020 conference this week, three researchers from Baidu Security, the cybersecurity division of the Chinese technology giant Baidu, presented their approach to identifying authors based on machine learning techniques, such as neural networks. The researchers used 130,000 articles by more than 3,600 authors scraped from eight websites to train a neural network that could identify an author from a group of five possible writers 93% of the time and identify an author from a group of 2,000 possible writers 27% of the time.

While the results are not impressive, they do show that identifying the person behind a piece of writing is possible, said Li Yiping, a researcher at Baidu Security, during his presentation on his team's work.

"Most fake news is posted anonymously and lacks valid information to identify the author," he said. "Tracking anonymous articles is a challenging problem, but fortunately it is not impossible. Different people have different writing styles, so we are able to identify some writers by their distinct habits." 

Fake news and other forms of disinformation have become an online plague over the past decade. Driven by commercial success, cybercriminals have used fake news to attract page views against which advertising is sold. More insidious, however, are political disinformation campaigns by foreign nations and domestic groups with agendas that can impact public opinion using untrue information. 

In late September, the FBI and the US Department of Homeland Security issued a warning that both foreign actors and cybercriminals will likely use disinformation in various campaigns this election season.

"Foreign actors and cybercriminals could create new websites, change existing websites, and create or share corresponding social media content to spread false information in an attempt to discredit the electoral process and undermine confidence in U.S. democratic institutions," the agencies stated.

A variety of research efforts are underway, aiming to unmask disinformation campaigns. In May, for example, a group of researchers at NortonLifeLock launched BotSight, a plug-in that rates social media accounts on a bot-versus-human scale. The tool uses the known connections between social media accounts to calculate a probability that a specific account is managed by an automated bot or an actual human.

At the Black Hat USA conference, a research manager at the Stanford Internet Observatory argued that Russia tends to focus more on disinformation campaigns involving fake memes and articles, while Chinese efforts focus more on creating legitimate-seeming news sources that espouse a government-approved focus.

Baidu Security's research effort focused on either matching an article to a known author in a list of sources, called the author attribution problem, or determining the likelihood that an article was written by a specific author, known as the author verification problem. The researchers trained a neural network using a series of triplets of article data: an anchor article written by an author, a second article by the same author (the positive example), and an article that was not written by the author (the negative example).
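To make the triplet idea concrete, here is a minimal sketch of a standard triplet margin loss, a common way to train on anchor, positive, and negative examples. The article does not say which loss function the Baidu team used, so the loss form, the Euclidean distance, the `margin` value, and the two-dimensional "embeddings" are all assumptions for illustration only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (assumed here, not confirmed by the talk).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical article embeddings, chosen so the distances are easy to check.
anchor = [0.0, 0.0]      # article by author X
same_author = [0.0, 1.0] # another article by author X (distance 1 from anchor)
other_author = [3.0, 4.0] # article by someone else (distance 5 from anchor)

well_separated = triplet_loss(anchor, same_author, other_author)
badly_separated = triplet_loss(anchor, other_author, same_author)
print(well_separated, badly_separated)
```

In a real system the vectors would come from a neural network applied to article text, and minimizing this kind of loss would pull articles by the same author together while pushing other authors' articles away.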

By dynamically selecting only a small share of the possible triplets, the research team built a training data set for a neural network that identifies the author of an article. In an experiment using seven data sets of increasing complexity, the researchers found their method worked well, with 93% accuracy, in attributing any of 600 articles written by five different authors, but was only 27% successful in attributing more than 70,000 documents written by any of 2,000 different authors.
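The article does not describe how the team chose which triplets to keep. As one illustrative stand-in, the sketch below uses hardest-negative mining, a widely used selection heuristic: for each pair of articles by the same author, it picks the closest article by a different author and keeps the triplet only if it still violates a margin. The function name `select_triplets`, the `margin` value, and the toy embeddings are hypothetical.

```python
import math
from itertools import permutations

def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_triplets(embeddings, authors, margin=1.0):
    """Return (anchor, positive, negative) index triplets worth training on.

    Assumed heuristic, not the Baidu method: for every same-author pair,
    take the nearest different-author article as the negative and keep the
    triplet only if the margin is still violated.
    """
    selected = []
    n = len(embeddings)
    for a, p in permutations(range(n), 2):
        if authors[a] != authors[p]:
            continue  # anchor and positive must share an author
        negatives = [i for i in range(n) if authors[i] != authors[a]]
        if not negatives:
            continue
        hardest = min(negatives, key=lambda i: distance(embeddings[a], embeddings[i]))
        d_pos = distance(embeddings[a], embeddings[p])
        d_neg = distance(embeddings[a], embeddings[hardest])
        if d_pos - d_neg + margin > 0:
            selected.append((a, p, hardest))
    return selected

# Toy data: author B writes much like author A; author C is far from both.
embeddings = [[0.0, 0.0], [0.0, 1.0], [0.5, 0.0], [10.0, 10.0], [10.0, 10.5]]
authors = ["A", "A", "B", "C", "C"]

selected = select_triplets(embeddings, authors)
print(selected)
```

On this toy data only the triplets that pit author A's articles against the similar-sounding author B survive; author C's articles are already well separated and contribute nothing, which is the general motivation for training on a small, informative share of triplets rather than all of them.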

Researcher Li noted that, even at such a low accuracy with a high number of documents, the Baidu team's approach had better accuracy than other methods.

"Our method outperformed other baselines, especially when the data sets get large," he said. "In the future, we will continue to test our model and optimize our deep learning network and triplet selection strategy."

Veteran technology journalist of more than 20 years. Former research engineer. Written for more than two dozen publications, including CNET News.com, Dark Reading, MIT's Technology Review, Popular Science, and Wired News. Five awards for journalism, including Best Deadline ...
 
