
Protect AI Releases 3 AI/ML Security Tools as Open Source

NB Defense, ModelScan, and Rebuff, which detect vulnerabilities in machine learning systems, are available on GitHub.

Dark Reading Staff, Dark Reading

October 11, 2023


Protect AI, maker of Huntr, a bug-bounty program for open source software (OSS), is venturing further into the OSS world by licensing three of its artificial intelligence/machine learning (AI/ML) security tools under the permissive Apache 2.0 terms.

The company developed the first tool, NB Defense, to protect ML projects built in Jupyter Notebooks, an application favored by data scientists. As Jupyter's adoption grew, attackers began targeting it because of the power inherent in a tool designed to execute code and packages. In response, Protect AI developed NB Defense, a pair of tools that scan notebooks for vulnerabilities such as exposed secrets, personally identifiable information (PII), CVE exposures, and code subject to restrictive third-party licenses. The JupyterLab extension finds and fixes security issues within a single notebook, while the CLI tool can scan many notebooks at once and automatically scan those being uploaded to a central repository.
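To illustrate the idea behind this kind of notebook scanning, here is a minimal sketch, not NB Defense's actual implementation or rule set: it parses the `.ipynb` JSON format and applies a couple of hypothetical regex rules for secrets and PII. Real scanners use far richer, continuously updated detection rules.

```python
import json
import re

# Hypothetical patterns illustrating the kinds of findings a notebook
# scanner looks for; real tools ship much larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "email_pii": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
}

def scan_notebook(nb_json: str) -> list[tuple[int, str]]:
    """Return (cell_index, finding_type) pairs for suspicious code cells.

    Jupyter notebooks are JSON documents whose "cells" array holds the
    code; scanning the raw source avoids ever executing the notebook.
    """
    nb = json.loads(nb_json)
    findings = []
    for i, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = "".join(cell.get("source", []))
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(source):
                findings.append((i, name))
    return findings
```

Because the scan is purely static, it is safe to run in CI against every notebook pushed to a repository, which mirrors the CLI workflow described above.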

The second tool, ModelScan, grew out of teams' need to share ML models over the Internet as more companies develop AI/ML tools for internal use. ModelScan inspects PyTorch, TensorFlow, Keras, and other model formats for model serialization attacks, such as credential theft, data poisoning, model poisoning, and privilege escalation (in which the model is weaponized to attack other company assets).

The third tool, Rebuff, was an existing open source project that Protect AI acquired in July and has continued developing. Rebuff addresses prompt injection (PI) attacks, in which an attacker sends malicious inputs to large language models (LLMs) in order to manipulate outputs, expose sensitive data, or trigger unauthorized actions. The self-hardening prompt injection detection framework employs four layers of defense: heuristics, which filter out potentially malicious input before it reaches the model; a dedicated LLM, which analyzes incoming prompts for signs of attack; a database of known attacks, which helps it recognize and fend off similar attempts later on; and canary tokens, which modify prompts to detect leaks.
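The canary-token layer is simple to illustrate. This sketch shows the general mechanism under my own naming, not Rebuff's API: a random token is embedded in the prompt sent to the model, and if that token ever appears in the model's output, the prompt has leaked, a strong signal that an injection attack succeeded.

```python
import secrets

def add_canary(prompt: str) -> tuple[str, str]:
    """Embed a random canary token in a prompt.

    Returns (guarded_prompt, canary). The canary is unguessable, so its
    presence in any output proves the prompt text itself was exposed.
    """
    canary = secrets.token_hex(8)
    guarded = f"{prompt}\n# canary: {canary}"
    return guarded, canary

def leaked(model_output: str, canary: str) -> bool:
    """True if the model echoed the canary, i.e. the prompt leaked."""
    return canary in model_output
```

On a detected leak, a framework like Rebuff can both block the response and record the offending input in its attack database, which is what makes the defense "self-hardening."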

The spread of AI and LLMs across organizations of various sizes has led to a similar rise in tools to secure — or attack — such models. For example, HiddenLayer, this year's winner of the RSA Conference Innovation Sandbox, made securing those models against tampering its mission. In 2021, Microsoft released a security framework and its own open source tools for protecting AI systems against adversarial attacks. And the flaws just announced in TorchServe underscore the real-world stakes for even the biggest players, such as Walmart and the three major cloud service providers.

All three of Protect AI's tools — NB Defense, ModelScan, and Rebuff — are available on GitHub.

