When typosquatting is mentioned, most people think of domain typosquatting, a form of cybersquatting, which the Anticybersquatting Consumer Protection Act (ACPA) of 1999 defines as registering, trafficking in, or using an Internet domain name with bad-faith intent to profit from the goodwill of a trademark belonging to someone else. Domain (or URL) cybersquatting was commonplace before the passage of the ACPA, as individuals looked to profit by registering domains associated with well-known companies and registered trademarks. Since the passage of the ACPA and the creation of other regulations to resolve disputes over domain name control, clear policies and processes have been in place to address this type of typosquatting.
This article focuses on a different type of typosquatting, called package typosquatting, where there is less oversight and more opportunity for bad actors to cause harm. Here's how it works. Modern software development relies on package managers that support code reuse, including code from public registries where developers upload built software packages for others to download and use. Package typosquatting is a type of software supply chain attack in which the attacker mimics the name of an existing package on a public registry in hopes that users or developers will accidentally download the malicious package instead of the legitimate one.
Because there is no central body for managing or validating software packages, it's easy for attackers to upload a malicious package whose name is very similar to the real one, and there are no real repercussions if they are caught. For example, a developer may try to install an image editor whose package is named "moving_images," while an attacker has uploaded a package named "moving-images." In that instance, an underscore is replaced with a dash. Attackers can also try slight misspellings or swap the order of the words in a name (e.g., nmap-python instead of python-nmap) in hopes of confusing the developer into picking the malicious package.
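The two tricks described above, swapped separators and reordered words, can be caught by comparing names in a canonical form. The following is a minimal sketch in Python, not any registry's actual rule set; the helper names `canonical_key` and `is_confusable` are hypothetical and introduced only for illustration. Some registries may already treat certain separators as equivalent, and this sketch makes no assumption about that.

```python
import re


def canonical_key(name: str) -> str:
    """Collapse separator and word-order variants of a name into one key.

    Hypothetical helper for illustration only.
    """
    # Lowercase the name, then split on runs of '-', '_' or '.'.
    tokens = [t for t in re.split(r"[-_.]+", name.lower()) if t]
    # Sort the tokens so that "nmap-python" and "python-nmap" compare equal.
    return "-".join(sorted(tokens))


def is_confusable(candidate: str, known: str) -> bool:
    """Return True when two different names share the same canonical key."""
    return candidate != known and canonical_key(candidate) == canonical_key(known)


if __name__ == "__main__":
    print(is_confusable("moving-images", "moving_images"))
    print(is_confusable("nmap-python", "python-nmap"))
    print(is_confusable("requests", "flask"))
```

A check like this covers only separator and word-order variants; misspellings need a distance-based comparison instead.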
While package typosquatting is a relatively obscure issue compared with other attack techniques, it's growing at an alarming rate. In 2018, research indicated that more than 100 malicious packages had accumulated more than 600 million downloads. In April 2020, more than 700 malicious typosquatting libraries were found in the RubyGems repository alone. One of the better-known package typosquatting events came to light in December 2019, when two Trojanized Python libraries on PyPI (Python Package Index) were reported to be mimicking more popular libraries; if used, their malicious code would steal SSH and GPG keys from the projects of infected developers.
A simple countermeasure is for developers to do due diligence before they add a package: double-check the package name carefully, look for similar names, and make sure that the package's "date added" and "number of downloads" are what they'd expect. This can counter many attacks, but developers make mistakes, and sometimes malicious packages are already in use, so more needs to be done. On the research side, one technique being developed to minimize the threat is the use of string-similarity algorithms that measure how close two names are to each other, in hopes of capturing and flagging misspellings (a common package typosquatting technique), while other researchers are looking at the relative commit activity and popularity of packages with similar names. While some of this work is already being done in the wild (for example, by the PyPI community for Python), not all package registries have prioritized the prevention of typosquatting attacks.
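To illustrate the string-comparison idea, the sketch below uses Levenshtein edit distance, one common way to measure how close two names are. It is an assumption that this particular metric is a suitable stand-in; the research mentioned above may use other measures. The function names, the sample list of popular packages, and the threshold of 2 are all hypothetical choices for illustration.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, using a rolling row."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def flag_lookalikes(candidate: str, popular: Iterable[str],
                    max_distance: int = 2) -> List[str]:
    """List popular names that are close to, but not equal to, candidate.

    max_distance is an illustrative threshold, not a recommended value.
    """
    return [
        name for name in popular
        if name != candidate and edit_distance(candidate, name) <= max_distance
    ]


if __name__ == "__main__":
    print(flag_lookalikes("reqeusts", ["requests", "flask", "numpy"]))
```

A low threshold keeps false positives down but misses longer substitutions; in practice such a check would likely be combined with popularity or activity signals like those mentioned above.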
One way to help mitigate package typosquatting attacks is to use your own internal registry that references only packages that have been vetted and determined to be what you expect. With products such as Sonatype Nexus, JFrog Artifactory, and Google Artifact Registry, it's not too difficult to build such a registry internally. That way, if you make a typo, you don't have to worry about what someone may have uploaded to take advantage of your mistake. However, even private registries aren't guaranteed to stop every attack, as shown by Alex Birsan's recent post on dependency hijacking.
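The core idea of an internal registry, resolving only names that have already been vetted, can be approximated with an allowlist check even before a registry is in place. The sketch below is a simplified illustration and not a feature of any of the products named above; the function names and the allowlist contents are hypothetical, and the parser handles only simple `name==version` style lines.

```python
import re
from typing import Iterable, List, Optional, Set


def requirement_name(line: str) -> Optional[str]:
    """Extract the package name from a simple requirements-style line."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line:
        return None
    # The name is the leading run of name characters; option lines that
    # start with '-' (for example '-r other.txt') are ignored here.
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0).lower() if match else None


def unapproved(requirements: Iterable[str], allowlist: Set[str]) -> List[str]:
    """Return the requested names that are not on the vetted allowlist."""
    names = (requirement_name(line) for line in requirements)
    return [n for n in names if n is not None and n not in allowlist]


if __name__ == "__main__":
    vetted = {"python-nmap"}  # hypothetical allowlist
    requested = ["python-nmap==0.7.1", "nmap-python", "# a comment"]
    print(unapproved(requested, vetted))
```

A mistyped name then fails the check instead of silently resolving to whatever happens to exist on a public registry.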
Another potential solution to fight package typosquatting would be for additional package registries and managers to support namespacing, in which package names are scoped under a publisher or organization to avoid collisions with other names in the global namespace. Trusted publishers and identity verification mechanisms would also help mitigate attacks by linking source code repositories to package registries and making it much more difficult for attackers to act, but there are numerous issues with implementing this type of system, and they are not limited to package management. It might also help to have more law enforcement involvement (as with the ACPA) when typosquatting is detected; typosquatting is often a trademark violation (even if the trademark is unregistered), and creating a package to intentionally access a computer without authorization is against the law in many countries.