Some of Twitter's proprietary source code had been publicly available on GitHub for nearly three months, according to information gleaned from a DMCA takedown request filed on March 24.
GitHub is the world's largest code hosting platform. Owned by Microsoft, it serves more than 100 million developers and contains nearly 400 million repositories in all.
On March 24, GitHub honored a Twitter employee's request to remove "proprietary source code for Twitter's platform and internal tools." The code had been published in a repository called "PublicSpace," by an individual with the username "FreeSpeechEnthusiast." The name is an apparent reference to Elon Musk's casus belli for taking over Twitter back in October (a philosophy which has been unevenly implemented in the months since).
The leaked code was contained in four folders. Though the folders have been inaccessible since March 24, some of their names, such as "auth" and "aws-dal-reg-svc," hint at what they contained.
According to Ars Technica, FreeSpeechEnthusiast joined GitHub on Jan. 3 and committed all the leaked code that same day. That means the code was accessible to the public for nearly three months.
How Enterprise Source Code Leaks Happen
Major software companies are built on millions of lines of code, and every so often, for one reason or another, some of it leaks.
"Bad actors, of course, play a major role," says Dwayne McDaniel, developer advocate for GitGuardian. "We saw it last year in cases like Samsung and Uber involving the Lapsus$ group."
Hackers aren't always a part of the story, though. In Twitter's case, circumstantial evidence points to a dissatisfied employee. And "a good deal of it also comes from code ending up where it does not belong unintentionally, as we saw with Toyota, where a subcontractor made a copy of a private codebase public," he adds. "The complexity of working with git and CI/CD combined with an ever-growing number of repos to deal with for modern applications means code on private repos can become public by mistake."
The Problem of Source Code Leaks for Enterprises
For Twitter and companies like it, source code leaks can be a much bigger problem for cybersecurity than copyright infringement. Once a private repository becomes public, all kinds of harm can follow.
"It's important to remember that source repositories often contain more than just the code," notes Tim Mackey, principal security strategist for the Synopsys Cybersecurity Research Center. "You'll find test cases, potentially sample data along with details on how the software should be configured."
There may also be sensitive personal information and authentication information hidden in the code. For example, "for some applications that are never intended to be shipped to customers, the default configuration contained in the source code repository might just be the running configuration," Mackey says. Hackers can use stolen authentication and configuration data to carry out further, more damaging attacks against the victim of a leak.
That's why "companies should adopt a more secure secrets management strategy, combining secrets storage with secrets detection," says GitGuardian's McDaniel. "Organizations should also audit their current secret[s] leakage situation to know what systems are at risk if a code leak does occur and where to focus prioritization."
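To make "secrets detection" concrete, here is a minimal Python sketch of the kind of check such tooling performs. It is an illustration only, not GitGuardian's method or any vendor's implementation: the rule names, the patterns, and the scan_text helper are all hypothetical, and real scanners use far larger rule sets plus entropy analysis and credential validation.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    line_number: int
    rule: str


# Hypothetical, deliberately small rule set chosen for illustration.
RULES = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-style private key header.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A credential-like name assigned a quoted literal of 8+ characters.
    "hardcoded_credential": re.compile(
        r"(?:password|secret|token|api_key)\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text: str) -> list[Finding]:
    """Return one Finding per (line, rule) pair that matches."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(number, rule))
    return findings
```

Run over every file in a repository, ideally before code is pushed, a check like this gives a rough inventory of which credentials would be exposed if the repository ever became public. Pattern matching of this sort produces both false positives and misses, so it complements secrets storage rather than replacing it.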
But in cases where the leak comes from the inside, as Twitter's apparently did, even greater caution is warranted. Guarding against insiders requires thorough threat modeling and analysis of an enterprise's source code management, says Mackey.
"This is important because if someone can trigger a source code leak, then they may also have the ability to change the source code," he says. "If you're not using multifactor authentication for access, enforcing limited access to only approved users, enforcing access rights, and access monitoring, then you may not have a full picture of how someone might exploit the assumptions your development teams have made when they secured their source code repository."