We Have the Tech to Scale Up Open Source Vulnerability Fixes — Now It's Time to Leverage It
Q&A with Jonathan Leitschuh, inaugural HUMAN Dan Kaminsky Fellow, in advance of his upcoming Black Hat USA presentation.
August 8, 2022
With enterprise software more dependent on open source components than ever before, a big element of modern application security is reliant on how well the open source community can shore up vulnerable code. The Black Hat USA presentation "Scaling the Security Researcher to Eliminate OSS Vulnerabilities Once and For All" will tackle some important tools and strategies that can amp up the progress on this front.
Jonathan Leitschuh, the inaugural Dan Kaminsky Fellow at HUMAN Security, plans to examine how the security research community can use automated bulk pull request generation to collaborate with open source maintainers in a way that can make it easier to address drastically larger numbers of high-risk flaws.
Leitschuh used his fellowship year to work on refining the tools and methods for scaling open source security vulnerability remediation. Dark Reading caught up with Leitschuh to discuss his upcoming talk and dive deeper into his work. (The interview has been lightly edited for conciseness and clarity.)
Dark Reading: What do you think the No. 1 takeaway will be for the audience at your Black Hat talk?
Leitschuh: My presentation will examine open source vulnerabilities and how several are more widespread than you’d think. For example, one project owned by Perforce, zeroturnaround/zt-zip, received a pull request for all three of the security vulnerabilities I set out to fix across open source.
Importantly, the highest impact finding I intend to share is how it is possible to fix widespread and common security vulnerabilities at scale. We have the technology. All we need to do is leverage it.
My goal is to demonstrate that fixing these vulnerabilities is not an intractable problem. We can solve it with math, science, technology, and security. You have to be accessible, flexible, and open-minded if you want your proposed fixes to be accepted.
It’s not that people are asleep at the wheel when it comes to vulnerabilities. They often just aren’t aware of certain problems, and to compound things, the Web is insecure by default.
Dark Reading: When we first talked at the beginning of the year, you were just getting your feet wet in the fellowship and planning out your year. Can you offer me an update on the progress that you made in utilizing CodeQL, OpenRewrite, and other tools to scale up open source vulnerability mitigation?
Leitschuh: I ended up having to reimplement features, including Data Flow and Control Flow analysis, that were missing or partially implemented in OpenRewrite in order to support the vulnerabilities I intended to fix. The Control Flow feature was particularly tricky but was possible thanks to an excellent collaboration with my intern, Shyam Mehta. He had taken a few college classes that I had not, including one on compilers that proved especially helpful in our work.
We generated over 400 pull requests to fix new instances of old vulnerabilities from my previous research and generated over 170 pull requests to fix new vulnerabilities that wouldn’t have been possible without OpenRewrite.
Dark Reading: Did you run into any surprises?
Leitschuh: The feedback from maintainers has been, in general, positive, but it’s always interesting to see how careful you need to be around OSS maintainers. It is very important to make sure the automated fix you are making looks like the surrounding code. One of the biggest pushbacks from maintainers I’ve received is about not including unit tests with the pull requests. Unfortunately, their codebase is most often far too complex to automatically generate a unit test in addition to the fix.
Dark Reading: What was the most impactful finding/technique/application of tools you discovered over the course of the year?
Leitschuh: To begin with, I had to start from scratch and conduct a data analysis. Using CodeQL was key, as I needed to translate my knowledge not only across different programming languages, but also from a query language to a procedural one.
Shyam was instrumental in making sure I was correctly building this new feature, which we needed for the final security vulnerability: Zip Slip, which is unzipping a zip file in such a way that one can arbitrarily overwrite file contents, potentially allowing for remote code execution.
As a note, Snyk had already done some work on this, but some of the mitigations their researcher worked with maintainers to craft were found to not have been 100% correct.
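To make the Zip Slip description concrete, here is a minimal sketch of the kind of check a fix has to get right. It is written in Python for brevity and is not the code generated in the pull requests discussed here. The function names and the `/srv/unpack` destination are hypothetical, and POSIX-style paths are assumed. The second function shows one well-known way such a check can look correct and still be wrong: comparing raw string prefixes instead of whole path components.

```python
import os


def is_safe_entry(dest_dir: str, entry_name: str) -> bool:
    """Return True if extracting entry_name would stay inside dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    # Resolve "..", absolute names, and symlinks before comparing.
    target = os.path.realpath(os.path.join(dest_root, entry_name))
    # Compare whole path components rather than raw string prefixes.
    return os.path.commonpath([dest_root, target]) == dest_root


def naive_prefix_check(dest_dir: str, entry_name: str) -> bool:
    """Flawed variant, shown for contrast: a plain string-prefix test."""
    dest_root = os.path.realpath(dest_dir)
    target = os.path.realpath(os.path.join(dest_root, entry_name))
    # "/srv/unpack-evil/x" starts with "/srv/unpack", so this accepts it.
    return target.startswith(dest_root)
```

An extractor would call `is_safe_entry` on every archive entry name and refuse to write any entry for which it returns `False`. An entry such as `../unpack-evil/x.txt` is rejected by the first function but slips past the prefix-only variant.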
Dark Reading: Do you have a good example of your techniques in action?
Leitschuh: As I became aware of more existing research, I knew there were more cases of security vulnerabilities waiting to be found.
From a CodeQL query, the GitHub Security Lab team provided me with a list of 900 repositories potentially vulnerable to Zip Slip. Of the 900, we have made 86 Zip Slip fix pull requests to date, which means 86 critical security vulnerabilities now have possible fixes.
Dark Reading: Are there any loose ends that you hope the community can chip in and work on?
Leitschuh: We still need the community’s help, especially in closing the gap between the 900 repositories and the 86 fixes.
The list of projects included archived and other unmaintained projects, which means the vulnerabilities may not be fixed any time soon, but at least they are now visible, along with potential fixes.
There’s a wealth of open source vulnerabilities that are just waiting to be fixed with more advanced techniques.
Dark Reading: In all practicality, what do you think it will take for practitioners to put your learnings into action?
Leitschuh: Ideally the first step would be to learn CodeQL so that they can express searches for vulnerabilities. Then learning to use something like OpenRewrite in order to generate fixes. Then the best practices around bulk PR submission.
It would also be beneficial for practitioners to have at least a basic understanding of the language in use in a particular project, as this allows them to better understand the vulnerabilities and risks involved.
Curiosity is also an essential component so they can explore and dig into the vulnerability. For example, a practitioner could run a cursory search with GitHub Code Search, compile examples, and then write several unit tests to assess how to address the vulnerability head on.
Dark Reading: How have your views changed over the course of the year on how you think we should work toward solving the biggest problems of software supply chain security today?
Leitschuh: Software supply chain security is a little different, as my work is about the actual vulnerabilities in the project. There’s a lot of value in deep dives for security vulnerabilities, but it’s become very clear to me that there are a lot of security vulnerabilities on the surface.
We know these vulnerabilities are out there. You put the scanners in the hands of a maintainer, they see a lot of noise. They have to filter out what’s good and what’s bad. With a pull request, even if you don’t fix it, it still, hopefully, hardens your software.
I knew this before I came in, but it became very clear to me how picky maintainers can be. It reaffirmed that you have to get the code and formatting right, and that you can’t skimp on messaging or on responding appropriately to maintainers’ reactions.
People get their ego wrapped up in their software, which I admittedly can be guilty of. You’re challenging them in a certain way – even acting as a threat. As an example, you didn’t just write this wrong, you’ve written it in a way that is a true security vulnerability. It’s bigger than a bug. It’s important to have a healthy respect for the human aspect of open source software.
Dark Reading: Why do you think it is important for security practitioners and researchers to bring more respect to the table for maintainers of OSS projects? How can cybersecurity improve as a result?
Leitschuh: I certainly understand the plight of an open source maintainer. It’s tough and demanding, and most are volunteering their time. They are an important line of defense against malicious actors and have to handle broad swathes of open source projects.
You need to be cognizant that the maintainers and owners of the projects are doing this in their spare time, so it is important to actively collaborate with them, as opposed to expecting them to always accept your suggested changes from the start.
We do our best work in cybersecurity when we work collaboratively. Maintainers are part of the process and should be identified and respected as such.
Dark Reading: What's next for you and your research as you come out of the fellowship?
Leitschuh: I’m not entirely sure yet. I’d love to see this research taken to the next level. There are certain vulnerabilities like SQL injection that can be deterministically discovered with data flow and taint tracking (because CodeQL already does it).
The complicated bit is turning a detection into a fix. Software developers write code in a lot of different ways, and trying to write fixes for all of those ways is a difficult task.
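As a rough illustration of what taint tracking means, the toy sketch below propagates "tainted" status through a list of straight-line assignments and reports whether user input reaches a query string. It is not how CodeQL works internally. The statement format and all names are hypothetical, and a real analysis also has to handle branches, method calls, and sanitizers.

```python
from typing import Iterable

# Each statement is (target, sources): target is assigned from the named sources.
Statement = tuple[str, tuple[str, ...]]


def tainted_names(statements: Iterable[Statement], taint_sources: set[str]) -> set[str]:
    """Propagate taint forward through straight-line assignments."""
    tainted = set(taint_sources)
    for target, sources in statements:
        if any(name in tainted for name in sources):
            tainted.add(target)
        else:
            # Reassigning from clean values clears any earlier taint.
            tainted.discard(target)
    return tainted


def reaches_sink(statements: Iterable[Statement], taint_sources: set[str], sink_argument: str) -> bool:
    """Report whether the value passed to a sink is derived from a taint source."""
    return sink_argument in tainted_names(statements, taint_sources)


# Hypothetical program: user input is concatenated into a SQL string.
program = [
    ("name", ("request_param",)),
    ("query", ("sql_prefix", "name")),
    ("limit", ()),
]
```

Here `reaches_sink(program, {"request_param"}, "query")` is `True`, because `query` is built from `name`, which comes from the request. Detection of this kind is the deterministic part; producing a fix that matches how each project builds its queries is the harder problem described above.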
Leitschuh is co-presenting with Patrick Way of Moderne (one of the open source maintainers for OpenRewrite) and Shyam Mehta, a software engineer studying at the University of Pennsylvania. Way taught Leitschuh "OpenRewrite from the ground up," and Mehta "has been instrumental over the course of the fellowship," Leitschuh says.
"Whenever I need to solve a problem and can move a little slower, or I need to think something through more as I’m building it, I pair-program with [Mehta], where he’s writing the code and I’m providing instructions. He has also helped by conducting a control flow analysis, creating the UI for two large Control Flow example graphics used in the talk, and debugging," Leitschuh says.