With Executive Order 14028, a large regulatory push toward mandating the production of a software bill of materials (SBOM) began. As this new buzzword spreads, you'd think it was a miracle cure for securing the software supply chain. Conceptually, it makes sense — knowing what is in a product is a reasonable expectation. However, it is important to understand what exactly an SBOM is and whether or not it can objectively be useful as a security tool.
SBOMs are meant to be something like a nutrition label on the back of a grocery store item listing all of the ingredients that went into making the product. While there currently is no official SBOM standard, a few guideline formats have emerged as top candidates. By far, the most popular is the Software Package Data Exchange (SPDX), sponsored by the Linux Foundation.
SPDX, as with most other formats, attempts to provide a common way to represent basic information about the ingredients that go into the production of software: names, versions, hashes, ecosystems, ancillary data like known flaws and license information, and relevant external assets. However, software is not as simple as a box of cereal, and there is no equivalent to the Food and Drug Administration enforcing compliance with any recommended guidelines.
Everything Is Optional, Therefore Nothing Is Required
One of the biggest and most obvious gaps in SPDX and other formats is that, like the open source ecosystem at large, they are built on general guidelines, not verified standards. Open source communities are wide and sprawling, with contributors from across the globe. Often, package managers and hosting environments are designed by separate committees, each operating in isolation. While SBOM mandates represent an early attempt to unify these many silos of information, they suffer from some of the fundamental limitations of applying a standard retroactively.
First and foremost, the cracks begin to show when we consider what SPDX actually provides. As it turns out, nearly every field in the standard is optional. While this optionality of fields is clearly an artifact of how different the many software ecosystems the format attempts to cover really are, an SBOM loses its value as a real source of truth if there is no requirement for cryptographic hashes or other information that would allow us to positively associate the library with the information presented in the SBOM document.
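To make the cost of optional fields concrete, here is a minimal sketch that audits a document for package entries lacking identifying data. It assumes the SPDX 2.x JSON serialization, where packages sit under a top-level `packages` key and use field names such as `versionInfo`, `checksums`, and `supplier`. The `WANTED_FIELDS` list, the helper names, and the sample document are all hypothetical choices made for illustration.

```python
import json

# Fields we would like every package entry to carry. The names follow the
# SPDX 2.x JSON serialization; which ones to insist on is our assumption.
WANTED_FIELDS = ("versionInfo", "checksums", "supplier")


def missing_fields(package, wanted=WANTED_FIELDS):
    """Return the wanted fields that are absent or empty in one package entry."""
    return [field for field in wanted if not package.get(field)]


def audit_sbom(document):
    """Map each package name to the wanted fields it lacks (gaps only)."""
    report = {}
    for package in document.get("packages", []):
        gaps = missing_fields(package)
        if gaps:
            report[package.get("name", "<unnamed>")] = gaps
    return report


# A made-up two-package document, trimmed to the fields used here.
SAMPLE = json.loads("""
{
  "spdxVersion": "SPDX-2.3",
  "packages": [
    {
      "name": "libexample",
      "SPDXID": "SPDXRef-libexample",
      "downloadLocation": "NOASSERTION"
    },
    {
      "name": "libcomplete",
      "SPDXID": "SPDXRef-libcomplete",
      "downloadLocation": "NOASSERTION",
      "versionInfo": "1.4.2",
      "supplier": "Organization: Example Org",
      "checksums": [
        {"algorithm": "SHA256", "checksumValue": "0000"}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for name, gaps in audit_sbom(SAMPLE).items():
        print(f"{name}: missing {', '.join(gaps)}")
```

In this sample the first entry carries a name and an identifier but no version, checksum, or supplier, so nothing in it can be tied to a specific artifact.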
SBOMs Miss the Mark on Provenance
In addition to being incomplete, the current formats cannot properly represent many of the inputs to software, especially those that come from places like version control system repositories. While some emerging frameworks like Supply-chain Levels for Software Artifacts (SLSA) attempt to apply notions of provenance, no format hits the mark. We must incorporate accurate provenance data into the SBOM format in order to ensure that we get a reasonable view of what is inside the software we are consuming. This should include the following, at minimum:
- Author identity information: Both maintainers and contributors are a critical part of the software supply chain. They hold the proverbial keys to the kingdom, and choose how the software we consume downstream is released. Even if complete data cannot be captured, we can absolutely do better than what is provided today, which is nothing at all.
- Development tools: This includes build tools, CI/CD infrastructure, and tools used during the development of upstream software. What good is securing your internal development infrastructure if a compromise in any library you use could still cause an internal compromise?
- Controls and protections: If one of the goals is third-party risk management, we must ask: What does the security posture of the development team look like? Is branch protection being used? What sort of review processes are applied?
- Artifact attestation: We absolutely must be able to tie an entry within an SBOM document back to an actual artifact. This should include both actual build artifacts, such as compiled packages, and continuously developed artifacts, such as tagged branches in a version control system.
Realistically, we must build an understanding of each of these elements to comprehend what goes into the software we are utilizing before even considering how an SBOM may be operationalized.
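The artifact attestation item above can be sketched in its simplest form: recompute a digest over the delivered artifact and compare it with the checksum the SBOM entry declares. This is a rough illustration, not a full attestation scheme. It assumes SPDX 2.x JSON field names (`checksums`, `algorithm`, `checksumValue`), and the helper names and sample data are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_entry(artifact: bytes, package: dict) -> bool:
    """True only if the entry declares a SHA256 checksum equal to the artifact's.

    An entry with no SHA256 checksum cannot be verified, so it counts as a
    mismatch rather than a pass.
    """
    for checksum in package.get("checksums", []):
        if checksum.get("algorithm") == "SHA256":
            declared = checksum.get("checksumValue", "").lower()
            return declared == sha256_hex(artifact)
    return False


# Stand-in bytes for a compiled package, and a made-up SBOM entry for it.
ARTIFACT = b"example package contents"
ENTRY = {
    "name": "libexample",
    "versionInfo": "1.0.0",
    "checksums": [{"algorithm": "SHA256", "checksumValue": sha256_hex(ARTIFACT)}],
}

if __name__ == "__main__":
    print("verified" if matches_entry(ARTIFACT, ENTRY) else "unverifiable")
```

Treating a missing checksum as a failure is a deliberate choice here: an entry that cannot be checked should not be reported as trustworthy. A digest match also says nothing about who built the artifact or how, which is the gap the other items in the list are meant to address.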
We Can't Rely On a Simple Snapshot in Time
Finally, the biggest challenge with broad SBOM adoption is that a static document captures the truth only at the moment it is generated. In order to be meaningful, SBOMs must have the ability to be delivered and accessed continuously.
Suppose, for instance, that you receive a vendor SBOM from your cloud storage provider. Not only would this SBOM be a document listing tens of thousands of individual pieces of software, but it would also contain multiple versions of many of those packages. Now, consider that by the time you receive that information, it is likely already out of date. This is because large enterprises often ship hundreds or thousands of builds per day. By the time you have processed, consumed, and reasoned about an SBOM for any large vendor, the data is already stale.
Versions will have changed, new packages will have been added, and some packages may have been removed. Modern development best practices mean that build and delivery are meant to be continuous, which leads to faster iteration and faster release cycles. It also means that the delivery of a single SBOM is all but meaningless. What this really implies is that there is a strong requirement for scalable automation around SBOMs in order for them to provide true value.
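As a small illustration of what such automation might start from, the sketch below compares two SBOM snapshots and reports which packages were added, removed, or changed version. It assumes SPDX 2.x JSON field names (`packages`, `name`, `versionInfo`). The function names and sample snapshots are hypothetical, and keying on package name alone is a simplification that would collapse documents listing several versions of the same package.

```python
def package_versions(document):
    """Map package name to declared version ('' when the field is absent)."""
    return {
        package["name"]: package.get("versionInfo", "")
        for package in document.get("packages", [])
    }


def diff_sboms(old, new):
    """Summarize package-level drift between two SBOM snapshots."""
    before, after = package_versions(old), package_versions(new)
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            name
            for name in before.keys() & after.keys()
            if before[name] != after[name]
        ),
    }


# Two made-up snapshots of the same product, taken at different times.
OLD = {"packages": [
    {"name": "liba", "versionInfo": "1.0"},
    {"name": "libb", "versionInfo": "2.0"},
    {"name": "libc", "versionInfo": "3.0"},
]}
NEW = {"packages": [
    {"name": "liba", "versionInfo": "1.0"},
    {"name": "libb", "versionInfo": "2.1"},
    {"name": "libd", "versionInfo": "0.1"},
]}

if __name__ == "__main__":
    for kind, names in diff_sboms(OLD, NEW).items():
        print(f"{kind}: {names}")
```

Run against every build rather than a single delivered document, a comparison like this is one way to keep the view of a vendor's software from going stale.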
The intent behind SBOMs is to pave the way to improved visibility, both internally and externally. However, don't be fooled into thinking they will secure your software supply chain. We need more than an incomplete snapshot in time to have real impact.