Supply-chain case study · retrospective

xz-utils · March 2024

The xz-utils backdoor: when the source tree lies about the tarball.

CVE-2024-3094 is one of the most patient supply-chain attacks on record. A persona spent more than two years building credibility as a co-maintainer of liblzma, then planted an sshd backdoor whose activation code shipped only in the release tarballs; the malicious build macro never appeared in the git source tree. It was caught by accident: SSH logins that were about 500 ms slower than they should have been.

Attack chain

Two years in the making.

  1. Patient social engineering. From late 2021, the persona "Jia Tan" submits steadily improving patches to xz-utils. Sock-puppet accounts pressure the original maintainer, Lasse Collin, to "share the load." By mid-2023, Jia Tan is a trusted co-maintainer with release-signing rights.
  2. The tarball / git divergence. Release tarballs xz-5.6.0 and 5.6.1 ship a modified M4 build macro (build-to-host.m4) that is absent from the git tree. Under specific build conditions, that macro extracts an obfuscated payload hidden in binary test fixtures, which were committed to git as innocuous-looking compressed test files, and splices it into liblzma.so.
  3. RCE via systemd-linked OpenSSH. The backdoor hooks the RSA signature-verification path (RSA_public_decrypt) inside sshd on systems where sshd is patched to link libsystemd, which in turn loads liblzma. Anyone holding the attacker's private Ed448 key gets pre-authentication remote code execution as root on the target.
  4. Found by latency. Andres Freund, a Microsoft engineer and PostgreSQL developer, notices that ssh logins on Debian sid take about 500 ms longer than they should. He profiles; the CPU is burning in liblzma. He traces the slowdown to a backdoor in the payload and, on March 29, 2024, publishes his findings on the oss-security mailing list. The major distros yank the affected versions over the weekend.
The critical property: a code review of the git source tree would not have found the backdoor. The attack lived in build-time M4 macros plus binary test fixtures that no reasonable diff tool surfaces. The only signal was the discrepancy between what's in git and what's in the release tarball — exactly the kind of divergence that build-info provenance is supposed to catch.
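
The git-versus-tarball discrepancy described above can be checked mechanically. The following is a minimal sketch, assuming both the release tarball and a checkout of the matching git tag have already been read into {path: contents} mappings. The function name diff_trees and the sample file names are hypothetical illustrations, not OrbitalReg's implementation.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def diff_trees(git_tree: dict, tarball: dict) -> dict:
    """Compare two {path: bytes} mappings and report where they diverge."""
    git_paths, tar_paths = set(git_tree), set(tarball)
    return {
        "only_in_tarball": sorted(tar_paths - git_paths),
        "only_in_git": sorted(git_paths - tar_paths),
        "content_differs": sorted(
            p for p in git_paths & tar_paths
            if digest(git_tree[p]) != digest(tarball[p])
        ),
    }


# Hypothetical, simplified file contents for illustration only.
git_tree = {
    "src/lib.c": b"int f(void) { return 0; }",
    "m4/helper.m4": b"# upstream macro",
}
tarball = {
    "src/lib.c": b"int f(void) { return 0; }",
    "m4/helper.m4": b"# upstream macro plus an extra extraction step",
    "configure": b"#!/bin/sh\n# generated at dist time",
}

report = diff_trees(git_tree, tarball)
print(report)
```

Release tarballs built with autotools legitimately contain generated files (such as configure) that are not in git, so in practice the "only_in_tarball" list needs an allowlist and human review rather than an automatic rejection.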

The structural gap

Why CVE-scanning alone wouldn't have caught this.

Traditional CVE-detection scanners (Trivy, Grype, OSV) compare an artifact's declared version against a database of known-bad versions. For Shai-Hulud-class attacks where the bad code lands in a numbered release of a known package, that loop closes within hours of disclosure.

For xz-utils, that loop had nothing to close on until disclosure. For about five weeks the poisoned tarballs were the official releases, signed by a legitimate maintainer key, with no GHSA advisory and no CVE entry. Any CVE scanner pulling fresh tarballs in early-to-mid March 2024 would have given them a green light.

The only place the attack was visible was at the boundary between source and binary — and that's a different kind of gate entirely. Not "does this artifact match a known-bad version?" but "does this binary actually match the source that claims to produce it?"
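
To make the first question concrete, here is a minimal sketch of a known-bad-version lookup. It assumes a hypothetical in-memory advisory feed keyed by package and version; the ADVISORIES table and the known_bad function are illustrative and do not reflect how Trivy, Grype, or OSV are implemented. The disclosure date is the one given in the timeline.

```python
from datetime import date

# Hypothetical advisory feed: (package, version) -> date the advisory became public.
ADVISORIES = {
    ("xz-utils", "5.6.0"): date(2024, 3, 29),
    ("xz-utils", "5.6.1"): date(2024, 3, 29),
}


def known_bad(package: str, version: str, scan_date: date) -> bool:
    """True only if an advisory for this exact version was public by scan_date."""
    published = ADVISORIES.get((package, version))
    return published is not None and published <= scan_date


# The same artifact, scanned before and after disclosure.
before = known_bad("xz-utils", "5.6.1", date(2024, 3, 15))
after = known_bad("xz-utils", "5.6.1", date(2024, 3, 30))
print(before, after)
```

The lookup is only as good as the feed on the day of the scan: the artifact is identical in both calls, and only the advisory's publication date changes the answer.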

The OrbitalReg defense pattern

Three layers that prove a binary matches the source it claims.

01

Build-info provenance

Every artifact stored in OrbitalReg carries its CI run, commit SHA, builder identity, and the upstream source URL. When xz-5.6.0 lands, the registry records "this came from upstream tarball at hash X, which the maintainer published while their Sigstore identity was Y." Provenance anchors the artifact in the chain of custody.
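
As a rough illustration of what such a record might hold, here is a minimal sketch. The BuildInfo class, its field names, and every value below are hypothetical placeholders chosen to mirror the fields listed above; OrbitalReg's actual schema is not shown in this article.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BuildInfo:
    """Hypothetical provenance record; field names are assumptions."""
    ci_run: str
    commit_sha: str
    builder_identity: str
    upstream_url: str
    upstream_tarball_sha256: str
    signer_identity: str


def missing_fields(info: BuildInfo) -> list:
    """List the provenance fields that were left empty."""
    return [name for name, value in asdict(info).items() if not value]


# Placeholder values for illustration only.
record = BuildInfo(
    ci_run="ci-run-0001",
    commit_sha="",  # empty: the artifact was never tied to a git commit
    builder_identity="builder@example.com",
    upstream_url="https://example.com/releases/pkg-1.0.tar.gz",
    upstream_tarball_sha256="0" * 64,
    signer_identity="maintainer@example.com",
)

print(missing_fields(record))
```

An artifact whose record lacks a commit SHA cannot be traced back to source, which is the kind of gap a provenance requirement is meant to surface.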

02

Reproducible-build gate

For artifacts where upstream commits to reproducible-build guarantees (Debian, NixOS, recent Rust, recent Go), OrbitalReg can be configured to rebuild from source and refuse to serve any binary whose hash differs. A liblzma binary built from an xz-5.6.0 tarball that splices in extra bytes at build time would not match a deterministic rebuild from the tagged git source: the gate trips and the binary is quarantined.
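
The comparison at the heart of that gate is small. Here is a minimal sketch, assuming the registry already has the bytes of the served binary and of an independent rebuild. The function names and sample bytes are hypothetical, and the rebuild itself (the expensive part) is out of scope here.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of a binary."""
    return hashlib.sha256(data).hexdigest()


def rebuild_matches(served: bytes, rebuilt: bytes) -> bool:
    """True if the served binary is bit-for-bit identical to the rebuild."""
    return hmac.compare_digest(sha256_hex(served), sha256_hex(rebuilt))


# Hypothetical stand-ins for real binaries.
clean_build = b"\x7fELF...liblzma"
spliced_build = clean_build + b"...extra object code"

print(rebuild_matches(clean_build, clean_build))    # identical bytes
print(rebuild_matches(spliced_build, clean_build))  # extra bytes spliced in
```

Any extra bytes change the digest, so the gate does not need to understand what was added, only that the two builds disagree.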

03

Sigstore identity pin

Pull-gate can require not just any Sigstore signature but a specific Fulcio-issued identity tied to a known upstream account. If the signing identity for the project rotates without warning — or if a new identity appears on the release — the gate refuses to serve.
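
A minimal sketch of the pin check follows, assuming the signature has already been cryptographically verified and the signer's issuer and subject extracted from the certificate. The SignerIdentity type and the example identities are hypothetical; this is not the Sigstore client API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignerIdentity:
    """Hypothetical view of a verified signing certificate's identity."""
    issuer: str   # OIDC issuer that vouched for the signer
    subject: str  # e.g. an email address or CI workflow identity


def identity_pinned(observed: SignerIdentity, pinned: frozenset) -> bool:
    """True only if the observed signer is one of the identities on file."""
    return observed in pinned


# Placeholder identities for illustration only.
on_file = frozenset({
    SignerIdentity("https://accounts.example.com", "maintainer@example.com"),
})
known = SignerIdentity("https://accounts.example.com", "maintainer@example.com")
newcomer = SignerIdentity("https://accounts.example.com", "new-signer@example.com")

print(identity_pinned(known, on_file), identity_pinned(newcomer, on_file))
```

A pin like this flags an unexpected new signer, but it passes whenever the signer is the identity on file, so it complements the rebuild gate rather than replacing it.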

What this composes to. An OrbitalReg-mediated registry serving xz-utils with build-info provenance and a reproducible-build gate enabled would have produced one of three states for xz-5.6.0: (a) served and matched, which is fine; (b) served but flagged as unverifiable, so humans review; (c) rejected because of a rebuild mismatch, so humans dig. State (b) or (c) would have surfaced the discrepancy in March 2024. Whether that would have been investigated in a week or in two days is a function of your team, not the gate. The point is that the gate raises the signal instead of waiting for a PostgreSQL developer to notice 500 ms of latency.
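
The three states can be written down as a small decision function. This is a minimal sketch, assuming a rebuild digest of None means that no reproducible rebuild was possible. GateState and gate are hypothetical names, not OrbitalReg's API.

```python
from enum import Enum
from typing import Optional


class GateState(Enum):
    SERVED = "served and matched"
    UNVERIFIABLE = "served but flagged as unverifiable"
    REJECTED = "rejected: rebuild mismatch"


def gate(served_sha256: str, rebuilt_sha256: Optional[str]) -> GateState:
    """Map a served digest and an optional rebuild digest to one of three states."""
    if rebuilt_sha256 is None:
        # No rebuild available: serve, but flag for human review.
        return GateState.UNVERIFIABLE
    if rebuilt_sha256 == served_sha256:
        return GateState.SERVED
    return GateState.REJECTED


print(gate("aaa", "aaa").value)
print(gate("aaa", None).value)
print(gate("aaa", "bbb").value)
```

Only the first state is silent; the other two put the artifact in front of a person.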

Config sketch

Provenance + rebuild gate.

OS-package repos with upstream reproducible-build guarantees can be gated on rebuild verification. Slower than CVE-only, but catches a class of attack CVE-scanning structurally can't.

# Mirror Debian main, require provenance + rebuild verification.

orbital repo create debian-main \
  --format=debian --kind=remote \
  --upstream=https://deb.debian.org/debian \
  --scanner=trivy,osv \
  --require-provenance=build-info \
  --reproducible-build=verify-or-quarantine

Reproducible-build verification is optional, off by default. Enable per-repo where upstream supports it. Full grammar at docs.orbitalreg.com/guide/pull-gates.

For your records

xz-utils timeline.

2021-10
"Jia Tan" begins submitting patches to xz-utils. First merged contributions land. Sock-puppet accounts begin a multi-year campaign of pressure on maintainer Lasse Collin to share commit access.
2023-mid
Jia Tan is granted co-maintainer status with release-signing rights. No malicious commits visible in the git log to this point.
2024-02-24
xz-utils 5.6.0 released with the poisoned test fixtures and M4 macros. Tarball signed by Jia Tan's key. Distro maintainers begin pulling it into testing branches.
2024-03-09
xz-utils 5.6.1 released. Refinements to the backdoor. Still no public CVE, still no GHSA.
2024-03-29
Andres Freund discloses the backdoor publicly via the oss-security mailing list. CVE-2024-3094 issued, CVSS 10.0. Distros yank affected versions over the weekend; the impact is limited to Debian sid, Fedora 40/41 pre-release, and a few rolling-release distros — the unstable channels caught it before stable.

Honest caveats

What this does not claim.

We would not have caught Jia Tan's social engineering. By March 2024 he was a legitimate co-maintainer with valid signing keys. Sigstore-identity gates pass when the identity is the one on file. The defense isn't "stop bad maintainers from existing" — it's "raise an audit signal when their artifacts diverge from their source."

Reproducible-build verification needs upstream cooperation. The gate only works for packages where upstream actually ships a reproducible build (Debian main, NixOS, recent Go, recent Rust). For packages where upstream tarballs are not deterministic, the gate cannot run and has to be turned off. Many ecosystems are not there yet.

Rebuild verification is expensive. Reproducible builds take real compute. We recommend running them on a delay queue (e.g. all new pulls run async within 24 h), with the artifact served from cache in the meantime and quarantined if the delayed rebuild diverges. That's a 24-hour window where bad code can still flow. Not zero risk — substantially less risk.
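
The delay-queue behaviour can be sketched as a small piece of state per cached artifact. This is a minimal sketch under stated assumptions: the CachedArtifact type, the function names, and the rebuild_overdue check are hypothetical, and the 24-hour window is the example figure from the paragraph above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REBUILD_WINDOW = timedelta(hours=24)  # example window from the text


@dataclass
class CachedArtifact:
    """Hypothetical registry-side state for one cached artifact."""
    name: str
    sha256: str
    first_pulled: datetime
    verified: bool = False
    quarantined: bool = False


def record_rebuild(artifact: CachedArtifact, rebuilt_sha256: str) -> None:
    """Apply a finished async rebuild: verify on match, quarantine on divergence."""
    if rebuilt_sha256 == artifact.sha256:
        artifact.verified = True
    else:
        artifact.quarantined = True


def may_serve(artifact: CachedArtifact) -> bool:
    """Serve from cache until a rebuild result says otherwise."""
    return not artifact.quarantined


def rebuild_overdue(artifact: CachedArtifact, now: datetime) -> bool:
    """True if no rebuild result has arrived within the window (an assumed alert)."""
    pending = not artifact.verified and not artifact.quarantined
    return pending and now - artifact.first_pulled > REBUILD_WINDOW


artifact = CachedArtifact("liblzma", "aaa", datetime(2024, 3, 1, 12, 0))
served_before = may_serve(artifact)  # inside the window: still served
record_rebuild(artifact, "bbb")      # delayed rebuild diverges
served_after = may_serve(artifact)
print(served_before, served_after)
```

The gap between served_before and served_after is the window described above: the artifact flows until the rebuild result arrives.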

Andres Freund got lucky. The window between the xz-5.6.0 release and disclosure was about five weeks. A provenance gate doesn't shrink that window to zero. What it does is change the discovery mechanism from "a PostgreSQL developer notices SSH is slow" to "the registry refused to serve a binary that didn't match its source, two days after upload."

Want this configuration for your team?

Talk to us about build-info provenance gates.

Rico, the founder, walks you through which of your registries can support reproducible-build verification today, which need a manual quarantine workflow as a stopgap, and the realistic time-to-value for each.