orchestratord: version-gate pod security context UID/GID for distroless #36101
jasonhernandez wants to merge 3 commits into
Distroless images run as nonroot (UID 65534) instead of root. Add version-gating so orchestratord sets the correct runAsUser/runAsGroup based on the Materialize version, avoiding UID mismatches during rolling upgrades from Debian-based to distroless images.

Gate versions (verified against release history, 2026-06):

- balancerd: V26_18_0. Its ci/Dockerfile switched to distroless-prod-base in v26.18.0 (prod-base in v26.17.x). The original V26_19_0 was off by one and would have forced UID 999 onto v26.18.x balancerd pods that actually run as 65534.
- environmentd/clusterd: V26_28_0, matching the release that ships their distroless migration (#36099). The original V26_20_0 predated the actual landing by ~8 releases (main is now 26.28-dev) and would have applied UID 65534 to v26.20-v26.27 images that still run as UID 999.

NOTE: the env/clusterd gate assumes #36099 lands in the 26.28 cycle. If it slips, bump V26_28_0 to the actual release. The three distroless PRs (#36099 image, #36100 SIGTERM, #36101 this) must ship in the same release.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
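The off-by-one on the balancerd gate described in this commit message can be illustrated with a minimal Rust sketch. Everything here is hypothetical and for illustration only: the `Version` struct, the `v` helper, the constant names, and `run_as_id` are not taken from the orchestratord source, and the sketch ignores pre-release suffixes such as `-dev`.

```rust
/// Simplified (major, minor, patch) version. This is an assumption for the
/// sketch; the real code may use a different version type. The derived
/// ordering compares fields in declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Hypothetical shorthand constructor.
pub const fn v(major: u64, minor: u64, patch: u64) -> Version {
    Version { major, minor, patch }
}

/// UID/GID expected by the Debian-based images, per the commit message.
pub const DEBIAN_ID: i64 = 999;
/// UID/GID the distroless images run as, per the commit message.
pub const DISTROLESS_ID: i64 = 65534;

/// Returns the ID to use for runAsUser/runAsGroup: the nonroot ID at or
/// above the gate version, the Debian ID below it.
pub fn run_as_id(image_version: Version, gate: Version) -> i64 {
    if image_version >= gate {
        DISTROLESS_ID
    } else {
        DEBIAN_ID
    }
}
```

Under these assumptions, `run_as_id(v(26, 18, 3), v(26, 19, 0))` returns 999, which is the mismatch the original gate would have caused for a v26.18.x balancerd pod, while `run_as_id(v(26, 18, 3), v(26, 18, 0))` returns 65534.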
Force-pushed: 89eab31 to cf2002f
Warning: OUTDATED, superseded by the coordination note below. The #36100 dependency is gone (closed; #36099 now ships a static

🔗 Distroless migration: coordination note

One of three PRs that must ship together in the same release: #36099 (distroless image), #36100 (SIGTERM handler), #36101 (this, orchestratord UID/GID gating).

Action required before merge: the env/clusterd gate

Possible pre-existing issue (separate from this PR): balancerd has been distroless (uid 65534) since v26.18.0, but this orchestrator gating is still unmerged, so v26.18–v26.27 may have a live UID mismatch unless

Rebased onto current
main has moved to 26.32.0-dev and 26.28.0 already shipped as a Debian image. Leaving the gate at 26.28 would apply the nonroot 65534 securityContext to the still-Debian 26.28-26.31 env/clusterd images, which expect uid/gid 999. Bump the gate so only the genuinely distroless images get the nonroot context.

The gate must equal the release the distroless env/clusterd images (#36099) first ship in. Set to 26.32 on the assumption this lands in the 26.32 cycle. Re-confirm against the actual release cut before merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
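As a rough sketch of the per-component gating this commit describes, the following Rust snippet selects uid/gid from a component and an image version. The `Component` enum, `PodIds` struct, and both function names are hypothetical, not the orchestratord API. The gate values (26.18.0 for balancerd, 26.32.0 for environmentd/clusterd) and the IDs (999, 65534) come from the commit messages, and the 26.32 value carries the same "assuming it lands in the 26.32 cycle" caveat. Pre-release suffixes such as `-dev` are ignored.

```rust
/// Components named in the PR text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Component {
    Balancerd,
    Environmentd,
    Clusterd,
}

/// (major, minor, patch). Tuples compare lexicographically, which is
/// sufficient for this sketch.
pub type Version = (u64, u64, u64);

/// Hypothetical holder for the runAsUser/runAsGroup pair.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PodIds {
    pub run_as_user: i64,
    pub run_as_group: i64,
}

/// First release assumed to ship a distroless image for each component.
pub fn distroless_gate(component: Component) -> Version {
    match component {
        Component::Balancerd => (26, 18, 0),
        // Assumption from the commit message: #36099 lands in the 26.32 cycle.
        Component::Environmentd | Component::Clusterd => (26, 32, 0),
    }
}

/// Picks the nonroot IDs at or above the component's gate and the
/// Debian-image IDs below it.
pub fn pod_ids(component: Component, image_version: Version) -> PodIds {
    let id = if image_version >= distroless_gate(component) {
        65534
    } else {
        999
    };
    PodIds { run_as_user: id, run_as_group: id }
}
```

With these values, a 26.28.0 environmentd image resolves to 999/999 and a 26.32.0 image to 65534/65534, so the still-Debian 26.28 to 26.31 images keep the IDs they expect.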
Force-pushed: cf2002f to cbe3af5
🔗 Distroless migration: coordination

Part of a set that ships together: #36099 (distroless image), #36872 (clusterd process ordinal), #36876 (removes the unused CTP FQDN check), and this PR (orchestratord UID/GID gating).

This PR version-gates the pod securityContext so distroless images (uid/gid

Merge order: this PR lands before or with #36099. The gate is version-conditional, so it is a no-op until distroless images of the gated version exist and is safe to land ahead. If #36099 ships nonroot images while this gate is absent, orchestratord applies the wrong UID/GID.

Gates:

Separate, pre-existing: balancerd runs distroless (uid 65534) as of v26.18.0 but this gating is not yet merged, so balancerd pods can carry a UID mismatch today unless
Trim the pod uid/gid comments to the non-obvious facts and drop the release-history narration. Also correct the env/clusterd comment that still said v26.28+ after the gate moved to v26.32. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Distroless images run as nonroot (UID 65534) instead of root. Version-gate the pod security context in orchestratord so it sets the correct runAsUser/runAsGroup based on Materialize version, avoiding UID mismatches during rolling upgrades.
Part of the distroless migration, split from #35859.
Test plan
cargo test -p mz-orchestratord passes

🤖 Generated with Claude Code