
design-proposal: migrate kubernetes workers to Talos and split into kubernetes-nodes #8

Merged
Andrei Kvapil (kvaps) merged 3 commits into cozystack:main from kvaps:proposal/kubernetes-nodes-split on Jun 24, 2026

Conversation

@kvaps Andrei Kvapil (kvaps) commented May 4, 2026

Member

Summary

A two-phase reshape of the kubernetes application:

  • Phase 1: replace Ubuntu+kubeadm worker bootstrap with Talos + cluster-api-bootstrap-provider-talos, inside the existing monolithic chart. No user-facing API change. Seamless migration via standard CAPI MachineDeployment rolling update. Requires a patch to clastix/cluster-api-control-plane-provider-kamaji to render the talos-csr-signer sidecar in the TenantControlPlane.
  • Phase 2: once workers are uniformly Talos, split the chart into kubernetes (control-plane only) + kubernetes-nodes (one per pool), modelled on the vm-instance / vm-disk precedent. Single backend (kubevirt-talos), no backend.type field yet.

Hybrid clusters (workers in external clouds, BYO, bare-metal) are deferred as Phase 3 to a separate placeholder draft: #9.

Supersedes an earlier scope that tried to land multiple backends and the split in one shot — see commit history for the rescoping rationale.

Test plan

Implementation testing is scoped in the proposal:

  • Phase 1 unit: Talos machineconfig generation, MachineDeployment shape, signer sidecar wiring
  • Phase 1 integration: kind + KubeVirt + Kamaji + CABPT + signer, end-to-end Talos worker join + talosctl
  • Phase 1 migration: synthetic Ubuntu+kubeadm tenant rolled over to Talos via CAPI rolling update
  • Phase 2 unit: kubernetes-nodes chart rendering
  • Phase 2 migration: synthetic kubernetes HelmRelease with nodeGroups migrated to split shape

Summary by CodeRabbit

  • Documentation
    • Added design proposal outlining the planned evolution of Kubernetes node management architecture. Details a multi-phase transition roadmap including migration procedures, upgrade and rollback compatibility, security considerations, and testing coverage.

Propose extracting node pools from the kubernetes application into a
sibling kubernetes-nodes application, modelled on the vm-instance/vm-disk
split. Add a backend abstraction that supports the existing
KubeVirt+kubeadm flow alongside new Talos backends: KubeVirt+Talos via
clastix/talos-csr-signer, and cloud-talos for Hetzner and Azure without
Cluster API.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request proposes a significant architectural change to split the monolithic kubernetes package into separate control-plane and node-pool components, enabling independent lifecycles and multi-backend support for KubeVirt and cloud-native Talos workers. The review feedback identifies a technical error regarding the gRPC protocol (TCP vs UDP), suggests consolidating the node-lifecycle-controller and Talos tokens at the cluster level for better efficiency and simpler configuration, and requests further implementation details for the dependency discovery mechanism in the proposed admission webhook.


Renders the same CAPI/CAPK objects as above, but with a `TalosConfigTemplate` (from `cluster-api-bootstrap-provider-talos`) replacing `KubeadmConfigTemplate`. Worker VMs boot from a Talos image. Bootstrap fetches the Talos machineconfig from CAPI and joins the cluster via standard Talos PKI.

The tenant's `KamajiControlPlane` carries an `additionalContainers` entry running `clastix/talos-csr-signer` listening on UDP/50001, exposed alongside `:6443` on the tenant API LoadBalancer. This is what allows `talosctl` to operate against worker nodes whose control-plane is Kamaji rather than Talos.


medium

The proposal mentions clastix/talos-csr-signer listening on UDP/50001. Since this is a gRPC-based service (as noted in line 40), it should be TCP/50001 to match standard Talos trustd and gRPC requirements.
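
To make the TCP requirement concrete, here is a minimal sketch (not part of the proposal) of a check that a rendered Service exposes the trustd port over TCP. The function name and the sample Service are hypothetical; only port 50001 and the TCP requirement come from this thread. It relies on Kubernetes defaulting a Service port's `protocol` to TCP when the field is omitted.

```python
TRUSTD_PORT = 50001  # Talos trustd port discussed in this review


def trustd_port_problems(service: dict) -> list:
    """Return a list of human-readable problems with the trustd port.

    `service` is a Kubernetes Service manifest as a plain dict.
    An empty list means port 50001 is exposed over TCP.
    """
    ports = service.get("spec", {}).get("ports") or []
    matches = [p for p in ports if p.get("port") == TRUSTD_PORT]
    if not matches:
        return [f"port {TRUSTD_PORT} is not exposed"]

    problems = []
    for p in matches:
        # Kubernetes treats a missing protocol as TCP.
        protocol = p.get("protocol", "TCP")
        if protocol != "TCP":
            problems.append(
                f"port {TRUSTD_PORT} uses {protocol}, expected TCP"
            )
    return problems


# Hypothetical tenant API Service with the mistake flagged above.
bad_service = {
    "spec": {
        "ports": [
            {"name": "kube-apiserver", "port": 6443, "protocol": "TCP"},
            {"name": "trustd", "port": 50001, "protocol": "UDP"},
        ]
    }
}
print(trustd_port_problems(bad_service))
```

A check like this could run in the Phase 1 unit tests next to the "signer sidecar wiring" case, assuming the chart output is available as parsed YAML.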


When `cluster-autoscaler` scales a `cloud-talos-*` pool down, it deletes the cloud VM. The tenant's apiserver still has a `Node` object that will linger until something deletes it. CAPI was previously the agent doing this; without CAPI, we need an equivalent.

The `node-lifecycle-controller` from `cozystack/local-ccm` is a good fit for this role. The `kubernetes-nodes` chart for `cloud-talos-*` backends renders an NLC Deployment that runs in the management cluster but uses a kubeconfig pointing to the **tenant** apiserver. It watches Node objects with the `ToBeDeletedByClusterAutoscaler:NoSchedule` taint and removes them after a configurable grace period and unreachability check.


medium

Deploying a separate node-lifecycle-controller (NLC) for each kubernetes-nodes release leads to redundant processes watching the same tenant API server. A more efficient approach would be to manage a single NLC instance per tenant cluster (e.g., within the kubernetes control-plane chart) that handles node cleanup for all associated node pools.
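
For reference, the cleanup rule described in the quoted text (taint present, grace period elapsed, node unreachable) can be sketched as a pure decision function. This is an illustration under assumptions, not the `cozystack/local-ccm` implementation: the function names, the five-minute default, and the idea that the controller records when it first saw the taint are all hypothetical. Nodes are plain dicts shaped like Kubernetes `Node` objects.

```python
from datetime import datetime, timedelta, timezone

CA_TAINT_KEY = "ToBeDeletedByClusterAutoscaler"


def has_autoscaler_taint(node: dict) -> bool:
    """True if the node carries the cluster-autoscaler deletion taint."""
    taints = node.get("spec", {}).get("taints") or []
    return any(
        t.get("key") == CA_TAINT_KEY and t.get("effect") == "NoSchedule"
        for t in taints
    )


def is_unreachable(node: dict) -> bool:
    """True if the Ready condition is missing or not "True"."""
    for cond in node.get("status", {}).get("conditions") or []:
        if cond.get("type") == "Ready":
            return cond.get("status") != "True"
    return True


def should_delete(node, first_seen_tainted, now,
                  grace=timedelta(minutes=5)):
    """Delete only when all three checks from the proposal text hold."""
    return (
        has_autoscaler_taint(node)
        and is_unreachable(node)
        and now - first_seen_tainted >= grace
    )


# Hypothetical node whose cloud VM was removed by the autoscaler.
gone = {
    "spec": {"taints": [{"key": CA_TAINT_KEY, "effect": "NoSchedule"}]},
    "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]},
}
now = datetime(2026, 5, 4, 12, 0, tzinfo=timezone.utc)
print(should_delete(gone, now - timedelta(minutes=10), now))
```

Because the decision depends only on the tenant's Node objects, the same function would work whether one controller runs per pool or, as suggested above, one per tenant cluster.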

## Failure and edge cases

- **`kubernetes-nodes` HelmRelease created before its parent `kubernetes` HelmRelease** → chart `fail`s the render with a clear error message identifying the missing parent. No partial CAPI/autoscaler resources created.
- **Parent `kubernetes` HelmRelease deleted while children exist** → all `kubernetes-nodes` HelmReleases for that cluster fail subsequent reconciles. An admission webhook on `kubernetes` HelmRelease delete blocks the operation if any `kubernetes-nodes` references it.


medium

The proposed admission webhook for blocking kubernetes HelmRelease deletion requires a mechanism to discover name-based dependencies. Clarifying where this logic will reside and how it will perform discovery would strengthen the proposal, especially since the linkage doesn't use standard Kubernetes owner references.
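
One possible shape for that discovery step, as a sketch only: the webhook lists HelmReleases in the parent's namespace and filters by a label naming the parent cluster. The label key `example.cozystack.io/parent-cluster` and both function names are hypothetical; the proposal does not say how the name-based link is stored. The return value mimics the `allowed` and `status.message` fields of an AdmissionReview response.

```python
PARENT_LABEL = "example.cozystack.io/parent-cluster"  # hypothetical key


def dependent_pools(parent_name: str, namespace: str, releases: list) -> list:
    """Names of kubernetes-nodes releases that point at the parent."""
    found = []
    for rel in releases:
        meta = rel.get("metadata", {})
        labels = meta.get("labels") or {}
        if (meta.get("namespace") == namespace
                and labels.get(PARENT_LABEL) == parent_name):
            found.append(meta["name"])
    return sorted(found)


def review_delete(parent_name: str, namespace: str, releases: list) -> dict:
    """Deny deletion of the parent while any child still references it."""
    children = dependent_pools(parent_name, namespace, releases)
    if children:
        return {
            "allowed": False,
            "status": {
                "message": (
                    f"kubernetes release {parent_name!r} is still referenced "
                    f"by kubernetes-nodes releases: {', '.join(children)}"
                )
            },
        }
    return {"allowed": True}


# Hypothetical inventory of HelmReleases in one tenant namespace.
releases = [
    {"metadata": {"name": "pool-a", "namespace": "tenant-x",
                  "labels": {PARENT_LABEL: "prod"}}},
    {"metadata": {"name": "pool-b", "namespace": "tenant-x",
                  "labels": {PARENT_LABEL: "staging"}}},
]
print(review_delete("prod", "tenant-x", releases))
```

A label-based link has the side benefit that the list call can use a label selector instead of scanning every release, although that is a design choice the proposal would still need to make.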

**System layer (chart-managed, not exposed to user):**

- Cluster CA, machine CA, apiserver endpoint — read at template time via `lookup` from the tenant's `KamajiControlPlane`.
- Talos token — generated once per pool, stored alongside the machineconfig.


medium

Generating a TALOS_TOKEN per pool while running a single talos-csr-signer sidecar per cluster (as discussed in Open Question #3 on line 253) introduces a configuration challenge for the signer, as it must be aware of all active tokens. Using a single TALOS_TOKEN per tenant cluster would simplify the sidecar configuration and the kubernetes-nodes chart logic while maintaining sufficient isolation between tenants.
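
The configuration difference between the two options can be illustrated with a small sketch. It is not the `talos-csr-signer` code; the function name and the token map are hypothetical. With per-pool tokens the signer has to hold the whole map, while a cluster-wide token is the degenerate case of a single entry.

```python
import hmac
from typing import Dict, Optional


def match_token(presented: str, active_tokens: Dict[str, str]) -> Optional[str]:
    """Return the scope whose token matches, or None if none does.

    Every entry is compared in constant time and the loop never exits
    early, so timing does not reveal which entry matched.
    """
    matched = None
    for scope, token in active_tokens.items():
        if hmac.compare_digest(presented.encode(), token.encode()):
            matched = scope
    return matched


# Per-pool tokens: the signer must know about every pool (hypothetical values).
per_pool = {"pool-a": "tok-aaaa", "pool-b": "tok-bbbb"}

# Cluster-wide token: one entry, nothing to update when pools change.
cluster_wide = {"cluster": "tok-cccc"}

print(match_token("tok-bbbb", per_pool))
print(match_token("tok-cccc", cluster_wide))
```

The per-pool variant also lets the signer tell which pool a request claims to come from, which is the property that later matters for the blast-radius discussion in the open questions.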

@coderabbitai

coderabbitai Bot commented May 4, 2026


Walkthrough

A new design proposal document is added at design-proposals/kubernetes-nodes-split/README.md. It describes a two-phase evolution: Phase 1 migrates worker bootstrap from kubeadm to Talos (TalosConfigTemplate, Talos VM image, talos-csr-signer sidecar, Kamaji provider patch), and Phase 2 splits the monolithic chart into separate kubernetes (control-plane) and kubernetes-nodes (per-pool) HelmRelease packages with a migration script.

Changes

kubernetes-nodes-split Design Proposal

| Layer / File(s) | Summary |
|---|---|
| Proposal context and motivation<br>`design-proposals/kubernetes-nodes-split/README.md` | Adds proposal front-matter (title, authors, date, status) and motivation section covering the two-phase plan, deferred Phase 3, related proposals, current architecture, and underlying CAPI/Kamaji/KubeVirt primitives. |
| Phase 1: Talos bootstrap migration<br>`design-proposals/kubernetes-nodes-split/README.md` | Specifies replacing KubeadmConfigTemplate with TalosConfigTemplate, switching VM base image to Talos, adding the talos-csr-signer sidecar with a new trustd port to KamajiControlPlane, the required upstream Kamaji provider patch, the intended CAPI rolling update flow during mixed kubeadm/Talos rollout, and what remains unchanged. |
| Phase 2: Package split and migration<br>`design-proposals/kubernetes-nodes-split/README.md` | Defines the kubernetes (control-plane-only) and kubernetes-nodes (per-pool HelmRelease) package boundary, name-based linkage mechanism, two-layer Talos machineconfig composition, kubevirt-ccm placement, and the idempotent migration script with Helm ownership annotation transfers and helm.sh/resource-policy: keep strategy. |
| Compatibility, security, testing, rollout, and alternatives<br>`design-proposals/kubernetes-nodes-split/README.md` | Documents user-facing impacts per phase, upgrade/rollback compatibility rules, Talos token and bootstrap secret security model, enumerated failure/edge cases, unit/integration/migration test coverage plan, step-by-step rollout sequence, open questions, and rejected architectural alternatives. |
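
As an illustration of the "Helm ownership annotation transfers" mentioned for Phase 2, here is a minimal sketch of the metadata rewrite such a migration script might apply. The function names are hypothetical and the real script is not shown in this thread. The annotation and label keys are the ones Helm itself uses to decide ownership (`meta.helm.sh/release-name`, `meta.helm.sh/release-namespace`, `app.kubernetes.io/managed-by`) and to skip deletion on uninstall (`helm.sh/resource-policy: keep`).

```python
import copy


def transfer_ownership(obj: dict, release: str, namespace: str) -> dict:
    """Return a copy of `obj` marked as owned by the given Helm release."""
    out = copy.deepcopy(obj)
    meta = out.setdefault("metadata", {})
    annotations = dict(meta.get("annotations") or {})
    labels = dict(meta.get("labels") or {})
    annotations["meta.helm.sh/release-name"] = release
    annotations["meta.helm.sh/release-namespace"] = namespace
    labels["app.kubernetes.io/managed-by"] = "Helm"
    meta["annotations"] = annotations
    meta["labels"] = labels
    return out


def keep_on_uninstall(obj: dict) -> dict:
    """Return a copy of `obj` that Helm leaves in place on uninstall."""
    out = copy.deepcopy(obj)
    meta = out.setdefault("metadata", {})
    annotations = dict(meta.get("annotations") or {})
    annotations["helm.sh/resource-policy"] = "keep"
    meta["annotations"] = annotations
    return out


# Hypothetical MachineDeployment currently owned by the monolithic release.
md = {
    "kind": "MachineDeployment",
    "metadata": {
        "name": "prod-md0",
        "annotations": {
            "meta.helm.sh/release-name": "kubernetes-prod",
            "meta.helm.sh/release-namespace": "tenant-x",
        },
    },
}
moved = transfer_ownership(keep_on_uninstall(md), "kubernetes-nodes-prod-md0", "tenant-x")
print(moved["metadata"]["annotations"])
```

Both functions return the same result when applied twice, which is the property an idempotent migration script needs if it is re-run after a partial failure.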

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit hops through phases one and two,
Swapping kubeadm for Talos, fresh and new 🐰
The chart splits apart like a burrow's two doors,
Control-plane here, node-pools on separate floors.
Migration scripts tidy each Helm annotation,
All signed and secured — what a fine constellation! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately summarizes the main changes: migrating kubernetes workers to Talos and splitting the chart into kubernetes-nodes, which directly aligns with the primary objectives of Phase 1 and Phase 2 described in the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Andrei Kvapil (kvaps) added a commit to kvaps/cozystack-community that referenced this pull request May 5, 2026
Adjust the proposal to reflect that the controller will be developed as
an independent project under the kilo-io organization, per confirmed
interest from Kilo maintainer @squat. Generalize the CRD from a
tenant-specific TenantMeshLink to a tenant-agnostic ClusterMesh that
references peer clusters through a map of kubeconfig Secrets. Move all
tenant semantics into a dedicated Cozystack integration section that
also accounts for the kubernetes-nodes split (PR cozystack#8) so a single
ClusterMesh covers multi-location, multi-backend tenants.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

@kvaps Andrei Kvapil (kvaps) left a comment

Member Author


Action items from an internal meeting. Capturing here so the design can be revised before this lands.

1. Zone / location ownership model is undefined

The proposal does not specify whether a "location" (zone) is configured by the platform admin or by the tenant. This shapes the entire kubernetes-nodes API:

  • Admin-owned location: we need a separate platform-level resource that declares per-zone settings (provider creds, base image, template, network config). Node pools reference it by name and the user never sees cloud-init contents or instance template details.
  • User-owned location: the tenant gets more freedom to pick instance types, templates, disks — closer to the existing KubeVirt path.

These have very different trust and UX implications. The proposal must pick one (or describe an explicit hybrid) and bake it into the values shape.

2. Base image / template source for autoscaler-managed nodes

For cloud-talos-* backends the proposal describes machineconfig injection but is silent on where the base Talos image for the cloud comes from: pre-baked snapshot/VHD, Talos Image Factory schematic computed on the fly, or user-supplied image reference. This needs an explicit field with documented defaults and an operator workflow for adding new images.

3. Credentials when the user brings their own cloud

If we want to support tenants connecting their own cloud accounts (a direction raised in the meeting), the design must cover:

  • Where per-tenant cloud credentials live (tenant-namespace Secret references in the kubernetes-nodes HelmRelease).
  • How the management-cluster cluster-autoscaler reads tenant-specific credentials without privilege escalation.
  • Trust boundary: a compromised tenant must not be able to leak or repurpose those credentials.

The current proposal implicitly assumes platform-supplied credentials. Add an explicit section on tenant-supplied credentials, or push to an open question with a clear statement.

4. Scheduling class semantics break for external backends

tenant.spec.scheduling (allowed and default scheduling classes) is enforced when KubeVirt VMs are scheduled in the management cluster. For cloud-talos-* backends there is no management-cluster pod, so the existing scheduling-class machinery has nothing to act on.

The proposal should:

  • State explicitly that scheduling classes only apply to kubevirt-* backends.
  • Decide whether equivalent constraints (per-tenant restrictions on which kubernetes-nodes configurations are allowed) should be lifted into kubernetes-nodes itself rather than relying on scheduling classes.

5. Backend abstraction: should it be a real contract, not just an enum?

Today backend.type is a string switch and the chart branches internally. As we add backends (KubeVirt-Talos, AWS, GCP, on-prem KVM), this becomes hard to maintain. Worth considering:

  • A minimal contract any backend must implement (CRs produced, credentials interface, autoscaler integration shape).
  • Documenting it so external contributors can add a backend without reading the entire chart.
  • Possibly per-backend sub-charts.

Not blocking, but should be an explicit open question.
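
To give the "minimal contract" idea some shape, here is a sketch of what such an interface could look like, using the three areas listed above (CRs produced, credentials interface, autoscaler integration). Everything in it is hypothetical: the method names, the registry, and the example backend are illustrations, not an agreed design.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class NodeBackend(Protocol):
    """Hypothetical contract every node-pool backend would implement."""

    def render_resources(self, values: Dict) -> List[Dict]:
        """Custom resources produced for one pool."""
        ...

    def required_secret_refs(self, values: Dict) -> List[str]:
        """Names of Secrets the backend needs (credentials interface)."""
        ...

    def autoscaler_node_group(self, values: Dict) -> Dict:
        """Node-group description handed to the autoscaler."""
        ...


class KubevirtTalosBackend:
    """Example implementation; field names are made up for the sketch."""

    def render_resources(self, values):
        return [{"kind": "MachineDeployment",
                 "metadata": {"name": values["pool"]}}]

    def required_secret_refs(self, values):
        return []  # KubeVirt VMs run in the management cluster

    def autoscaler_node_group(self, values):
        return {"name": values["pool"],
                "min": values.get("minReplicas", 0),
                "max": values.get("maxReplicas", 0)}


def register(registry: Dict[str, NodeBackend], name: str, backend) -> None:
    """Add a backend, rejecting objects that miss any contract method."""
    if not isinstance(backend, NodeBackend):
        raise TypeError(f"{name!r} does not implement the NodeBackend contract")
    registry[name] = backend


registry: Dict[str, NodeBackend] = {}
register(registry, "kubevirt-talos", KubevirtTalosBackend())
print(registry["kubevirt-talos"].autoscaler_node_group({"pool": "md0", "maxReplicas": 3}))
```

In a Helm-based implementation the same contract would more likely be expressed as a documented set of named templates per sub-chart; the Python form is used here only to make the three obligations explicit.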

6. UI form generation per backend

The dashboard needs a different form per backend.type. The proposal should mention that the values schema is structured so the frontend can render per-backend forms cleanly (tagged-union shape, not a flat values bag with conditional fields).
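
A small sketch of the difference, for illustration only: in a tagged union the `type` value selects one variant, and each variant carries exactly its own fields. The backend names `kubevirt-talos` and `cloud-talos-hetzner` appear elsewhere in this PR; the field names (`instance_type`, `disk_gib`, `server_type`, `location`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class KubevirtTalos:
    instance_type: str
    disk_gib: int


@dataclass(frozen=True)
class CloudTalosHetzner:
    server_type: str
    location: str


Backend = Union[KubevirtTalos, CloudTalosHetzner]

# The tag selects the variant; a frontend can render one form per entry.
VARIANTS = {
    "kubevirt-talos": KubevirtTalos,
    "cloud-talos-hetzner": CloudTalosHetzner,
}


def parse_backend(values: dict) -> Backend:
    """Parse a `backend` values block, rejecting fields from other variants."""
    data = dict(values)
    tag = data.pop("type", None)
    variant = VARIANTS.get(tag)
    if variant is None:
        raise ValueError(f"unknown backend.type: {tag!r}")
    try:
        return variant(**data)
    except TypeError as exc:
        raise ValueError(f"invalid fields for backend {tag!r}: {exc}") from exc


print(parse_backend({"type": "kubevirt-talos",
                     "instance_type": "u1.medium", "disk_gib": 20}))
```

With a flat values bag, a Hetzner-only field set on a KubeVirt pool would be silently ignored; here it is rejected at parse time, which is the property that lets the dashboard generate per-backend forms without conditional-field logic.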

7. Tangential: management cluster on non-KubeVirt

The same kubernetes-nodes model could plausibly apply to the management cluster's own node pools (running it on Azure VMs, not just KubeVirt). The existing autoscaling docs already show the infra-level pattern. The proposal should:

  • Explicitly say "out of scope — management cluster node management remains separate", or
  • Note that the design is intentionally compatible with that future direction.

Will revise the proposal to address these points.

Drop the multi-backend design (cloud-talos-hetzner, cloud-talos-azure,
LocationProfile, NLC, etc.) and rewrite around two phases of internal
restructuring:

- Phase 1: replace Ubuntu+kubeadm worker bootstrap with Talos via CABPT,
  inside the existing monolithic chart, with no user-facing API change.
  Patch needed in cluster-api-control-plane-provider-kamaji to render
  the talos-csr-signer sidecar in TenantControlPlane.
- Phase 2: once workers are uniformly Talos, split the chart into
  kubernetes (control-plane) + kubernetes-nodes (per-pool). Single
  backend, no backend.type field.

Hybrid / external-cloud clusters are deferred to Phase 3, tracked
separately as a follow-up draft proposal.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@kvaps Andrei Kvapil (kvaps) changed the title from "design-proposal: split kubernetes package and add Talos backends" to "design-proposal: migrate kubernetes workers to Talos and split into kubernetes-nodes" May 11, 2026
@kvaps Andrei Kvapil (kvaps) marked this pull request as ready for review June 23, 2026 17:20

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@design-proposals/kubernetes-nodes-split/README.md`:
- Around line 59-60: In the README.md file, locate the first bullet point
referencing the clastix/talos-csr-signer sidecar and change the protocol
specification from UDP/50001 to TCP/50001. Talos trustd operates exclusively
over TCP for gRPC-based CSR signing handshakes, so the protocol must be
corrected to reflect the actual TCP-based communication requirement.
- Line 188: Update the discussion in the document around the per-pool versus
cluster-wide talos-csr-signer token choice to better articulate the cross-pool
security risk. Specifically, revise the framing to clarify that a single
cluster-wide token in the KamajiControlPlane Pod creates a broader blast radius
where compromise of one pool's environment (through elevated workload
permissions or host access) enables minting of Talos machine certificates for
other pools. Acknowledge that while per-pool token isolation adds operational
overhead, it provides meaningful security benefit by limiting the blast radius
to individual pool trust boundaries, making this a more substantive trade-off
than currently presented in the document.

📥 Commits

Reviewing files that changed from the base of the PR and between abb40d0 and 017cf28.

📒 Files selected for processing (1)
  • design-proposals/kubernetes-nodes-split/README.md

Comment on lines +59 to +60
- The tenant's `KamajiControlPlane` gains a sidecar entry running `clastix/talos-csr-signer`, exposed alongside the API server on the tenant API endpoint (UDP/50001 for Talos `trustd`).
- `KamajiControlPlane` exposed-ports configuration extended to include `:50001` alongside `:6443` so the tenant API LoadBalancer Service surfaces both.


🩺 Stability & Availability | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Talos trustd protocol UDP TCP port 50001

💡 Result:

In Talos Linux, trustd is a system daemon responsible for managing certificate operations and establishing trust within a cluster [1][2]. It acts as a certificate signing authority on control plane nodes [1][3]. When a new worker node joins a cluster, it sends a certificate signing request (CSR) to the trustd service on a control plane node; upon authentication via a machine token, trustd signs the request using the cluster's operating system certificate authority (CA) and returns the signed certificate to the worker node [1]. The trustd service communicates over TCP port 50001 [4][1][5]. This port is used by worker nodes to reach the control plane for certificate signing [4][1]. Because this port is essential for cluster bootstrapping and node joining, it is not currently configurable [4][6]. To ensure stability, Talos reserves ports 50000 (apid) and 50001 (trustd) from the ephemeral port range to prevent them from being occupied by other outgoing connections [7].

Citations:


🌐 Web query:

Talos node enrollment bootstrap gRPC protocol

💡 Result:

In Talos Linux, node enrollment and cluster bootstrapping are API-driven processes facilitated by a gRPC interface [1][2]. There is no "bootstrap protocol" in the traditional sense; instead, the system relies on a secure, authenticated gRPC API exposed by the apid service running on every node [1].

1. Node Configuration (Enrollment): Before a node can participate in a cluster, it must acquire its machine configuration [3]. This is typically done during the initial boot, where the node retrieves its configuration via mechanisms like kernel arguments (talos.config), disk-embedded state, or an API-driven push from an orchestration platform (e.g., Omni or Cluster API) [3]. Once configured, the node starts the apid service on port 50000, which requires mutual TLS (mTLS) for all subsequent management operations [1][2].
2. Cluster Bootstrapping: Bootstrapping is the specific operation that initializes the etcd cluster and the Kubernetes control plane on the first designated control plane node [4]. This is performed by issuing a Bootstrap gRPC request to the node's API [5][6].
   - The operation is triggered via talosctl bootstrap or through an equivalent programmatic gRPC call [2][6].
   - The request is sent to the sidero.talos.machine.v1.MachineService (or similar depending on the version/resource definition) [5].
   - Upon receiving this request, the node initializes etcd, starts the Kubernetes API server, and brings up the remaining control plane components [4].
3. Discovery: Separately, Talos nodes use a cluster discovery mechanism to find peers before and during bootstrap [7][8]. This uses a distinct gRPC service (often sidero.discovery.server.Cluster) where nodes register their existence and query for other cluster members [9]. Discovery is vital for formation but is conceptually distinct from the administrative bootstrap operation that triggers etcd initialization [7][4].

All management interaction, including bootstrapping, is authenticated using mTLS with certificates generated during the cluster's initial PKI setup [1][2]. The Talos machinery SDK (github.com/siderolabs/talos/pkg/machinery) provides the official gRPC client wrappers used to implement these operations programmatically [10].

Citations:


🏁 Script executed:

cd design-proposals/kubernetes-nodes-split && cat -n README.md | sed -n '50,70p'

Repository: cozystack/community

Length of output: 1807


🏁 Script executed:

cd design-proposals/kubernetes-nodes-split && cat -n README.md | sed -n '180,195p'

Repository: cozystack/community

Length of output: 1967


Fix incorrect Talos trustd protocol specification.

The endpoint must be TCP/50001, not UDP/50001. Talos trustd (and the gRPC-based CSR signing handshake) operates exclusively over TCP; UDP would break node enrollment.

Suggested fix
- ... exposed alongside the API server on the tenant API endpoint (UDP/50001 for Talos `trustd`).
+ ... exposed alongside the API server on the tenant API endpoint (TCP/50001 for Talos `trustd`).

## Open questions

1. **Kamaji provider patch upstream timeline.** Aim to merge into `clastix/cluster-api-control-plane-provider-kamaji`. Cozystack carries a fork in the meantime. Track issue/PR link here once filed.
2. **Per-pool talos-csr-signer vs cluster-wide.** Currently proposed as a single sidecar in the tenant's `KamajiControlPlane` Pod (cluster-wide token). Should each pool have its own token for blast-radius isolation? Operationally heavier; security gain limited because tokens only grant the right to obtain a Talos machine cert, not Kubernetes API access.


🔒 Security & Privacy | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "README.md" -path "*kubernetes-nodes-split*" -type f

Repository: cozystack/community

Length of output: 209


🏁 Script executed:

cat -n ./design-proposals/kubernetes-nodes-split/README.md | sed -n '170,210p'

Repository: cozystack/community

Length of output: 4066


🏁 Script executed:

# Get more context about the talos-csr-signer and the broader architecture
head -n 187 ./design-proposals/kubernetes-nodes-split/README.md | tail -n 100

Repository: cozystack/community

Length of output: 9205


🏁 Script executed:

# Let's search for more details about how the talos-csr-signer token is used and what capabilities it provides
grep -n "talos-csr-signer\|TALOS_TOKEN\|token\|security" ./design-proposals/kubernetes-nodes-split/README.md | head -30

Repository: cozystack/community

Length of output: 3204


🏁 Script executed:

# Look at the entire Security section and context around token usage
sed -n '155,160p' ./design-proposals/kubernetes-nodes-split/README.md

Repository: cozystack/community

Length of output: 805


🌐 Web query:

Talos trustd token security implications machine certificate authority

💡 Result:

In the context of Talos Linux, trustd is a critical system service responsible for certificate issuance and renewal for worker nodes [1]. It acts as a Certificate Authority (CA) client, enabling worker nodes to authenticate with the control plane and obtain necessary certificates for Talos API communications [1]. The security implications of the trustd service and its associated join token are significant:

1. Privilege Escalation via Join Tokens: The join token, stored in the machine configuration on worker nodes, is used to authenticate a node to the trustd service [2][1]. If an attacker gains unauthorized access to this token (e.g., via a misconfigured Kubernetes workload with hostPath mounts), they could potentially construct a malicious Certificate Signing Request (CSR) [2].
2. Historical Vulnerabilities: A notable vulnerability, CVE-2022-36103, demonstrated that older versions of trustd failed to properly validate these CSRs, allowing an attacker to request certificates with elevated RBAC roles, such as os:admin, which grant full control over the Talos API and cluster [3][4]. This was addressed by ensuring trustd no longer signs client authentication requests or allows certain fields (like the Organization field) that could lead to unauthorized privilege escalation [3].
3. Protection of the CA Key: The trustd service on control plane nodes manages the underlying CA private key [1]. Because possession of this key allows for the issuance of certificates for any identity within the cluster, it is highly sensitive [1]. Talos protects this by running within an immutable operating system, restricting shell access, and managing the key through controlled, authenticated API calls rather than flat file exposure [1].
4. Security Best Practices: To mitigate these risks, it is recommended to enable Pod Security Standards to prevent unauthorized access to host resources, secure the delivery of machine configuration (avoiding insecure cloud metadata paths), and ensure that Talos nodes are kept updated to receive security patches related to certificate issuance logic [2][4].

Note: The term trustd is also used by other unrelated projects, including mobile security solutions [5][6] and various cryptocurrency tokens [7][8], which are distinct from the Talos Linux system service.

Citations:


Cluster-wide talos-csr-signer token creates broader blast radius across tenant pools.

A single shared token in the KamajiControlPlane Pod allows any compromise of one pool's environment to mint machine certificates for the entire tenant cluster. If a workload with elevated permissions (cluster-admin, secret reader, or host access) is compromised, the attacker can extract the cluster-wide token and use it to bootstrap or hijack machines in other pools. The document's framing—that tokens "only grant the right to obtain a Talos machine cert"—understates the cross-pool risk. Per-pool token isolation does impose operational overhead, but it meaningfully limits the blast radius to a single pool, which is the real trust boundary here.


@kvaps Andrei Kvapil (kvaps) merged commit fbfc6ba into cozystack:main Jun 24, 2026
2 checks passed
myasnikovdaniil added a commit to cozystack/cozystack that referenced this pull request Jun 26, 2026
## What this PR does

> **Supersedes #2610.** Same work, rebased cleanly onto current `main`
so it now sits on top of #2867 (the `kube-controller-manager` VAP
type-checker fix). That fix is what was killing the base install in
#2610's E2E *before* the `kubernetes` app test ever ran, leaving the
Talos worker-bootstrap rewrite verified only by code reading. #2610's
67-commit branch is squashed here into 8 logical commits with original
authorship preserved (@kvaps and @IvanHunters as authors); the tree is
**byte-identical** to #2610's head — no code changes, just the rebase
onto `main` (to pick up #2867) plus history cleanup. The detailed review
history lives on #2610.

Phase 1 of the Kubernetes-app split design (see cozystack/community#8):
replace the Ubuntu+kubeadm worker bootstrap path with Talos, driven by
`cluster-api-bootstrap-provider-talos` (CABPT) and a
`clastix/talos-csr-signer` sidecar embedded in the Kamaji control-plane
pod. Existing tenants keep working — old machines roll out and get
replaced by Talos workers without manual intervention.

Highlights:

- Add CABPT (v0.6.12) as a second `BootstrapProvider` alongside the
existing kubeadm one in `capi-providers-bootstrap`.
- Bump the Kamaji control-plane provider with an upstream-bound patch
that exposes `KamajiControlPlane.spec.network.additionalServicePorts`,
used to publish trustd (50001/TCP) on the apiserver service.
- Generate Talos PKI (Ed25519 CA + trustd TLS) via cert-manager and
stable random Talos secrets (`token`, `clusterId`, `clusterSecret`,
`bootstrapToken`) using the helm lookup-and-reuse pattern.
- Materialise a kubeadm-format `bootstrap-token-<id>` Secret inside the
tenant `kube-system` via a Helm post-install/upgrade Job.
- Render a `TalosConfigTemplate` (worker machineconfig, `generateType:
none`) and switch the `MachineDeployment.bootstrap` reference from
`KubeadmConfigTemplate` to `TalosConfigTemplate`, gated on Talos secrets
and the Kamaji apiserver service being ready.
- Boot workers from the Talos openstack image via a CDI `DataVolume`
(`source.http.url` → `factory.talos.dev`). Expose the system disk as
virtio-blk with `blockSize.custom: logical=512, physical=4096`, so that
4Ki-native block storage backends (LINSTOR/DRBD) accept QEMU's
O_DIRECT writes while SeaBIOS can still boot. Drop the separate kubelet
disk, since Talos lays out EPHEMERAL itself.
- Pin pod/service CIDRs in the worker machineconfig to the tenant ranges
(`10.243.0.0/16` / `10.95.0.0/16`) to avoid Talos's address-overlap
diagnostic against the host pod CIDR.
- Set `cluster.controlPlane.endpoint` and `cilium.k8sServiceHost` to the
Kamaji apiserver Service ClusterIP (looked up at render time), so
bootstrap survives the chicken-and-egg moment where tenant DNS does not
exist yet and the host CoreDNS does not serve the tenant zone.

End-to-end on a dev cluster: a fresh tenant brings up a Talos worker,
kubelet CSRs get approved (apiserver-client by Kamaji, kubelet-serving
by the talos-csr-signer sidecar), Cilium initialises with hostNetwork
against the apiserver ClusterIP, CoreDNS comes up, and the node
transitions to `Ready`.
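
The CIDR pinning listed in the highlights can be sanity-checked before rollout. The sketch below is illustrative only: the tenant ranges are the ones quoted in this PR, while `HOST_CIDRS` is a hypothetical pair of host-cluster ranges chosen for the example and should be replaced with the real values of the management cluster.

```python
import ipaddress
from typing import Iterable, List, Tuple


def overlapping(
    candidates: Iterable[str], reserved: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return every (candidate, reserved) pair whose address ranges overlap."""
    reserved_list = list(reserved)
    pairs = []
    for cand in candidates:
        cand_net = ipaddress.ip_network(cand)
        for res in reserved_list:
            if cand_net.overlaps(ipaddress.ip_network(res)):
                pairs.append((cand, res))
    return pairs


# Tenant pod/service ranges as pinned in the worker machineconfig (from the PR).
TENANT_CIDRS = ["10.243.0.0/16", "10.95.0.0/16"]
# Hypothetical host-cluster pod/service ranges, assumed for this example.
HOST_CIDRS = ["10.244.0.0/16", "10.96.0.0/16"]

conflicts = overlapping(TENANT_CIDRS, HOST_CIDRS)
unpinned_conflicts = overlapping(["10.244.0.0/16"], HOST_CIDRS)
```

With these example values `conflicts` is empty, whereas a tenant pod range equal to the assumed host pod range is reported in `unpinned_conflicts`.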

### Release note

```release-note
feat(kubernetes): bootstrap tenant workers with Talos Linux instead of Ubuntu+kubeadm. Existing clusters roll over to Talos workers automatically; new tenants come up on Talos from the start. Powered by cluster-api-bootstrap-provider-talos and the talos-csr-signer sidecar in the Kamaji control-plane pod. Note: the tenant `kubernetes` HelmRelease now reports Ready as soon as Helm completes (Install/Upgrade DisableWait) and no longer blocks on worker/addon readiness — worker-rollout health is tracked by WorkloadMonitor, not the HelmRelease.
```


## Summary by CodeRabbit

## Release Notes

**New Features**
* Added Talos worker configuration for `talos.version` and
`talos.schematicID`
* Introduced `nodeHealthCheck` (`maxUnhealthy`, `nodeStartupTimeout`)
for worker remediation tuning
* Added image overrides for `kubectl` and `talosCsrSigner`
* Added support for exposing additional Service ports on the tenant control plane

**Breaking Changes**
* Removed Kubernetes v1.30; supported versions start at v1.31
* Renamed `nodeGroups.ephemeralStorage` → `nodeGroups.diskSize` with
consolidated Talos system-disk semantics

**Removals**
* Removed Ubuntu container-disk images and related containerd patch
mechanisms
* Air-gapped tenant workers temporarily unsupported during the Talos
transition

**Documentation**
* Updated chart docs, parameters, and GitOps upgrade guidance for
version bumps