design-proposal: migrate kubernetes workers to Talos and split into kubernetes-nodes#8
Propose extracting node pools from the kubernetes application into a sibling kubernetes-nodes application, modelled on the vm-instance/vm-disk split. Add a backend abstraction that supports the existing KubeVirt+kubeadm flow alongside new Talos backends: KubeVirt+Talos via clastix/talos-csr-signer, and cloud-talos for Hetzner and Azure without Cluster API. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Code Review
This pull request proposes a significant architectural change to split the monolithic kubernetes package into separate control-plane and node-pool components, enabling independent lifecycles and multi-backend support for KubeVirt and cloud-native Talos workers. The review feedback identifies a technical error regarding the gRPC protocol (TCP vs UDP), suggests consolidating the node-lifecycle-controller and Talos tokens at the cluster level for better efficiency and simpler configuration, and requests further implementation details for the dependency discovery mechanism in the proposed admission webhook.
Renders the same CAPI/CAPK objects as above, but with a `TalosConfigTemplate` (from `cluster-api-bootstrap-provider-talos`) replacing `KubeadmConfigTemplate`. Worker VMs boot from a Talos image. Bootstrap fetches the Talos machineconfig from CAPI and joins the cluster via standard Talos PKI.

The tenant's `KamajiControlPlane` carries an `additionalContainers` entry running `clastix/talos-csr-signer` listening on UDP/50001, exposed alongside `:6443` on the tenant API LoadBalancer. This is what allows `talosctl` to operate against worker nodes whose control-plane is Kamaji rather than Talos.

When `cluster-autoscaler` scales a `cloud-talos-*` pool down, it deletes the cloud VM. The tenant's apiserver still has a `Node` object that will linger until something deletes it. CAPI was previously the agent doing this; without CAPI, we need an equivalent.

The `node-lifecycle-controller` from `cozystack/local-ccm` is a good fit for this role. The `kubernetes-nodes` chart for `cloud-talos-*` backends renders an NLC Deployment that runs in the management cluster but uses a kubeconfig pointing to the **tenant** apiserver. It watches Node objects with the `ToBeDeletedByClusterAutoscaler:NoSchedule` taint and removes them after a configurable grace period and unreachability check.
Deploying a separate node-lifecycle-controller (NLC) for each kubernetes-nodes release leads to redundant processes watching the same tenant API server. A more efficient approach would be to manage a single NLC instance per tenant cluster (e.g., within the kubernetes control-plane chart) that handles node cleanup for all associated node pools.
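The cleanup rule quoted above (a `ToBeDeletedByClusterAutoscaler:NoSchedule` taint, a grace period, and an unreachability check) can be sketched as a pure decision function. This is a minimal sketch under stated assumptions, not the `node-lifecycle-controller` implementation: the `Node` and `Taint` shapes, the `ready` flag standing in for the unreachability check, and measuring the grace period from a `time_added` timestamp are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Tuple

TAINT_KEY = "ToBeDeletedByClusterAutoscaler"
TAINT_EFFECT = "NoSchedule"


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    time_added: datetime  # assumption: the grace period is measured from here


@dataclass(frozen=True)
class Node:
    name: str
    taints: Tuple[Taint, ...]
    ready: bool  # assumption: simplified stand-in for the unreachability check


def should_delete(node: Node, now: datetime, grace: timedelta) -> bool:
    """Return True only if the node carries the scale-down taint, the grace
    period has elapsed since the taint was added, and the node is not ready."""
    for taint in node.taints:
        if taint.key == TAINT_KEY and taint.effect == TAINT_EFFECT:
            return (not node.ready) and (now - taint.time_added >= grace)
    return False
```

Because the function only reads Node state, the same rule would apply whether one controller runs per pool or per tenant cluster; only the set of nodes passed in would change.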
## Failure and edge cases

- **`kubernetes-nodes` HelmRelease created before its parent `kubernetes` HelmRelease** → chart `fail`s the render with a clear error message identifying the missing parent. No partial CAPI/autoscaler resources created.
- **Parent `kubernetes` HelmRelease deleted while children exist** → all `kubernetes-nodes` HelmReleases for that cluster fail subsequent reconciles. An admission webhook on `kubernetes` HelmRelease delete blocks the operation if any `kubernetes-nodes` references it.
The proposed admission webhook for blocking kubernetes HelmRelease deletion requires a mechanism to discover name-based dependencies. Clarifying where this logic will reside and how it will perform discovery would strengthen the proposal, especially since the linkage doesn't use standard Kubernetes owner references.
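One possible shape for the name-based discovery this comment asks about is sketched below. It is only an illustration: the proposal does not say where the parent reference is stored, so the `cluster` key, the flat dictionary representation of a HelmRelease, and the function names are all hypothetical.

```python
from typing import Dict, Iterable, List

NODES_CHART = "kubernetes-nodes"


def find_dependents(parent: str, namespace: str,
                    releases: Iterable[Dict[str, str]]) -> List[str]:
    """Names of kubernetes-nodes releases in `namespace` that point at `parent`."""
    return sorted(
        r["name"]
        for r in releases
        if r.get("namespace") == namespace
        and r.get("chart") == NODES_CHART
        and r.get("cluster") == parent  # hypothetical parent-reference field
    )


def review_delete(parent: str, namespace: str,
                  releases: Iterable[Dict[str, str]]) -> Dict[str, object]:
    """Deny deletion of the parent while any node pool still references it."""
    dependents = find_dependents(parent, namespace, releases)
    if dependents:
        return {
            "allowed": False,
            "message": f"{parent} is still referenced by: {', '.join(dependents)}",
        }
    return {"allowed": True, "message": ""}
```

Scoping the lookup to the parent's namespace keeps a same-named cluster in another tenant from blocking the delete.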
**System layer (chart-managed, not exposed to user):**

- Cluster CA, machine CA, apiserver endpoint: read at template time via `lookup` from the tenant's `KamajiControlPlane`.
- Talos token: generated once per pool, stored alongside the machineconfig.
Generating a TALOS_TOKEN per pool while running a single talos-csr-signer sidecar per cluster (as discussed in Open Question #3 on line 253) introduces a configuration challenge for the signer, as it must be aware of all active tokens. Using a single TALOS_TOKEN per tenant cluster would simplify the sidecar configuration and the kubernetes-nodes chart logic while maintaining sufficient isolation between tenants.
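The configuration difference described here can be illustrated with a small token-matching sketch. It does not reflect how `clastix/talos-csr-signer` is actually configured or implemented; the `match_pool` helper and the idea of a pool-to-token map are assumptions made only to show why per-pool tokens require the signer to know every active token, whereas a single cluster token reduces the map to one entry.

```python
import hmac
from typing import Dict, Optional


def match_pool(presented: str, tokens_by_pool: Dict[str, str]) -> Optional[str]:
    """Return the pool whose token equals `presented`, or None if none match.

    Each candidate is compared with hmac.compare_digest, and the loop does
    not exit early, so one comparison does not reveal which entry matched.
    """
    matched = None
    for pool, token in tokens_by_pool.items():
        if hmac.compare_digest(presented.encode(), token.encode()):
            matched = pool
    return matched
```

With per-pool tokens the map has to be updated every time a pool is added or removed; with one token per tenant cluster it stays fixed for the lifetime of the cluster.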
Adjust the proposal to reflect that the controller will be developed as an independent project under the kilo-io organization, per confirmed interest from Kilo maintainer @squat. Generalize the CRD from a tenant-specific TenantMeshLink to a tenant-agnostic ClusterMesh that references peer clusters through a map of kubeconfig Secrets. Move all tenant semantics into a dedicated Cozystack integration section that also accounts for the kubernetes-nodes split (PR cozystack#8) so a single ClusterMesh covers multi-location, multi-backend tenants. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Andrei Kvapil (kvaps)
left a comment
Action items from an internal meeting. Capturing here so the design can be revised before this lands.
1. Zone / location ownership model is undefined
The proposal does not specify whether a "location" (zone) is configured by the platform admin or by the tenant. This shapes the entire kubernetes-nodes API:
- Admin-owned location: we need a separate platform-level resource that declares per-zone settings (provider creds, base image, template, network config). Node pools reference it by name and the user never sees cloud-init contents or instance template details.
- User-owned location: the tenant gets more freedom to pick instance types, templates, disks — closer to the existing KubeVirt path.
These have very different trust and UX implications. The proposal must pick one (or describe an explicit hybrid) and bake it into the values shape.
2. Base image / template source for autoscaler-managed nodes
For cloud-talos-* backends the proposal describes machineconfig injection but is silent on where the base Talos image for the cloud comes from: pre-baked snapshot/VHD, Talos Image Factory schematic computed on the fly, or user-supplied image reference. This needs an explicit field with documented defaults and an operator workflow for adding new images.
3. Credentials when the user brings their own cloud
If we want to support tenants connecting their own cloud accounts (a direction raised in the meeting), the design must cover:
- Where per-tenant cloud credentials live (tenant-namespace Secret references in the `kubernetes-nodes` HelmRelease).
- How the management-cluster `cluster-autoscaler` reads tenant-specific credentials without privilege escalation.
- Trust boundary: a compromised tenant must not be able to leak or repurpose those credentials.
The current proposal implicitly assumes platform-supplied credentials. Add an explicit section on tenant-supplied credentials, or push to an open question with a clear statement.
4. Scheduling class semantics break for external backends
tenant.spec.scheduling (allowed and default scheduling classes) is enforced when KubeVirt VMs are scheduled in the management cluster. For cloud-talos-* backends there is no management-cluster pod, so the existing scheduling-class machinery has nothing to act on.
The proposal should:
- State explicitly that scheduling classes only apply to `kubevirt-*` backends.
- Decide whether equivalent constraints (per-tenant restrictions on which `kubernetes-nodes` configurations are allowed) should be lifted into `kubernetes-nodes` itself rather than relying on scheduling classes.
5. Backend abstraction: should it be a real contract, not just an enum?
Today backend.type is a string switch and the chart branches internally. As we add backends (KubeVirt-Talos, AWS, GCP, on-prem KVM), this becomes hard to maintain. Worth considering:
- A minimal contract any backend must implement (CRs produced, credentials interface, autoscaler integration shape).
- Documenting it so external contributors can add a backend without reading the entire chart.
- Possibly per-backend sub-charts.
Not blocking, but should be an explicit open question.
6. UI form generation per backend
The dashboard needs a different form per backend.type. The proposal should mention that the values schema is structured so the frontend can render per-backend forms cleanly (tagged-union shape, not a flat values bag with conditional fields).
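To make the tagged-union idea concrete, here is a minimal sketch in which `type` selects exactly one variant and each variant accepts only its own fields. The backend names come from this thread; every field name (`instance_type`, `server_type`, `location`) and the `parse_backend` helper are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass(frozen=True)
class KubevirtTalos:
    instance_type: str  # hypothetical field


@dataclass(frozen=True)
class CloudTalosHetzner:
    server_type: str  # hypothetical field
    location: str     # hypothetical field


Backend = Union[KubevirtTalos, CloudTalosHetzner]

# The tag value selects the variant; a form generator could iterate this map.
_VARIANTS = {
    "kubevirt-talos": KubevirtTalos,
    "cloud-talos-hetzner": CloudTalosHetzner,
}


def parse_backend(values: Dict[str, Any]) -> Backend:
    """Build the variant named by values['type'], rejecting foreign fields."""
    kind = values.get("type")
    if kind not in _VARIANTS:
        raise ValueError(f"unknown backend.type: {kind!r}")
    fields = {k: v for k, v in values.items() if k != "type"}
    try:
        return _VARIANTS[kind](**fields)
    except TypeError as exc:
        raise ValueError(f"invalid fields for {kind}: {exc}") from exc
```

In a flat values bag, a Hetzner-only field set on a KubeVirt pool would be silently ignored; here it is rejected, which is the property that lets a frontend render one form per variant.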
7. Tangential: management cluster on non-KubeVirt
The same kubernetes-nodes model could plausibly apply to the management cluster's own node pools (running it on Azure VMs, not just KubeVirt). The existing autoscaling docs already show the infra-level pattern. The proposal should:
- Explicitly say "out of scope — management cluster node management remains separate", or
- Note that the design is intentionally compatible with that future direction.
Will revise the proposal to address these points.
Drop the multi-backend design (cloud-talos-hetzner, cloud-talos-azure, LocationProfile, NLC, etc.) and rewrite around two phases of internal restructuring:
- Phase 1: replace Ubuntu+kubeadm worker bootstrap with Talos via CABPT, inside the existing monolithic chart, with no user-facing API change. Patch needed in cluster-api-control-plane-provider-kamaji to render the talos-csr-signer sidecar in TenantControlPlane.
- Phase 2: once workers are uniformly Talos, split the chart into kubernetes (control-plane) + kubernetes-nodes (per-pool). Single backend, no backend.type field.
Hybrid / external-cloud clusters are deferred to Phase 3, tracked separately as a follow-up draft proposal.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@design-proposals/kubernetes-nodes-split/README.md`:
- Around line 59-60: In the README.md file, locate the first bullet point
referencing the clastix/talos-csr-signer sidecar and change the protocol
specification from UDP/50001 to TCP/50001. Talos trustd operates exclusively
over TCP for gRPC-based CSR signing handshakes, so the protocol must be
corrected to reflect the actual TCP-based communication requirement.
- Line 188: Update the discussion in the document around the per-pool versus
cluster-wide talos-csr-signer token choice to better articulate the cross-pool
security risk. Specifically, revise the framing to clarify that a single
cluster-wide token in the KamajiControlPlane Pod creates a broader blast radius
where compromise of one pool's environment (through elevated workload
permissions or host access) enables minting of Talos machine certificates for
other pools. Acknowledge that while per-pool token isolation adds operational
overhead, it provides meaningful security benefit by limiting the blast radius
to individual pool trust boundaries, making this a more substantive trade-off
than currently presented in the document.
📒 Files selected for processing (1)
design-proposals/kubernetes-nodes-split/README.md
- The tenant's `KamajiControlPlane` gains a sidecar entry running `clastix/talos-csr-signer`, exposed alongside the API server on the tenant API endpoint (UDP/50001 for Talos `trustd`).
- `KamajiControlPlane` exposed-ports configuration extended to include `:50001` alongside `:6443` so the tenant API LoadBalancer Service surfaces both.
🩺 Stability & Availability | 🔴 Critical
🧩 Analysis chain
🌐 Web query:
Talos trustd protocol UDP TCP port 50001
💡 Result:
In Talos Linux, trustd is a system daemon responsible for managing certificate operations and establishing trust within a cluster [1][2]. It acts as a certificate signing authority on control plane nodes [1][3]. When a new worker node joins a cluster, it sends a certificate signing request (CSR) to the trustd service on a control plane node; upon authentication via a machine token, trustd signs the request using the cluster's operating system certificate authority (CA) and returns the signed certificate to the worker node [1]. The trustd service communicates over TCP port 50001 [4][1][5]. This port is used by worker nodes to reach the control plane for certificate signing [4][1]. Because this port is essential for cluster bootstrapping and node joining, it is not currently configurable [4][6]. To ensure stability, Talos reserves ports 50000 (apid) and 50001 (trustd) from the ephemeral port range to prevent them from being occupied by other outgoing connections [7].
Citations:
- 1: https://oneuptime.com/blog/post/2026-03-03-understand-the-trustd-service-in-talos-linux/view
- 2: https://siderolabs-fe86397c.mintlify.app/talos/v1.13/learn-more/components
- 3: https://oneuptime.com/blog/post/2026-03-03-understand-talos-linux-trust-and-pki-model/view
- 4: https://siderolabs-fe86397c.mintlify.app/talos/v1.13/learn-more/talos-network-connectivity
- 5: https://oneuptime.com/blog/post/2026-03-03-configure-firewall-rules-in-talos-linux/view
- 6: https://docs.siderolabs.com/talos/v1.11/learn-more/talos-network-connectivity
- 7: siderolabs/talos@4adcda0f5
🌐 Web query:
Talos node enrollment bootstrap gRPC protocol
💡 Result:
In Talos Linux, node enrollment and cluster bootstrapping are API-driven processes facilitated by a gRPC interface [1][2]. There is no "bootstrap protocol" in the traditional sense; instead, the system relies on a secure, authenticated gRPC API exposed by the apid service running on every node [1].
1. Node Configuration (Enrollment): Before a node can participate in a cluster, it must acquire its machine configuration [3]. This is typically done during the initial boot, where the node retrieves its configuration via mechanisms like kernel arguments (talos.config), disk-embedded state, or an API-driven push from an orchestration platform (e.g., Omni or Cluster API) [3]. Once configured, the node starts the apid service on port 50000, which requires mutual TLS (mTLS) for all subsequent management operations [1][2].
2. Cluster Bootstrapping: Bootstrapping is the specific operation that initializes the etcd cluster and the Kubernetes control plane on the first designated control plane node [4]. This is performed by issuing a Bootstrap gRPC request to the node's API [5][6].
   - The operation is triggered via talosctl bootstrap or through an equivalent programmatic gRPC call [2][6].
   - The request is sent to the sidero.talos.machine.v1.MachineService (or similar depending on the version/resource definition) [5].
   - Upon receiving this request, the node initializes etcd, starts the Kubernetes API server, and brings up the remaining control plane components [4].
3. Discovery: Separately, Talos nodes use a cluster discovery mechanism to find peers before and during bootstrap [7][8]. This uses a distinct gRPC service (often sidero.discovery.server.Cluster) where nodes register their existence and query for other cluster members [9]. Discovery is vital for formation but is conceptually distinct from the administrative bootstrap operation that triggers etcd initialization [7][4].
All management interaction, including bootstrapping, is authenticated using mTLS with certificates generated during the cluster's initial PKI setup [1][2]. The Talos machinery SDK (github.com/siderolabs/talos/pkg/machinery) provides the official gRPC client wrappers used to implement these operations programmatically [10].
Citations:
- 1: https://oneuptime.com/blog/post/2026-03-03-understand-talos-linux-grpc-api-architecture/view
- 2: https://oneuptime.com/blog/post/2026-03-03-understand-the-talos-linux-api-driven-model/view
- 3: https://docs.siderolabs.com/talos/v1.12/configure-your-talos-cluster/system-configuration/acquire
- 4: https://oneuptime.com/blog/post/2026-03-03-bootstrap-kubernetes-on-talos-linux/view
- 5: https://docs.siderolabs.com/talos/v1.9/reference/api
- 6: https://github.com/siderolabs/talos/blob/v1.13.4/pkg/cluster/bootstrap.go
- 7: https://oneuptime.com/blog/post/2026-03-03-configure-cluster-discovery-in-talos-linux/view
- 8: https://oneuptime.com/blog/post/2026-03-03-understand-talos-linux-cluster-discovery/view
- 9: https://oneuptime.com/blog/post/2026-03-03-understand-discovery-service-architecture-in-talos/view
- 10: https://deepwiki.com/siderolabs/cluster-api-control-plane-provider-talos/3.6-talos-linux-integration
🏁 Script executed:
cd design-proposals/kubernetes-nodes-split && cat -n README.md | sed -n '50,70p'
Repository: cozystack/community
Length of output: 1807
🏁 Script executed:
cd design-proposals/kubernetes-nodes-split && cat -n README.md | sed -n '180,195p'
Repository: cozystack/community
Length of output: 1967
Fix incorrect Talos trustd protocol specification.
The endpoint must be TCP/50001, not UDP/50001. Talos trustd (and the gRPC-based CSR signing handshake) operates exclusively over TCP; UDP would break node enrollment.
Suggested fix
- ... exposed alongside the API server on the tenant API endpoint (UDP/50001 for Talos `trustd`).
+ ... exposed alongside the API server on the tenant API endpoint (TCP/50001 for Talos `trustd`).
📝 Committable suggestion
- The tenant's `KamajiControlPlane` gains a sidecar entry running `clastix/talos-csr-signer`, exposed alongside the API server on the tenant API endpoint (UDP/50001 for Talos `trustd`).
- `KamajiControlPlane` exposed-ports configuration extended to include `:50001` alongside `:6443` so the tenant API LoadBalancer Service surfaces both.
- The tenant's `KamajiControlPlane` gains a sidecar entry running `clastix/talos-csr-signer`, exposed alongside the API server on the tenant API endpoint (TCP/50001 for Talos `trustd`).
- `KamajiControlPlane` exposed-ports configuration extended to include `:50001` alongside `:6443` so the tenant API LoadBalancer Service surfaces both.
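Since the finding is that the `trustd` endpoint must be reachable over TCP, a quick way to check an exposed endpoint is a plain TCP connect. The helper below is an illustrative sketch using only the Python standard library; the `tcp_port_open` name and the example host are hypothetical. It only verifies that something accepts TCP connections on the port, not that the listener is `trustd` or that the TLS and gRPC layers work.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A call such as `tcp_port_open("tenant-api.example.internal", 50001)` would then distinguish a Service that publishes the port as TCP from one that was declared with the wrong protocol.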
## Open questions

1. **Kamaji provider patch upstream timeline.** Aim to merge into `clastix/cluster-api-control-plane-provider-kamaji`. Cozystack carries a fork in the meantime. Track issue/PR link here once filed.
2. **Per-pool talos-csr-signer vs cluster-wide.** Currently proposed as a single sidecar in the tenant's `KamajiControlPlane` Pod (cluster-wide token). Should each pool have its own token for blast-radius isolation? Operationally heavier; security gain limited because tokens only grant the right to obtain a Talos machine cert, not Kubernetes API access.
🔒 Security & Privacy | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
find . -name "README.md" -path "*kubernetes-nodes-split*" -type f
Repository: cozystack/community
Length of output: 209
🏁 Script executed:
cat -n ./design-proposals/kubernetes-nodes-split/README.md | sed -n '170,210p'
Repository: cozystack/community
Length of output: 4066
🏁 Script executed:
# Get more context about the talos-csr-signer and the broader architecture
head -n 187 ./design-proposals/kubernetes-nodes-split/README.md | tail -n 100
Repository: cozystack/community
Length of output: 9205
🏁 Script executed:
# Let's search for more details about how the talos-csr-signer token is used and what capabilities it provides
grep -n "talos-csr-signer\|TALOS_TOKEN\|token\|security" ./design-proposals/kubernetes-nodes-split/README.md | head -30
Repository: cozystack/community
Length of output: 3204
🏁 Script executed:
# Look at the entire Security section and context around token usage
sed -n '155,160p' ./design-proposals/kubernetes-nodes-split/README.md
Repository: cozystack/community
Length of output: 805
🌐 Web query:
Talos trustd token security implications machine certificate authority
💡 Result:
In the context of Talos Linux, trustd is a critical system service responsible for certificate issuance and renewal for worker nodes [1]. It acts as a Certificate Authority (CA) client, enabling worker nodes to authenticate with the control plane and obtain necessary certificates for Talos API communications [1]. The security implications of the trustd service and its associated join token are significant:
1. Privilege Escalation via Join Tokens: The join token, stored in the machine configuration on worker nodes, is used to authenticate a node to the trustd service [2][1]. If an attacker gains unauthorized access to this token (e.g., via a misconfigured Kubernetes workload with hostPath mounts), they could potentially construct a malicious Certificate Signing Request (CSR) [2].
2. Historical Vulnerabilities: A notable vulnerability, CVE-2022-36103, demonstrated that older versions of trustd failed to properly validate these CSRs, allowing an attacker to request certificates with elevated RBAC roles, such as os:admin, which grant full control over the Talos API and cluster [3][4]. This was addressed by ensuring trustd no longer signs client authentication requests or allows certain fields (like the Organization field) that could lead to unauthorized privilege escalation [3].
3. Protection of the CA Key: The trustd service on control plane nodes manages the underlying CA private key [1]. Because possession of this key allows for the issuance of certificates for any identity within the cluster, it is highly sensitive [1]. Talos protects this by running within an immutable operating system, restricting shell access, and managing the key through controlled, authenticated API calls rather than flat file exposure [1].
4. Security Best Practices: To mitigate these risks, it is recommended to enable Pod Security Standards to prevent unauthorized access to host resources, secure the delivery of machine configuration (avoiding insecure cloud metadata paths), and ensure that Talos nodes are kept updated to receive security patches related to certificate issuance logic [2][4].

Note: The term trustd is also used by other unrelated projects, including mobile security solutions [5][6] and various cryptocurrency tokens [7][8], which are distinct from the Talos Linux system service.
Citations:
- 1: https://oneuptime.com/blog/post/2026-03-03-understand-the-trustd-service-in-talos-linux/view
- 2: GHSA-7hgc-php5-77qq
- 3: siderolabs/talos@9eaf33f
- 4: https://nvd.nist.gov/vuln/detail/CVE-2022-36103
- 5: https://traced.app/zero-trust-mobile-security/
- 6: https://appsource.microsoft.com/en-us/product/web-apps/tracedltd1617114857192.trustd_mtd?tab=overview
- 7: https://www.steemit.com/cryptocurrency/@robin.evergreen/trustd-ted-token-details-and-ico-review
- 8: https://thebittimes.com/token-TRUST%2Bdai-BASE-0x57113B114e8690af0AE12bC75F5D7F8c5F36c784.html
Cluster-wide talos-csr-signer token creates broader blast radius across tenant pools.
A single shared token in the KamajiControlPlane Pod allows any compromise of one pool's environment to mint machine certificates for the entire tenant cluster. If a workload with elevated permissions (cluster-admin, secret reader, or host access) is compromised, the attacker can extract the cluster-wide token and use it to bootstrap or hijack machines in other pools. The document's framing—that tokens "only grant the right to obtain a Talos machine cert"—understates the cross-pool risk. Per-pool token isolation does impose operational overhead, but it meaningfully limits the blast radius to a single pool, which is the real trust boundary here.
## What this PR does

> **Supersedes #2610.** Same work, rebased cleanly onto current `main` so it now sits on top of #2867 (the `kube-controller-manager` VAP type-checker fix). That fix is what was killing the base install in #2610's E2E *before* the `kubernetes` app test ever ran, leaving the Talos worker-bootstrap rewrite verified only by code reading. #2610's 67-commit branch is squashed here into 8 logical commits with original authorship preserved (@kvaps and @IvanHunters as authors); the tree is **byte-identical** to #2610's head: no code changes, just the rebase onto `main` (to pick up #2867) plus history cleanup. The detailed review history lives on #2610.

Phase 1 of the Kubernetes-app split design (see cozystack/community#8): replace the Ubuntu+kubeadm worker bootstrap path with Talos, driven by `cluster-api-bootstrap-provider-talos` (CABPT) and a `clastix/talos-csr-signer` sidecar embedded in the Kamaji control-plane pod. Existing tenants keep working: old machines roll out and get replaced by Talos workers without manual intervention.

Highlights:

- Add CABPT (v0.6.12) as a second `BootstrapProvider` alongside the existing kubeadm one in `capi-providers-bootstrap`.
- Bump the Kamaji control-plane provider with an upstream-bound patch that exposes `KamajiControlPlane.spec.network.additionalServicePorts`, used to publish trustd (50001/TCP) on the apiserver service.
- Generate Talos PKI (Ed25519 CA + trustd TLS) via cert-manager and stable random Talos secrets (`token`, `clusterId`, `clusterSecret`, `bootstrapToken`) using the helm lookup-and-reuse pattern.
- Materialise a kubeadm-format `bootstrap-token-<id>` Secret inside the tenant `kube-system` via a Helm post-install/upgrade Job.
- Render a `TalosConfigTemplate` (worker machineconfig, `generateType: none`) and switch the `MachineDeployment.bootstrap` reference from `KubeadmConfigTemplate` to `TalosConfigTemplate`, gated on Talos secrets and the Kamaji apiserver service being ready.
- Boot workers from the Talos openstack image via a CDI `DataVolume` (`source.http.url` → `factory.talos.dev`), expose the system disk as virtio-blk with `blockSize.custom: logical=512, physical=4096` so 4Ki-native block storage backends (LINSTOR/DRBD) play nicely with QEMU's O_DIRECT writes and SeaBIOS still boots, drop the separate kubelet disk (Talos lays out EPHEMERAL itself).
- Pin pod/service CIDRs in the worker machineconfig to the tenant ranges (`10.243.0.0/16` / `10.95.0.0/16`) to avoid Talos's address-overlap diagnostic against the host pod CIDR.
- Set `cluster.controlPlane.endpoint` and `cilium.k8sServiceHost` to the Kamaji apiserver Service ClusterIP (looked up at render time), so bootstrap survives the chicken-and-egg moment where tenant DNS does not exist yet and the host CoreDNS does not serve the tenant zone.

End-to-end on a dev cluster: a fresh tenant brings up a Talos worker, kubelet CSRs get approved (apiserver-client by Kamaji, kubelet-serving by the talos-csr-signer sidecar), Cilium initialises with hostNetwork against the apiserver ClusterIP, CoreDNS comes up, and the node transitions to `Ready`.

### Release note

```release-note
feat(kubernetes): bootstrap tenant workers with Talos Linux instead of Ubuntu+kubeadm. Existing clusters roll over to Talos workers automatically; new tenants come up on Talos from the start. Powered by cluster-api-bootstrap-provider-talos and the talos-csr-signer sidecar in the Kamaji control-plane pod. Note: the tenant `kubernetes` HelmRelease now reports Ready as soon as Helm completes (Install/Upgrade DisableWait) and no longer blocks on worker/addon readiness; worker-rollout health is tracked by WorkloadMonitor, not the HelmRelease.
```

## Summary by CodeRabbit

## Release Notes

**New Features**
* Added Talos worker configuration for `talos.version` and `talos.schematicID`
* Introduced `nodeHealthCheck` (`maxUnhealthy`, `nodeStartupTimeout`) for worker remediation tuning
* Added image overrides for `kubectl` and `talosCsrSigner`
* Exposed additional tenant control-plane Service ports support

**Breaking Changes**
* Removed Kubernetes v1.30; supported versions start at v1.31
* Renamed `nodeGroups.ephemeralStorage` → `nodeGroups.diskSize` with consolidated Talos system-disk semantics

**Removals**
* Removed Ubuntu container-disk images and related containerd patch mechanisms
* Air-gapped tenant workers temporarily unsupported during the Talos transition

**Documentation**
* Updated chart docs, parameters, and GitOps upgrade guidance for version bumps
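The "lookup-and-reuse" pattern mentioned in the highlights (generate a random secret once, then reuse whatever already exists on later renders) can be sketched outside Helm as follows. This is an illustration under stated assumptions, not the chart's template code: a plain dictionary stands in for the cluster lookup, `lookup_or_generate` is a hypothetical name, and the hex strings are placeholders rather than the real Talos or kubeadm token formats.

```python
import secrets
from typing import Dict, MutableMapping

# Field names taken from the PR description above.
FIELDS = ("token", "clusterId", "clusterSecret", "bootstrapToken")


def lookup_or_generate(store: MutableMapping[str, Dict[str, str]],
                       name: str) -> Dict[str, str]:
    """Reuse every field already stored under `name`; generate only the
    missing ones, then persist the result so later calls are stable."""
    existing = store.get(name, {})
    result = {
        field: existing.get(field) or secrets.token_hex(16)  # placeholder format
        for field in FIELDS
    }
    store[name] = result
    return result
```

The point of the pattern is idempotence: re-rendering must not rotate a value that running workers already depend on, which is why existing fields always win over freshly generated ones.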
Summary
A two-phase reshape of the `kubernetes` application:
- Phase 1: replace the Ubuntu+kubeadm worker bootstrap with Talos via `cluster-api-bootstrap-provider-talos`, inside the existing monolithic chart. No user-facing API change. Seamless migration via standard CAPI MachineDeployment rolling update. Requires a patch to `clastix/cluster-api-control-plane-provider-kamaji` to render the `talos-csr-signer` sidecar in the `TenantControlPlane`.
- Phase 2: split the chart into `kubernetes` (control-plane only) + `kubernetes-nodes` (one per pool), modelled on the `vm-instance`/`vm-disk` precedent. Single backend (`kubevirt-talos`), no `backend.type` field yet.

Hybrid clusters (workers in external clouds, BYO, bare-metal) are deferred as Phase 3 to a separate placeholder draft: #9.
Supersedes an earlier scope that tried to land multiple backends and the split in one shot — see commit history for the rescoping rationale.
Test plan
Implementation testing is scoped in the proposal:
- `kubernetes-nodes` chart rendering
- `kubernetes` HelmRelease with `nodeGroups` migrated to split shape