Add initial threat model for substrate#303
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Co-authored-by: Vikas Kumar <skvikas@google.com> Co-authored-by: Oleg Mitrofanov <gooleg@google.com>
90e454b to 554aed5
|
Thanks for landing this in the repo. For some reason, I couldn't access the Google link that was shared. While reading it, I felt that some of these threats come from the architecture itself. I cannot share my Excalidraw link, but you will find a screenshot attached. If we move a few things, the list gets shorter. I put an alternative in the attached image; the reasoning behind it:
For me the hardest one is 4, the checkpoint, fork, and identity part. I do not have a fully clean answer there yet.
|
Tim Hockin (thockin) left a comment
It took me a while to get the swing of reading this. I left my early comments, but other than internal/external I don't think the rest matter.
* **ateom:** Per-worker Pod sidecar, running inside the worker Pod. Ateom sets up "interior" sandboxes in the worker Pod and manages sandbox lifecycle, including image pulls. It currently uses gvisor but Substrate will support multiple microvm solutions.
* **Worker:** Preprovisioned Pods that actors get scheduled to.
* **Actor:** The core compute primitive, gets scheduled to/from worker via Run for cold start and Resume for snapshot resume.
* **Actor IP:** Actor networking is based on Pod networking. Each actor gets the IP of the worker it's currently scheduled to. The ateom has the opportunity to set up additional rules when it sets up interior sandboxes.

I'm not sure it matters here, but I am not sure that an actor having "real" pod networking is going to mean anything. All traffic will be captured, so the touchpoint is near zero.
| GitHub Issue | Priority | Threat | Mitigating Invariants | Suggested Concrete Mitigations | Notes |
| :---- | :---- | :---- | :---- | :---- | :---- |
| | Critical | External attacker can access actors over the internet | Access to actors over the external internet is blocked. | Recommend use of infrastructure firewall to limit external ingress/egress in documentation (cloud specific). Use Kubernetes NetworkPolicy to limit external ingress/egress by default. Use additional network policy features depending on CNI. | |

I think this can be split up - I see the "internal" section below but no equivalent of this line.
External: Actors, workers, and the atenet routers are not directly exposed to the internet in most clusters. Doing so would require explicit configuration.
Internal: Many kube clusters are managed as first-class citizens of their local network, so peers on the internal network COULD reach them at an IP level. This is no different from any other workload running in Kubernetes.
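
For reference, the row under discussion suggests using Kubernetes NetworkPolicy to limit ingress/egress by default. A minimal sketch of what such a default could look like, assuming a hypothetical namespace name (`substrate-workers`) and policy name (`default-deny-all`) that do not come from the PR; the manifest fields themselves follow the standard `networking.k8s.io/v1` NetworkPolicy schema:

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every Pod in `namespace` and allows no traffic.

    An empty podSelector matches all Pods in the namespace. Listing both policy
    types with no ingress/egress rules means nothing is allowed until another
    policy explicitly permits it.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # Both names below are illustrative assumptions, not Substrate defaults.
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(default_deny_policy("substrate-workers"), indent=2))
```

Note that a policy like this only takes effect when the cluster's CNI enforces NetworkPolicy, and denying all egress also blocks DNS lookups unless a further policy allows them.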
| GitHub Issue | Priority | Threat | Mitigating Invariants | Suggested Concrete Mitigations | Notes |
| :---- | :---- | :---- | :---- | :---- | :---- |
| | Critical | External attacker can access actors over the internet | Access to actors over the external internet is blocked. | Recommend use of infrastructure firewall to limit external ingress/egress in documentation (cloud specific). Use Kubernetes NetworkPolicy to limit external ingress/egress by default. Use additional network policy features depending on CNI. | |
| | Critical | External attacker can access nodes over the internet | Access to nodes over the external internet is blocked. | Recommend use of infrastructure firewall to limit external ingress/egress in documentation (cloud specific). Use Kubernetes NetworkPolicy to limit external ingress/egress by default. Use additional network policy features depending on CNI. | |

Saying that access is "blocked" in all these cases feels wrong. It feels like it is implying that we did something when in actuality it's the "normal" disposition of Kubernetes. This sort of document is outside my core expertise -- is that distinction important?
| GitHub Issue | Priority | Threat | Mitigating Invariants | Suggested Concrete Mitigations | Notes |
| :---- | :---- | :---- | :---- | :---- | :---- |
| | Critical | Access to the internal network allows arbitrary actions to be performed on ate-apiserver, atelet, substrate backend database, etc. | All system components must have basic mutual authentication and authorization, and communicate over TLS. All clients (including end users and actors) must be authenticated and authorized. Unauthenticated traffic must be rejected. | mTLS or other secure channel (e.g. UDS) between networked system components (ate-apiserver, atelet, ateom, etc.). Each atelet has a unique identity cryptographically tied to the node identity. ate-router should check client permissions before resuming actors or forwarding traffic to actors. The only component authorized to connect directly to the backend database should be ate-apiserver. | |
| | High | Privilege escalation via access to sensitive labels. | If Substrate offers its own resource labeling mechanism, it must also offer a way to authorize label updates on a per-label basis. | Substrate authorization system requires explicit authorization to update metadata, separate from updating resource body. Substrate authorization system supports per-label authorization rules. | Plenty of attacks in K8s were possible because labels had semantic meaning, but the permission model could implicitly grant access to modify labels, even if it was inappropriate. For example, the /status subresource allows label updates. Substrate shouldn't repeat this mistake. |

This is sort of presuming what labels are for, but we don't really even have labels yet. I think the point about being more careful is important, though.
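
To make the "per-label authorization, separate from the resource body" idea in the row above concrete, here is a rough sketch. Substrate does not have labels yet, so everything here is hypothetical: the `Grants` type, its fields, and `authorize_update` are illustrative names, not an existing or proposed API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grants:
    """Hypothetical permission set for one caller on one resource."""
    update_body: bool = False
    # Label keys this caller may add, change, or remove.
    writable_labels: frozenset = field(default_factory=frozenset)


def changed_labels(old: dict, new: dict) -> set:
    """Return the label keys that were added, removed, or given a new value."""
    return {k for k in old.keys() | new.keys() if old.get(k) != new.get(k)}


def authorize_update(grants: Grants, old_labels: dict, new_labels: dict,
                     body_changed: bool) -> list:
    """Return the list of denied changes; an empty list means the update is allowed.

    Body and label permissions are checked independently, so permission to
    update the body never implies permission to touch a label.
    """
    denied = []
    if body_changed and not grants.update_body:
        denied.append("body")
    for key in sorted(changed_labels(old_labels, new_labels)):
        if key not in grants.writable_labels:
            denied.append(f"label:{key}")
    return denied
```

The design choice being illustrated is only that label writes are denied unless explicitly granted per key; how grants would be expressed or stored is left open.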
| GitHub Issue | Priority | Threat | Mitigating Invariants | Suggested Concrete Mitigations | Notes |
| :---- | :---- | :---- | :---- | :---- | :---- |
| | Critical | Access to the internal network allows arbitrary actions to be performed on ate-apiserver, atelet, substrate backend database, etc. | All system components must have basic mutual authentication and authorization, and communicate over TLS. All clients (including end users and actors) must be authenticated and authorized. Unauthenticated traffic must be rejected. | mTLS or other secure channel (e.g. UDS) between networked system components (ate-apiserver, atelet, ateom, etc.). Each atelet has a unique identity cryptographically tied to the node identity. ate-router should check client permissions before resuming actors or forwarding traffic to actors. The only component authorized to connect directly to the backend database should be ate-apiserver. | |
| | High | Privilege escalation via access to sensitive labels. | If Substrate offers its own resource labeling mechanism, it must also offer a way to authorize label updates on a per-label basis. | Substrate authorization system requires explicit authorization to update metadata, separate from updating resource body. Substrate authorization system supports per-label authorization rules. | Plenty of attacks in K8s were possible because labels had semantic meaning, but the permission model could implicitly grant access to modify labels, even if it was inappropriate. For example, the /status subresource allows label updates. Substrate shouldn't repeat this mistake. |
| | High | Attacker gains control of Substrate API server, router, or other ingress/egress proxy. | Isolate the control plane from the data plane, and isolate data plane ingress from sandboxes. | Don't co-locate ate-apiserver or other control-plane components on the same machines as the untrusted sandboxes. Consider running any gateway/router that enables direct interaction with sandboxes on a separate VM from the sandboxes, or using a zero-trust architecture where traffic is encrypted and authenticated end-to-end. | |

> Don't co-locate ate-apiserver or other control-plane components on the same machines as the untrusted sandboxes.

I think the point here should be clarified. While we trust the sandboxing tech, there's some ADDITIONAL security to be had by segregating. For large users, or users who run truly third-party code in sandboxes, the exposure risk of not segregating likely outweighs any potential benefits.
| GitHub Issue | Priority | Threat | Mitigating Invariants | Suggested Concrete Mitigations | Notes |
| :---- | :---- | :---- | :---- | :---- | :---- |
| | High | Privilege escalation via access to sensitive labels. | If Substrate offers its own resource labeling mechanism, it must also offer a way to authorize label updates on a per-label basis. | Substrate authorization system requires explicit authorization to update metadata, separate from updating resource body. Substrate authorization system supports per-label authorization rules. | Plenty of attacks in K8s were possible because labels had semantic meaning, but the permission model could implicitly grant access to modify labels, even if it was inappropriate. For example, the /status subresource allows label updates. Substrate shouldn't repeat this mistake. |
| | High | Attacker gains control of Substrate API server, router, or other ingress/egress proxy. | Isolate the control plane from the data plane, and isolate data plane ingress from sandboxes. | Don't co-locate ate-apiserver or other control-plane components on the same machines as the untrusted sandboxes. Consider running any gateway/router that enables direct interaction with sandboxes on a separate VM from the sandboxes, or using a zero-trust architecture where traffic is encrypted and authenticated end-to-end. | |
| | High | Attacker who can create ActorTemplates specifies malicious runtime. | Ensure available runtime can only be configured by administrators. | Consider a mechanism like RuntimeClass to decouple configuration of available runtimes from consumption of available runtimes. | |
| | High | Attacker who can create ActorTemplates can read or write any storage buckets atelet has access to. | Ensure that bucket access is least-privilege. | Use credentials derived from actor identity to read snapshots. Configure permissions to prevent atelet or nodes from having access to sensitive buckets. | For example: Attacker creates an ActorTemplate with the runsc URL or golden snapshot URL pointing to an arbitrary bucket in the same project/resource scope as the cluster. If atelet has project-wide access to buckets, this could cause the state to be pulled into the worker pod or malicious actor. Similarly, an attacker could set the snapshots URL to point to an internal infrastructure bucket, causing data to be written to that bucket. |

This suggests to me that having a snapshot URL is overly powerful. Why do we need more than an identifier which atelet can combine with other configuration to produce that URL?
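
A rough sketch of the identifier-based alternative raised in this comment, under stated assumptions: the function name `snapshot_url`, the identifier grammar, and the example bucket prefix are all hypothetical and not taken from Substrate. The idea is that the ActorTemplate carries only an opaque identifier, and the storage location comes from administrator-controlled configuration.

```python
import re

# Hypothetical identifier grammar: lowercase alphanumerics and hyphens,
# 1 to 63 characters, starting and ending with an alphanumeric. It
# deliberately excludes "/", ".", and ":" so an identifier can never
# name another bucket, scheme, or parent path.
_SNAPSHOT_ID = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def snapshot_url(admin_prefix: str, snapshot_id: str) -> str:
    """Combine an admin-configured bucket prefix with a user-supplied identifier.

    `admin_prefix` is assumed to come from trusted configuration, never from
    an ActorTemplate. Raises ValueError if the identifier is malformed.
    """
    if not _SNAPSHOT_ID.fullmatch(snapshot_id):
        raise ValueError(f"invalid snapshot identifier: {snapshot_id!r}")
    return f"{admin_prefix.rstrip('/')}/{snapshot_id}"
```

For example, `snapshot_url("gs://substrate-snapshots/prod/", "golden-v1")` yields `gs://substrate-snapshots/prod/golden-v1`, while an identifier such as `../other-bucket` or a full URL is rejected. Least-privilege bucket permissions would still be needed; this only removes the template author's ability to choose the bucket.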
| GitHub Issue | Priority | Threat | Mitigating Invariants | Suggested Concrete Mitigations | Notes |
| :---- | :---- | :---- | :---- | :---- | :---- |
| | Critical | Access to the internal network allows arbitrary actions to be performed on ate-apiserver, atelet, substrate backend database, etc. | All system components must have basic mutual authentication and authorization, and communicate over TLS. All clients (including end users and actors) must be authenticated and authorized. Unauthenticated traffic must be rejected. | mTLS or other secure channel (e.g. UDS) between networked system components (ate-apiserver, atelet, ateom, etc.). Each atelet has a unique identity cryptographically tied to the node identity. ate-router should check client permissions before resuming actors or forwarding traffic to actors. The only component authorized to connect directly to the backend database should be ate-apiserver. | |
| | High | Privilege escalation via access to sensitive labels. | If Substrate offers its own resource labeling mechanism, it must also offer a way to authorize label updates on a per-label basis. | Substrate authorization system requires explicit authorization to update metadata, separate from updating resource body. Substrate authorization system supports per-label authorization rules. | Plenty of attacks in K8s were possible because labels had semantic meaning, but the permission model could implicitly grant access to modify labels, even if it was inappropriate. For example, the /status subresource allows label updates. Substrate shouldn't repeat this mistake. |
| | High | Attacker gains control of Substrate API server, router, or other ingress/egress proxy. | Isolate the control plane from the data plane, and isolate data plane ingress from sandboxes. | Don't co-locate ate-apiserver or other control-plane components on the same machines as the untrusted sandboxes. Consider running any gateway/router that enables direct interaction with sandboxes on a separate VM from the sandboxes, or using a zero-trust architecture where traffic is encrypted and authenticated end-to-end. | |

This overlaps a bit with "Access to the internal network allows arbitrary actions to be performed on ate-apiserver"?
| GitHub Issue | Priority | Threat | Mitigating Invariants | Suggested Concrete Mitigations | Notes |
| :---- | :---- | :---- | :---- | :---- | :---- |
| | High | Improper handling of Secrets | Ensure there is an official, secure, recommended way to pass secret data, like API access tokens, to actors. | Support env and filesystem plumbing for Kubernetes Secrets, to provide an official path that avoids secret material being plumbed via nonspecific fields that are difficult to audit. Ensure secrets are encrypted in transit and ideally stored in memory. If exposed via the filesystem, do so via in-memory tmpfs. | If we don't support this, users will inevitably put secrets in plaintext. |
| | Medium | Complexity configuring permissions for frameworks on top of Substrate may lead to unintentional privilege escalation. | It must be clear to users what the downstream effects of auth config in substrate are. | If it's not intuitive, it must be documented in a user guide. | AI framework has to set up permissions to access ATE, and to access K8s, and potentially for actors (based on ATE identity and K8s identity) to access the framework. We need to make this easy. Think about past K8s issues like escalate/bind risk. Substrate resource model is spread across ate-apiserver and K8s, increasing complexity and chance for error. |
| | Medium | Flat namespace of actors encourages broad permission grants or complex graph-oriented policy. | Support a grouping mechanism that can be used in policy controls. | Add namespaces to Substrate, similar to Kubernetes. | |
| | Medium | DNS misconfiguration | Access to DNS configuration should be limited to authoritative controllers. Routing should use stable configurations and query the API for the current IP before routing each request. | Don't co-locate controllers with access to sensitive system state on the same nodes as actors. Limit permissions to update DNS configuration. Actively query Substrate API to ensure IPs are as up-to-date as possible. Potentially use mTLS based on actor DNS name between ate-router and actors. | As noted above, a flood of requests could create high read load on ate-apiserver. Caching could be considered, but cache invalidation during actor rescheduling would be important to avoid misrouting traffic. Establishing a backend mTLS tunnel between ate-router and each actor based on a serving cert signed for the actor's DNS name could be another approach to avoiding misrouting. |

Do we consider DNS info-leaks as a threat? E.g. I can brute-force discover the names of other namespaces and actors by probing DNS.
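
Separately, on the Notes column of the DNS row above, which weighs per-request API lookups against caching with invalidation on rescheduling: a small sketch of that trade-off, assuming a hypothetical `RouteCache` class and an injected `lookup` callable standing in for a query to the Substrate API. None of these names come from the PR.

```python
import time
from typing import Callable, Dict, Tuple


class RouteCache:
    """Cache actor-to-IP lookups with a TTL plus explicit invalidation."""

    def __init__(self, lookup: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._lookup = lookup          # authoritative source, e.g. an API query
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so tests can control time
        self._entries: Dict[str, Tuple[str, float]] = {}

    def resolve(self, actor: str) -> str:
        """Return the cached IP if still fresh, otherwise ask the authoritative source."""
        now = self._clock()
        hit = self._entries.get(actor)
        if hit is not None and now < hit[1]:
            return hit[0]
        ip = self._lookup(actor)
        self._entries[actor] = (ip, now + self._ttl)
        return ip

    def invalidate(self, actor: str) -> None:
        """Drop the entry; intended to be called when the actor is rescheduled."""
        self._entries.pop(actor, None)
```

The TTL only bounds how long a stale IP can be served if an invalidation is missed; it does not remove the misrouting window, which is why the row also mentions mTLS keyed on the actor's DNS name as another approach.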

This adds an initial threat model for Substrate. The threat model was originally authored in this Google Doc and shared on the mailing list.
Note that access to the doc is gated by mailing list membership: you must join the list to view it, and individual access requests can't be granted due to a policy restriction. But please comment on this PR at this point if you have feedback :).