Add forwarded_client_claims support to confidential client and managed identity flows#937
Add forwarded_client_claims support to confidential client and managed identity flows#937Robbie-Microsoft wants to merge 9 commits into
Conversation
… flows Port of msal-dotnet PR 5999 (WithClaimsFromClient). Adds a `client_claims` keyword argument to `ConfidentialClientApplication.acquire_token_for_client` and `ManagedIdentityClient.acquire_token_for_client` for forwarding client-originated claims (e.g. the network security perimeter `xms_az_nwperimid` claim) to ESTS/IMDS. Unlike `claims_challenge` (server-issued, bypasses the cache), `client_claims` tokens are cached and the cache entry is keyed on the claims value, reusing the existing `_compute_ext_cache_key` mechanism (the `fmi_path` precedent). - token_cache: add `_parse_claims_or_raise`, `_deep_merge_dict`, `_merge_claims` helpers; `claims` stays excluded from the ext cache key while `client_claims` participates in it. - oauth2: strip the cache-key-only `client_claims` pseudo-parameter from the wire body while preserving it for the cache-add event. - application: validate `client_claims`, merge it into the OAuth `claims` body parameter, and isolate the cache by claims value. - managed_identity: support `client_claims` on the IMDS (Azure VM) source only (sent as the `claims` query parameter); other sources raise; MSIv1 restricts the claims JSON to only the `xms_az_nwperimid` key. Adds unit tests covering cache isolation, wire shape, claim merging, source restrictions, and MSIv1 validation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds support for forwarding client-originated claims via a new client_claims argument in confidential-client and managed-identity client-credentials flows, while ensuring tokens remain cacheable and isolated by the claims value (distinct from server-issued claims_challenge behavior).
Changes:
- Introduces shared claims parsing/merging helpers and uses
client_claimsto participate in the extended cache key (while keepingclaimsexcluded). - Ensures
client_claimsis merged into the OAuthclaimsparameter but stripped from the actual HTTP request body (cache-key-only pseudo-parameter). - Adds managed identity support for
client_claimson IMDS/Azure VM only (including MSIv1 validation), with unit tests covering caching and wire-shape behaviors.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
msal/token_cache.py |
Adds claims helpers and leverages existing ext-cache-key mechanism to isolate cached tokens by client_claims. |
msal/oauth2cli/oauth2.py |
Prevents client_claims from being sent on the wire while preserving it for cache add events. |
msal/application.py |
Adds client_claims to CCA acquire-token-for-client flow, validates/merges claims, and isolates cache entries by claims value. |
msal/managed_identity.py |
Adds client_claims support for IMDS/Azure VM only, isolates cache by claims, and validates MSIv1 claim constraints. |
tests/test_token_cache.py |
Adds unit tests for cache-key isolation and claims helper behavior. |
tests/test_mi.py |
Adds managed-identity unit tests for IMDS forwarding, cache isolation, and unsupported source errors. |
tests/test_application.py |
Adds CCA unit tests for wire shape, merging behavior, and cache isolation with client_claims. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
… code, silent) Phase 1 added client_claims to acquire_token_for_client. This extends it to the remaining confidential client flows so client-originated claims are forwarded and cache-isolated consistently, mirroring msal-dotnet PR 5999's WithClaimsFromClient (which applies to all confidential client builders): - acquire_token_on_behalf_of (OBO) - acquire_token_by_user_federated_identity_credential (FIC) - acquire_token_by_authorization_code - acquire_token_silent / acquire_token_silent_with_error (cache-read isolation plus refresh-token request merge) A shared _stash_client_claims() helper validates the value and stashes it into the request data, so it (a) contributes to the extended cache key and (b) is merged into the OAuth "claims" parameter while being stripped from the wire body. Adds unit tests for each flow (wire shape, validation, cache isolation). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- token_cache._parse_claims_or_raise now also catches TypeError (raised when the input is not a str/bytes) and surfaces the same friendly ValueError, so every caller behaves consistently regardless of the bad input's type. - ManagedIdentityClient.acquire_token_for_client now rejects non-string client_claims with a ValueError (mirroring the confidential-client flows), preventing a raw TypeError leak and inconsistent extended-cache-key hashing. - ConfidentialClientApplication.acquire_token_for_client now reuses the shared _stash_client_claims() helper instead of duplicating the validate-and-stash logic, removing the risk of the two paths diverging. - Add cross-referencing docstring notes disambiguating the per-request client_claims (a JSON string forwarded in the request) from the pre-existing constructor client_claims (a dict of claims signed into the client-assertion JWT). - Add unit tests for non-string client_claims on managed identity and for non-string inputs to _parse_claims_or_raise. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address reviewer feedback on PR #937: - Rename the new per-request parameter `client_claims` -> `forwarded_client_claims` across all confidential-client flows (acquire_token_for_client, on_behalf_of, user FIC, auth code, silent) and the Managed Identity acquire_token_for_client. This removes the naming collision with the pre-existing `client_claims` *constructor* parameter (a dict signed into the client-assertion JWT), which a second reviewer also flagged as confusing. The public keyword is the only thing renamed. The internal request-data key "client_claims" (used by the oauth2 wire-strip, _compute_ext_cache_key cache isolation, and _merge_claims) and the private Managed Identity plumbing keep their existing names, so cache keying and wire behavior are unchanged. - Add test_forwarded_client_claims_merged_with_claims_challenge, covering the previously untested three-way merge of server-issued claims_challenge + client capabilities + forwarded_client_claims into the single OAuth "claims" wire parameter. 200 passed across test_token_cache.py, test_mi.py, test_application.py. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| """ | ||
|
|
||
| def acquire_token_for_client(self, scopes, claims_challenge=None, fmi_path=None, **kwargs): | ||
| def acquire_token_for_client(self, scopes, claims_challenge=None, fmi_path=None, forwarded_client_claims=None, **kwargs): |
There was a problem hiding this comment.
Aligned — renamed the PR title to use forwarded_client_claims (the description was already updated last round). Parameter, title, and docs are now consistent.
| nonce=None, | ||
| claims_challenge=None, | ||
| forwarded_client_claims=None, | ||
| **kwargs): |
There was a problem hiding this comment.
By design. It sits directly beside claims_challenge on the same base acquire_token_by_authorization_code, so it mirrors exactly how claims_challenge is already exposed on public clients via inheritance; a hard public-client guard would be inconsistent with that existing pattern. Adding forwarded_client_claims to the recommended acquire_token_by_auth_code_flow is a reasonable follow-up, but is left out of this .NET-parity port for now.
Apply six fixes derived from the sibling MSAL PRs (go #629, java #1039, js #8686) to the forwarded_client_claims port: 1. Make _compute_ext_cache_key injective. Switch from separator-less key+value concatenation to length-prefixed pairs ("{len(k)}:{k}{len(v)}:{v}"), matching Go's post-collision-fix CacheExtKeyGenerator. Without this, fmi_path + client_claims (which now co-occur in acquire_token_for_client) could collide and return the wrong cached token. Adds boundary-collision regression tests. NOTE: hashes are now intentionally not byte-identical to MSAL .NET (which still uses unprefixed concat); caches are not shared across languages, so within-process injectivity is what matters. 2. Remove the MSIv1 client-side allow-list (_validate_msiv1_claims). Forward any JSON-object claims value as-is and let IMDS decide which keys it accepts, matching go/java. 3. Validate the managed-identity source before the cache read. Reject unsupported sources (Service Fabric, App Service, Machine Learning, Azure Arc) up front so an unsupported source never returns a cached client-claims token. _obtain_token keeps its per-source guards as a backstop. 4. Add merge-conflict precedence tests: on a direct leaf conflict the client-originated value wins (merged last); disjoint claims are preserved. 5. Drop the first-party xms_az_nwperimid example from public docstrings; use generic "client-originated claims" wording. 6. Document that the same forwarded_client_claims value must be sent on every request that should share the cached token (it is part of the cache key). 204 tests pass across test_token_cache.py, test_mi.py, test_application.py. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore _compute_ext_cache_key to MSAL .NET's ComputeAccessTokenExtCacheKey encoding: sorted, separator-less key+value concatenation -> SHA-256 -> base64url. This makes the ext cache key byte-identical to MSAL .NET again. The earlier hardening commit (4fc3639) had switched to Go's post-#629 length-prefixed encoding to make the key injective. Per maintainer decision, msal-python should match MSAL .NET, not current Go, so that change is reverted: - token_cache.py: restore plain key+value concatenation; docstring now notes the .NET match and the deliberate divergence from Go's #629 length-prefixed form. - test_token_cache.py: restore the .NET parity hashes (bns2ytmx..., 3-rg6_wy..., rn_gkpxx...) and rename the parity tests to *_matches_dotnet; remove the two length-prefix boundary-collision regression tests (they asserted the injective property that .NET's encoding does not provide). Hardening fixes #2-#6 (MI allow-list removal, MI source pre-validation, merge conflict-precedence tests, generic docs, send-on-every-request docs) are unchanged. 202 tests pass across test_token_cache.py, test_mi.py, test_application.py. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use kwargs.get("data", {}) instead of the truthiness form in acquire_token_silent and acquire_token_silent_with_error so a caller-provided empty mapping is not replaced with a fresh dict. Matches the fmi_path / for_client sibling sites. Per Copilot review.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The forwarded client_claims value (and the merged "claims") are kept in the cache-add event data only for ext_cache_key computation, but TokenCache.add() debug-logs event["data"] and previously masked only password/client_secret/refresh_token/assertion/user_fic. Add "client_claims" and "claims" to the masked fields so DEBUG logging cannot emit raw claim contents. Per Copilot review. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…te_ext_cache_key Clarify that pseudo-parameters like client_claims intentionally feed the extended cache key hash while being stripped from the wire, so different client-originated claims route to separate cache entries. Doc-only. Per Copilot review. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Port of msal-dotnet PR 5999 (
WithClaimsFromClient) into msal-python.Adds a
forwarded_client_claimskeyword argument that forwards client-originated claims (claims that originate from the caller rather than from a server-issued challenge) to ESTS / IMDS.In .NET,
WithClaimsFromClientis defined on the confidential-client builder base, so it applies to every confidential client flow. This port matches that scope:Confidential client —
msal/application.pyacquire_token_for_clientacquire_token_on_behalf_of(OBO)acquire_token_by_user_federated_identity_credential(user FIC)acquire_token_by_authorization_codeacquire_token_silent/acquire_token_silent_with_error(cache-read isolation + refresh-token request merge)Managed identity —
msal/managed_identity.pyManagedIdentityClient.acquire_token_for_client(IMDS / Azure VM source only)Key difference from
claims_challengeUnlike
claims_challenge(server-issued, which bypasses the cache),forwarded_client_claimstokens are cached, and the cache entry is keyed on the claims value. This reuses the existing_compute_ext_cache_keycache-isolation mechanism (the same pattern asfmi_path):claimsstays excluded from the extended cache key, while the forwarded claims participate in it. Isolation is bidirectional — a request carrying forwarded claims never reads a plain cached token, and a plain request never reads a forwarded-claims token.Because the value is part of the cache key, callers must send the same
forwarded_client_claimsvalue on every request that should share a cached token; omitting or changing it routes to a different cache entry (a cache miss).Naming
The public keyword is
forwarded_client_claims. An earlier revision named itclient_claims, which collided with the unrelatedClientApplication.__init__client_claimsparameter — a dict of extra claims signed into the client-assertion JWT. Two reviewers flagged the collision, so the per-request parameter was renamed toforwarded_client_claims(a JSON string of claims forwarded in the token request) to make the two distinct concepts unambiguous. Only the public keyword changed; the internal request-data key ("client_claims") that drives cache keying and the wire-strip is unchanged, so caching and wire behavior are identical.Changes
msal/token_cache.py_parse_claims_or_raise,_deep_merge_dict,_merge_claimshelpers._parse_claims_or_raiseraises a friendlyValueErrorfor both malformed JSON and non-string input (it catches theTypeErrorfromjson.loadstoo).claimsstays excluded from the extended cache key; the forwarded claims participate._compute_ext_cache_keymatches MSAL .NET'sComputeAccessTokenExtCacheKeybyte-for-byte (sorted, separator-lesskey+valueconcatenation → SHA-256 → base64url).msal/oauth2cli/oauth2.pyclient_claimspseudo-parameter from the wire body, while preserving it for the cache-add event.msal/application.py_stash_client_claims()helper validatesforwarded_client_claims(must be a JSON string) and stashes it into the requestdata(so it contributes to the extended cache key and is merged into the OAuthclaimsparameter, then stripped from the wire). Used consistently by every confidential client flow above, includingacquire_token_for_client. The silent path stashes the value for cache-read isolation and merges it into the refresh-token request; the RT-refresh path popsdataonce before the candidate-RT loop so the value applies across all candidate RTs. Docstrings cross-reference the constructorclient_claimsparameter to keep the two distinct.msal/managed_identity.pyforwarded_client_claimson the IMDS (Azure VM) source only (sent as theclaimsquery parameter); other sources raise a clear error. The source is validated before the cache read so an unsupported source never returns a cached client-claims token. The claims value is forwarded as-is (no client-side key allow-list). Non-string input is rejected with aValueError, consistent with the confidential-client flows.Hardening (cross-MSAL review)
Five fixes were applied after comparing the review feedback on the sibling MSAL PRs — go #629, java #1039, js #8686:
xms_az_nwperimid-only validation; any JSON-object claims value is forwarded as-is and IMDS decides which keys it accepts (matching go/java)._obtain_tokenkeeps its per-source guards as a backstop.claims_challengeandforwarded_client_claims, the client-originated value wins (merged last), while disjoint claims are preserved.xms_az_nwperimidexample from public docstrings in favor of generic "client-originated claims" wording.Cache-key encoding.
_compute_ext_cache_keyintentionally matches MSAL .NET'sComputeAccessTokenExtCacheKey(sorted, separator-lesskey+valueconcatenation) for byte-parity with .NET. MSAL Go'sCacheExtKeyGeneratorlater switched to a length-prefixed encoding (go #629) to make it injective and avoid a theoretical component-boundary collision; that change was deliberately not adopted here so msal-python and MSAL .NET stay byte-identical. Caches are not shared across languages at runtime.Behavior notes
forwarded_client_claimsvalues produce separate cache entries — use stable, non-dynamic values (sent on every request) to avoid unbounded cache growth.forwarded_client_claimsmerges with capability-derived claims andclaims_challengeinto a singleclaimsparameter; the client-originated value wins on a direct conflict.ValueErroron every flow (confidential client and managed identity).Testing
Added unit tests covering each flow: wire shape (merged
claimspresent, noclient_claimsbody leak), input validation (non-string / invalid JSON, including non-strtypes), cache isolation (same value → cache hit; different value or plain request → isolated), the three-way merge ofclaims_challenge+ client capabilities +forwarded_client_claims, conflict precedence, cross-MSAL cache-key byte-parity with MSAL .NET, and the MI source restrictions.