Skip to content

Add forwarded_client_claims support to confidential client and managed identity flows#937

Open
Robbie-Microsoft wants to merge 9 commits into
devfrom
rginsburg/client_claims
Open

Add forwarded_client_claims support to confidential client and managed identity flows#937
Robbie-Microsoft wants to merge 9 commits into
devfrom
rginsburg/client_claims

Conversation

@Robbie-Microsoft

@Robbie-Microsoft Robbie-Microsoft commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Port of msal-dotnet PR 5999 (WithClaimsFromClient) into msal-python.

Adds a forwarded_client_claims keyword argument that forwards client-originated claims (claims that originate from the caller rather than from a server-issued challenge) to ESTS / IMDS.

In .NET, WithClaimsFromClient is defined on the confidential-client builder base, so it applies to every confidential client flow. This port matches that scope:

Confidential client — msal/application.py

  • acquire_token_for_client
  • acquire_token_on_behalf_of (OBO)
  • acquire_token_by_user_federated_identity_credential (user FIC)
  • acquire_token_by_authorization_code
  • acquire_token_silent / acquire_token_silent_with_error (cache-read isolation + refresh-token request merge)

Managed identity — msal/managed_identity.py

  • ManagedIdentityClient.acquire_token_for_client (IMDS / Azure VM source only)

Key difference from claims_challenge

Unlike claims_challenge (server-issued, which bypasses the cache), forwarded_client_claims tokens are cached, and the cache entry is keyed on the claims value. This reuses the existing _compute_ext_cache_key cache-isolation mechanism (the same pattern as fmi_path): claims stays excluded from the extended cache key, while the forwarded claims participate in it. Isolation is bidirectional — a request carrying forwarded claims never reads a plain cached token, and a plain request never reads a forwarded-claims token.

Because the value is part of the cache key, callers must send the same forwarded_client_claims value on every request that should share a cached token; omitting or changing it routes to a different cache entry (a cache miss).

Naming

The public keyword is forwarded_client_claims. An earlier revision named it client_claims, which collided with the unrelated ClientApplication.__init__ client_claims parameter — a dict of extra claims signed into the client-assertion JWT. Two reviewers flagged the collision, so the per-request parameter was renamed to forwarded_client_claims (a JSON string of claims forwarded in the token request) to make the two distinct concepts unambiguous. Only the public keyword changed; the internal request-data key ("client_claims") that drives cache keying and the wire-strip is unchanged, so caching and wire behavior are identical.

Changes

Area Change
msal/token_cache.py New _parse_claims_or_raise, _deep_merge_dict, _merge_claims helpers. _parse_claims_or_raise raises a friendly ValueError for both malformed JSON and non-string input (it catches the TypeError from json.loads too). claims stays excluded from the extended cache key; the forwarded claims participate. _compute_ext_cache_key matches MSAL .NET's ComputeAccessTokenExtCacheKey byte-for-byte (sorted, separator-less key+value concatenation → SHA-256 → base64url).
msal/oauth2cli/oauth2.py Strips the cache-key-only client_claims pseudo-parameter from the wire body, while preserving it for the cache-add event.
msal/application.py New shared _stash_client_claims() helper validates forwarded_client_claims (must be a JSON string) and stashes it into the request data (so it contributes to the extended cache key and is merged into the OAuth claims parameter, then stripped from the wire). Used consistently by every confidential client flow above, including acquire_token_for_client. The silent path stashes the value for cache-read isolation and merges it into the refresh-token request; the RT-refresh path pops data once before the candidate-RT loop so the value applies across all candidate RTs. Docstrings cross-reference the constructor client_claims parameter to keep the two distinct.
msal/managed_identity.py MI supports forwarded_client_claims on the IMDS (Azure VM) source only (sent as the claims query parameter); other sources raise a clear error. The source is validated before the cache read so an unsupported source never returns a cached client-claims token. The claims value is forwarded as-is (no client-side key allow-list). Non-string input is rejected with a ValueError, consistent with the confidential-client flows.

Hardening (cross-MSAL review)

Five fixes were applied after comparing the review feedback on the sibling MSAL PRs — go #629, java #1039, js #8686:

  1. No MSIv1 allow-list. Removed the client-side xms_az_nwperimid-only validation; any JSON-object claims value is forwarded as-is and IMDS decides which keys it accepts (matching go/java).
  2. Validate MI source before cache read. Unsupported managed-identity sources (Service Fabric, App Service, Machine Learning, Azure Arc) are rejected up front; _obtain_token keeps its per-source guards as a backstop.
  3. Conflict precedence. Added tests documenting that on a direct leaf conflict between claims_challenge and forwarded_client_claims, the client-originated value wins (merged last), while disjoint claims are preserved.
  4. Generic docs. Dropped the first-party xms_az_nwperimid example from public docstrings in favor of generic "client-originated claims" wording.
  5. Send-on-every-request. Docstrings now state explicitly that the same value must be sent on every request that should share the cached token (it is part of the cache key).

Cache-key encoding. _compute_ext_cache_key intentionally matches MSAL .NET's ComputeAccessTokenExtCacheKey (sorted, separator-less key+value concatenation) for byte-parity with .NET. MSAL Go's CacheExtKeyGenerator later switched to a length-prefixed encoding (go #629) to make it injective and avoid a theoretical component-boundary collision; that change was deliberately not adopted here so msal-python and MSAL .NET stay byte-identical. Caches are not shared across languages at runtime.

Behavior notes

  • The raw claims value is never echoed in validation error messages (it may be sensitive).
  • Different forwarded_client_claims values produce separate cache entries — use stable, non-dynamic values (sent on every request) to avoid unbounded cache growth.
  • forwarded_client_claims merges with capability-derived claims and claims_challenge into a single claims parameter; the client-originated value wins on a direct conflict.
  • A non-string value raises a ValueError on every flow (confidential client and managed identity).
  • No behavior change when the parameter is absent: the merge is a no-op and the wire/cache paths are unaffected.

Testing

Added unit tests covering each flow: wire shape (merged claims present, no client_claims body leak), input validation (non-string / invalid JSON, including non-str types), cache isolation (same value → cache hit; different value or plain request → isolated), the three-way merge of claims_challenge + client capabilities + forwarded_client_claims, conflict precedence, cross-MSAL cache-key byte-parity with MSAL .NET, and the MI source restrictions.

python -m pytest tests/test_token_cache.py tests/test_application.py tests/test_mi.py
# 202 passed

… flows

Port of msal-dotnet PR 5999 (WithClaimsFromClient). Adds a `client_claims`
keyword argument to `ConfidentialClientApplication.acquire_token_for_client`
and `ManagedIdentityClient.acquire_token_for_client` for forwarding
client-originated claims (e.g. the network security perimeter
`xms_az_nwperimid` claim) to ESTS/IMDS.

Unlike `claims_challenge` (server-issued, bypasses the cache), `client_claims`
tokens are cached and the cache entry is keyed on the claims value, reusing the
existing `_compute_ext_cache_key` mechanism (the `fmi_path` precedent).

- token_cache: add `_parse_claims_or_raise`, `_deep_merge_dict`, `_merge_claims`
  helpers; `claims` stays excluded from the ext cache key while `client_claims`
  participates in it.
- oauth2: strip the cache-key-only `client_claims` pseudo-parameter from the
  wire body while preserving it for the cache-add event.
- application: validate `client_claims`, merge it into the OAuth `claims`
  body parameter, and isolate the cache by claims value.
- managed_identity: support `client_claims` on the IMDS (Azure VM) source only
  (sent as the `claims` query parameter); other sources raise; MSIv1 restricts
  the claims JSON to only the `xms_az_nwperimid` key.

Adds unit tests covering cache isolation, wire shape, claim merging, source
restrictions, and MSIv1 validation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 26, 2026 18:24
@Robbie-Microsoft Robbie-Microsoft requested a review from a team as a code owner June 26, 2026 18:24

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for forwarding client-originated claims via a new client_claims argument in confidential-client and managed-identity client-credentials flows, while ensuring tokens remain cacheable and isolated by the claims value (distinct from server-issued claims_challenge behavior).

Changes:

  • Introduces shared claims parsing/merging helpers and uses client_claims to participate in the extended cache key (while keeping claims excluded).
  • Ensures client_claims is merged into the OAuth claims parameter but stripped from the actual HTTP request body (cache-key-only pseudo-parameter).
  • Adds managed identity support for client_claims on IMDS/Azure VM only (including MSIv1 validation), with unit tests covering caching and wire-shape behaviors.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
msal/token_cache.py Adds claims helpers and leverages existing ext-cache-key mechanism to isolate cached tokens by client_claims.
msal/oauth2cli/oauth2.py Prevents client_claims from being sent on the wire while preserving it for cache add events.
msal/application.py Adds client_claims to CCA acquire-token-for-client flow, validates/merges claims, and isolates cache entries by claims value.
msal/managed_identity.py Adds client_claims support for IMDS/Azure VM only, isolates cache by claims, and validates MSIv1 claim constraints.
tests/test_token_cache.py Adds unit tests for cache-key isolation and claims helper behavior.
tests/test_mi.py Adds managed-identity unit tests for IMDS forwarding, cache isolation, and unsupported source errors.
tests/test_application.py Adds CCA unit tests for wire shape, merging behavior, and cache isolation with client_claims.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread msal/managed_identity.py Outdated
Comment thread msal/application.py Outdated
… code, silent)

Phase 1 added client_claims to acquire_token_for_client. This extends it to the
remaining confidential client flows so client-originated claims are forwarded
and cache-isolated consistently, mirroring msal-dotnet PR 5999's
WithClaimsFromClient (which applies to all confidential client builders):

- acquire_token_on_behalf_of (OBO)
- acquire_token_by_user_federated_identity_credential (FIC)
- acquire_token_by_authorization_code
- acquire_token_silent / acquire_token_silent_with_error (cache-read isolation
  plus refresh-token request merge)

A shared _stash_client_claims() helper validates the value and stashes it into
the request data, so it (a) contributes to the extended cache key and (b) is
merged into the OAuth "claims" parameter while being stripped from the wire
body. Adds unit tests for each flow (wire shape, validation, cache isolation).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comment thread msal/application.py Outdated
Comment thread msal/application.py Outdated
Comment thread msal/managed_identity.py Outdated
Comment thread msal/token_cache.py
- token_cache._parse_claims_or_raise now also catches TypeError (raised when
  the input is not a str/bytes) and surfaces the same friendly ValueError, so
  every caller behaves consistently regardless of the bad input's type.
- ManagedIdentityClient.acquire_token_for_client now rejects non-string
  client_claims with a ValueError (mirroring the confidential-client flows),
  preventing a raw TypeError leak and inconsistent extended-cache-key hashing.
- ConfidentialClientApplication.acquire_token_for_client now reuses the shared
  _stash_client_claims() helper instead of duplicating the validate-and-stash
  logic, removing the risk of the two paths diverging.
- Add cross-referencing docstring notes disambiguating the per-request
  client_claims (a JSON string forwarded in the request) from the pre-existing
  constructor client_claims (a dict of claims signed into the client-assertion
  JWT).
- Add unit tests for non-string client_claims on managed identity and for
  non-string inputs to _parse_claims_or_raise.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread msal/application.py Outdated
Comment thread tests/test_application.py
Address reviewer feedback on PR #937:

- Rename the new per-request parameter `client_claims` ->
  `forwarded_client_claims` across all confidential-client flows
  (acquire_token_for_client, on_behalf_of, user FIC, auth code, silent)
  and the Managed Identity acquire_token_for_client. This removes the
  naming collision with the pre-existing `client_claims` *constructor*
  parameter (a dict signed into the client-assertion JWT), which a second
  reviewer also flagged as confusing.

  The public keyword is the only thing renamed. The internal request-data
  key "client_claims" (used by the oauth2 wire-strip, _compute_ext_cache_key
  cache isolation, and _merge_claims) and the private Managed Identity
  plumbing keep their existing names, so cache keying and wire behavior are
  unchanged.

- Add test_forwarded_client_claims_merged_with_claims_challenge, covering
  the previously untested three-way merge of server-issued claims_challenge
  + client capabilities + forwarded_client_claims into the single OAuth
  "claims" wire parameter.

200 passed across test_token_cache.py, test_mi.py, test_application.py.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 30, 2026 21:47

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment thread msal/application.py
"""

def acquire_token_for_client(self, scopes, claims_challenge=None, fmi_path=None, **kwargs):
def acquire_token_for_client(self, scopes, claims_challenge=None, fmi_path=None, forwarded_client_claims=None, **kwargs):

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligned — renamed the PR title to use forwarded_client_claims (the description was already updated last round). Parameter, title, and docs are now consistent.

Comment thread msal/application.py
Comment on lines 1272 to 1275
nonce=None,
claims_challenge=None,
forwarded_client_claims=None,
**kwargs):

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By design. It sits directly beside claims_challenge on the same base acquire_token_by_authorization_code, so it mirrors exactly how claims_challenge is already exposed on public clients via inheritance; a hard public-client guard would be inconsistent with that existing pattern. Adding forwarded_client_claims to the recommended acquire_token_by_auth_code_flow is a reasonable follow-up, but is left out of this .NET-parity port for now.

Robbie-Microsoft and others added 2 commits June 30, 2026 20:00
Apply six fixes derived from the sibling MSAL PRs (go #629, java #1039,
js #8686) to the forwarded_client_claims port:

1. Make _compute_ext_cache_key injective. Switch from separator-less
   key+value concatenation to length-prefixed pairs ("{len(k)}:{k}{len(v)}:{v}"),
   matching Go's post-collision-fix CacheExtKeyGenerator. Without this,
   fmi_path + client_claims (which now co-occur in acquire_token_for_client)
   could collide and return the wrong cached token. Adds boundary-collision
   regression tests. NOTE: hashes are now intentionally not byte-identical to
   MSAL .NET (which still uses unprefixed concat); caches are not shared across
   languages, so within-process injectivity is what matters.

2. Remove the MSIv1 client-side allow-list (_validate_msiv1_claims). Forward
   any JSON-object claims value as-is and let IMDS decide which keys it accepts,
   matching go/java.

3. Validate the managed-identity source before the cache read. Reject
   unsupported sources (Service Fabric, App Service, Machine Learning, Azure Arc)
   up front so an unsupported source never returns a cached client-claims token.
   _obtain_token keeps its per-source guards as a backstop.

4. Add merge-conflict precedence tests: on a direct leaf conflict the
   client-originated value wins (merged last); disjoint claims are preserved.

5. Drop the first-party xms_az_nwperimid example from public docstrings; use
   generic "client-originated claims" wording.

6. Document that the same forwarded_client_claims value must be sent on every
   request that should share the cached token (it is part of the cache key).

204 tests pass across test_token_cache.py, test_mi.py, test_application.py.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore _compute_ext_cache_key to MSAL .NET's ComputeAccessTokenExtCacheKey
encoding: sorted, separator-less key+value concatenation -> SHA-256 -> base64url.
This makes the ext cache key byte-identical to MSAL .NET again.

The earlier hardening commit (4fc3639) had switched to Go's post-#629
length-prefixed encoding to make the key injective. Per maintainer decision,
msal-python should match MSAL .NET, not current Go, so that change is reverted:

- token_cache.py: restore plain key+value concatenation; docstring now notes the
  .NET match and the deliberate divergence from Go's #629 length-prefixed form.
- test_token_cache.py: restore the .NET parity hashes (bns2ytmx..., 3-rg6_wy...,
  rn_gkpxx...) and rename the parity tests to *_matches_dotnet; remove the two
  length-prefix boundary-collision regression tests (they asserted the injective
  property that .NET's encoding does not provide).

Hardening fixes #2-#6 (MI allow-list removal, MI source pre-validation, merge
conflict-precedence tests, generic docs, send-on-every-request docs) are
unchanged. 202 tests pass across test_token_cache.py, test_mi.py,
test_application.py.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings July 1, 2026 16:41

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment thread msal/application.py
Comment thread msal/application.py

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment thread msal/oauth2cli/oauth2.py
Comment thread msal/managed_identity.py
Use kwargs.get("data", {}) instead of the truthiness form in acquire_token_silent and acquire_token_silent_with_error so a caller-provided empty mapping is not replaced with a fresh dict. Matches the fmi_path / for_client sibling sites. Per Copilot review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Robbie-Microsoft Robbie-Microsoft changed the title Add client_claims support to confidential client and managed identity flows Add forwarded_client_claims support to confidential client and managed identity flows Jul 1, 2026
The forwarded client_claims value (and the merged "claims") are kept in the cache-add event data only for ext_cache_key computation, but TokenCache.add() debug-logs event["data"] and previously masked only password/client_secret/refresh_token/assertion/user_fic. Add "client_claims" and "claims" to the masked fields so DEBUG logging cannot emit raw claim contents. Per Copilot review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings July 1, 2026 17:06

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Comment thread msal/token_cache.py
…te_ext_cache_key

Clarify that pseudo-parameters like client_claims intentionally feed the extended cache key hash while being stripped from the wire, so different client-originated claims route to separate cache entries. Doc-only. Per Copilot review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants