fix(litellm): store bookkeeping span off-band, not in forwarded metadata#6598
fix(litellm): store bookkeeping span off-band, not in forwarded metadata#6598jgreer013 wants to merge 1 commit into
Conversation
6fb4fa5 to
5a5f39b
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 5a5f39b. Configure here.
5a5f39b to
986f979
Compare
| # it. The terminal success/failure callback removes the entry; the registry is | ||
| # capped (oldest evicted first) so calls abandoned before a terminal callback | ||
| # fires -- e.g. a stream the caller stops iterating -- cannot grow it unbounded. | ||
| _MAX_TRACKED_SPANS = 1000 |
There was a problem hiding this comment.
Thanks for the PR!
I'll think about approaches because the _spans_by_call will grow if spans aren't finished correctly and everything is dropped if _MAX_TRACKED_SPANS is reached.
Generally, if there's a public shared reference between input and output callbacks, we should use that. If not, your approach could be the best we can do.
There was a problem hiding this comment.
It seems like there isn't one in this case, but perhaps there's an opportunity to add one with a bigger rework?
There was a problem hiding this comment.
Actually, turns out there is one — litellm threads the same model_call_details dict through all three callbacks, so I moved the span onto that (off the forwarded metadata path) instead of the module-level registry. No cap or eviction now, and the span's lifetime is tied to the request, like the clickhouse/dramatiq integrations do it. I double-checked the Anthropic passthrough doesn't forward it — the request body only gets the recognized params, not model_call_details. Lmk if you had a different reference in mind!
With LiteLLMIntegration enabled, any call passing caller `metadata` crashed during request serialization. `_input_callback` stored the live Span in the caller's `metadata` dict, and some providers (e.g. Anthropic's /v1/messages passthrough) forward that dict into the outbound request body, so `json.dumps(request_body)` raised `TypeError: Object of type Span is not JSON serializable` before the request was sent. The span (holding the verbatim prompt under send_default_pii) could also leak to the provider. Stash the span on a top-level key of the per-request kwargs dict (litellm's `model_call_details`) that litellm threads through the input/success/failure callbacks, instead of in the forwarded `metadata` sub-dict. This ties the span's lifetime to the request with no module-level tracking, mirroring how the clickhouse/dramatiq integrations stash a span on their per-request object. The Anthropic request body is built only from recognized request params, not from `model_call_details`, so the span is never serialized onto the wire (verified end-to-end against the passthrough). Fixes getsentry#6596 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
986f979 to
dbb2139
Compare

Summary
Fixes #6596 (Linear PY-2540).
With
LiteLLMIntegrationenabled, any call that passes callermetadatacould crash during request serialization:Root cause
_input_callbackstored the live span via_get_metadata_dict(kwargs)["_sentry_span"] = span. litellm threads the caller'smetadatadict through tolitellm_params["metadata"]— it's the same dict object the caller passed — and some providers (e.g. Anthropic's/v1/messagespassthrough) forward that dict into the outbound request body. So the liveSpanlanded atrequest_body["metadata"]["_sentry_span"]andjson.dumps(request_body)failed before the request was sent.Separately, under
send_default_pii=True+include_prompts=True, that span'sgen_ai.request.messagesholds the verbatim prompt, so any sink that serialized the injected metadata could leak prompt content to the provider.Fix
Store the bookkeeping span off-band in a module-level registry keyed by
litellm_call_id(a per-request UUID that is stable across the input/success/failure callbacks), falling back to the identity of the shared callbackkwargsdict for direct callback invocations that omit it. The span no longer lives in any litellm-visible dict, so it can't be forwarded, serialized, or deep-copied by litellm — fixing both the crash and the prompt-leak vector. The registry entry is removed by the terminal success/failure callback (streaming success peeks and pops only on the final call).Testing
test_caller_metadata_stays_json_serializableasserts the forwardedmetadatastays free of_sentry_spanand JSON-serializable at the serialization moment, while the span is still recorded off-band. It fails onmasterand passes with this change.mypyclean;ruffcheck + format clean.Related (downstream, defense-in-depth)
litellm forwards/serializes this injected metadata into the provider body. Companion issue: BerriAI/litellm#30662. litellm has stripped such span objects in other logging paths (BerriAI/litellm#15728, BerriAI/litellm#12354).