-
Notifications
You must be signed in to change notification settings - Fork 155
fix: LiteLLM cache key collision and silent fallback to non-existent default model #647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
awschmeder
wants to merge
9
commits into
Zoo-Code-Org:main
Choose a base branch
from
awschmeder:fix/litellm-model-desync
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
4203a2d
fix: LiteLLM provider model desync after Sync Models or settings navi…
awschmeder fd44f2a
fix: address CodeRabbit review comments on litellm model desync
awschmeder c505a6b
chore: remove ineffective codeql suppression directives
awschmeder bf83be7
test(webviewMessageHandler): correct LiteLLM credential priority asse…
awschmeder a28a834
fix(security): derive cache-key discriminator without flagged API-key…
awschmeder da41468
fix(security): derive cache filename digest via KDF to clear residual…
awschmeder 5a2a2d8
test: cover all getCacheKey scoping branches in modelCache
awschmeder 2f37af8
fix: resolve check-types errors in modelCache tests
awschmeder 58ec401
fix: address PR review for LiteLLM cache key collision fix
awschmeder File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| --- | ||
| "zoo-code": patch | ||
| --- | ||
|
|
||
| Fix LiteLLM provider cache key collision, credential priority, and model-selection fallback to non-existent default. | ||
|
|
||
| Two bugs are addressed: | ||
|
|
||
| 1. **Cache key collision**: All URL-scoped providers (LiteLLM, Ollama, LM Studio, Poe, DeepSeek, | ||
| Requesty) previously shared one cache entry keyed only on the provider name. Switching between | ||
| profiles backed by different servers silently served the wrong model list and the stale list | ||
| persisted across VS Code restarts via the disk cache. Fixed with a compound cache key: | ||
| URL-scoped providers use `provider:baseUrl`; key-scoped providers (LiteLLM, Poe, Requesty) | ||
| additionally include a short, irreversible discriminator derived from the API key | ||
| (`provider:baseUrl:<discriminator>`) so that two different API keys on the same server never share | ||
| a cache entry (relevant when the server enforces per-key model allowlists). Both the discriminator | ||
| and the on-disk filename digest are derived via truncated PBKDF2 so neither can be reversed to | ||
| identify the API key written to the cache filename. The `RouterProvider.getModel()` cold-start | ||
| fallback is also corrected to pass the full options so it resolves the same compound key. | ||
|
|
||
| 2. **Silent fallback to hardcoded default**: When the LiteLLM model list was empty (due to the | ||
| collision above, a failed sync, or a transient error), `useSelectedModel` reset the configured | ||
| model ID to `claude-3-7-sonnet-20250219` -- a model that typically does not exist on user | ||
| LiteLLM servers. Four sub-fixes: preserve the configured model ID when the list is empty; | ||
| invalidate the React Query router-models cache after a successful "Sync Models" click; pass the | ||
| current LiteLLM credentials in the debounced `requestRouterModels` message; and correct the | ||
| credential priority in `webviewMessageHandler.ts` so that message values (current unsaved field | ||
| state) take precedence over stale saved config, matching the pattern already used for DeepSeek. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The in-flight dedup test uses
openrouter, whose cache key is the bare provider name. Could a companion test with a key-scoped provider (e.g.requesty) verify that two concurrent calls with different API keys each trigger their own fetch, while two calls with the same key share a single in-flight promise?