Gemini: Toggleable Inference Routing #982
Conversation
📝 WalkthroughWalkthroughAdds Gemini route configuration, expands Google provider naming and routing, changes Vertex host construction, updates credential fallback handling, and adds tests plus routing documentation. ChangesGemini provider routing update
Estimated code review effort: 4 (Complex) | ~60 minutes Possibly related issues
Possibly related PRs
Suggested labels: Suggested reviewers: 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
OpenAPI changes 🟢 8 non-breaking changesTip Safe to merge from an API-contract perspective. Full changelog ·
|
| Method | Path | Change | |
|---|---|---|---|
| 🟢 | POST |
/api/v1/configs |
added the new google-vertex-native enum value to the request property config_blob/completion/anyOf[subschema #1: NativeCompletionConfig]/provider |
| 🟢 | POST |
/api/v1/llm/call |
added the new google-vertex-native enum value to the request property config/blob/anyOf[subschema #1: ConfigBlob]/completion/anyOf[subschema #1: NativeCompletionConfig]/provider |
| 🟢 | POST |
/api/v1/llm/chain |
added the new google-vertex-native enum value to the request property blocks/items/config/blob/anyOf[subschema #1: ConfigBlob]/completion/anyOf[subschema #1: NativeCompletionConfig]/provider |
| 🟢 | GET |
/api/v1/models |
added the enum value google-vertex to the property anyOf[subschema #1: Provider]/ of the query request parameter provider |
| 🟢 | PATCH |
/api/v1/models |
added the new google-vertex enum value to the request property items/provider |
| 🟢 | DELETE |
/api/v1/models/{provider}/{model_name} |
added the new enum value google-vertex to the path request parameter provider |
| 🟢 | GET |
/api/v1/models/{provider}/{model_name} |
added the new enum value google-vertex to the path request parameter provider |
| 🟢 | PATCH |
/api/v1/models/{provider}/{model_name} |
added the new enum value google-vertex to the path request parameter provider |
main ↔ 16639921 · generated by oasdiff
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/services/llm/providers/google_ai.py (1)
96-102: 🎯 Functional Correctness | 🔴 Critical | ⚡ Quick winCritical:
globallocation produces an invalid host and a 404.This unconditionally prefixes the location, so for
location="global"the host becomesglobal-aiplatform.googleapis.com, which does not resolve and returns 404. The accompanying testtest_global_location_uses_bare_hostinbackend/app/tests/services/llm/providers/test_google_ai.pyexplicitly asserts thegloballocation must use the bareaiplatform.googleapis.comhost (and that"global-aiplatform"must not appear), noting it caught a real outage. The current code will fail that test and breakglobal-configured projects.🐛 Proposed fix
- host = f"{self.location}-aiplatform.googleapis.com" + host = ( + "aiplatform.googleapis.com" + if self.location == "global" + else f"{self.location}-aiplatform.googleapis.com" + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/services/llm/providers/google_ai.py` around lines 96 - 102, The Google AI provider endpoint builder in endpoint currently always prefixes self.location into the host, which breaks the global case. Update endpoint(model) so that when location is "global" it uses the bare aiplatform.googleapis.com host, and only prefixes non-global locations; keep the rest of the URL construction unchanged. This should align with test_google_ai.py’s test_global_location_uses_bare_host and avoid generating any host containing "global-aiplatform".
🧹 Nitpick comments (1)
backend/app/core/config.py (1)
118-118: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winConsider constraining the type to a
Literalto validate at config-load time.
GEMINI_DEFAULT_INFERENCE_ROUTEis a free-formstr, so an invalid value is only caught later inregistry.get_provider_class(), and the allowed values"vertex"/"aistudio"are magic strings duplicated acrossconfig.pyandregistry.py. Mirroring theENVIRONMENTfield (Line 42) with aLiteral(or sharedEnum) lets pydantic reject bad values at startup and removes the duplicated literals.♻️ Proposed change
- GEMINI_DEFAULT_INFERENCE_ROUTE: str = "aistudio" + GEMINI_DEFAULT_INFERENCE_ROUTE: Literal["vertex", "aistudio"] = "aistudio"As per coding guidelines: "Do not use magic values in code; extract repeated literals into constants,
Enummembers, or settings."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/core/config.py` at line 118, The GEMINI_DEFAULT_INFERENCE_ROUTE setting is currently a free-form string, so invalid routes are only discovered later in registry.get_provider_class and the allowed values are duplicated as magic strings. Update the configuration field in config.py to use a Literal or shared Enum, mirroring the ENVIRONMENT setting style, and make registry.get_provider_class consume the same shared source of truth so startup validation rejects bad values and the route names are defined only once.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/services/llm/providers/registry.py`:
- Around line 54-61: The provider routing check in the registry branch for
Google providers uses duplicated magic strings and a malformed error message.
Update the validation in the registry logic to rely on the shared setting
type/constant from config.py (for example the Literal/Enum-backed definition)
instead of hardcoding "vertex" and "aistudio", and rewrite the ValueError text
so it reads cleanly with proper spacing and a single source of truth for the
allowed values.
---
Outside diff comments:
In `@backend/app/services/llm/providers/google_ai.py`:
- Around line 96-102: The Google AI provider endpoint builder in endpoint
currently always prefixes self.location into the host, which breaks the global
case. Update endpoint(model) so that when location is "global" it uses the bare
aiplatform.googleapis.com host, and only prefixes non-global locations; keep the
rest of the URL construction unchanged. This should align with
test_google_ai.py’s test_global_location_uses_bare_host and avoid generating any
host containing "global-aiplatform".
---
Nitpick comments:
In `@backend/app/core/config.py`:
- Line 118: The GEMINI_DEFAULT_INFERENCE_ROUTE setting is currently a free-form
string, so invalid routes are only discovered later in
registry.get_provider_class and the allowed values are duplicated as magic
strings. Update the configuration field in config.py to use a Literal or shared
Enum, mirroring the ENVIRONMENT setting style, and make
registry.get_provider_class consume the same shared source of truth so startup
validation rejects bad values and the route names are defined only once.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b6148bdf-69df-4c97-a1c2-2cdf50cf9298
📒 Files selected for processing (4)
backend/app/core/config.pybackend/app/services/llm/providers/google_ai.pybackend/app/services/llm/providers/registry.pybackend/app/tests/services/llm/providers/test_registry.py
…-routing-switch' into feat/gemini-default-routing-switch
| # same as above | ||
| GOOGLE: GoogleAIProvider, | ||
| GOOGLE_NATIVE: GoogleAIProvider, | ||
| # same as above |
There was a problem hiding this comment.
can we have better comments than this or remove them if not needed
| """Return the provider class for a given name.""" | ||
| if provider_type in (cls.GOOGLE, cls.GOOGLE_NATIVE): | ||
| route = settings.GEMINI_DEFAULT_INFERENCE_ROUTE | ||
| if route not in ("vertex", "aistudio"): |
There was a problem hiding this comment.
can you check this case
Default now vertex. Path:
- provider_type="google" → GoogleVertexAIProvider → lookup key "google".
- Migration 069 moved old AI-Studio rows INTO "google" key. Those rows hold AI-Studio api_key only — no project_id/location.
- create_client (google_ai.py:121): api_key = credentials.get("api_key"). Grabs AI-Studio key, feeds to Vertex as Vertex key.
- Vertex reject AI-Studio key → 401/403 at call time.
Break = existing project with per-project google row. Runtime fail, not startup. Need migration or confirm "all use platform defaults now".
Overall, google -> vertex -> google seems confusing, can we have consistent name if possible
There was a problem hiding this comment.
🧹 Nitpick comments (2)
backend/app/tests/services/llm/providers/test_registry.py (2)
237-241: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueDuplicated BYOK credential fixture literal.
The same
{"api_key": ..., "project_id": ..., "location": "us-central1"}-shaped dict is repeated across multiple tests (also at line 152-156). Consider extracting a small helper/fixture (e.g.,_byok_vertex_credentials()) to reduce duplication and keep field expectations in one place.Also applies to: 310-314
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/tests/services/llm/providers/test_registry.py` around lines 237 - 241, The BYOK credential dict literal is duplicated in multiple test cases, so centralize it in a shared helper/fixture such as _byok_vertex_credentials() inside test_registry.py. Update the affected tests to call that helper instead of repeating the api_key/project_id/location mapping, so the expected BYOK fields live in one place and can be reused consistently.
224-224: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueMissing return type hints on new test methods.
The new test methods (
test_aistudio_caller_overridden_to_vertex_when_env_vertex,test_google_caller_overridden_to_aistudio_when_env_aistudio,test_env_unset_respects_caller_choice) omit a-> Nonereturn annotation, consistent with the pre-existing file pattern but not compliant with the stated guideline.🔧 Proposed fix
- def test_aistudio_caller_overridden_to_vertex_when_env_vertex(self, db: Session): + def test_aistudio_caller_overridden_to_vertex_when_env_vertex(self, db: Session) -> None:(similarly for the other two new methods)
As per coding guidelines, "Use type hints on every function parameter and return value;
-> Anyis not acceptable unless the type can be narrowed."Also applies to: 258-258, 297-297
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/tests/services/llm/providers/test_registry.py` at line 224, Add explicit return type annotations to the new test methods in the test_registry test class: `test_aistudio_caller_overridden_to_vertex_when_env_vertex`, `test_google_caller_overridden_to_aistudio_when_env_aistudio`, and `test_env_unset_respects_caller_choice` should each be declared with `-> None` to match the project’s typing guideline and the rest of the file’s test signatures.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@backend/app/tests/services/llm/providers/test_registry.py`:
- Around line 237-241: The BYOK credential dict literal is duplicated in
multiple test cases, so centralize it in a shared helper/fixture such as
_byok_vertex_credentials() inside test_registry.py. Update the affected tests to
call that helper instead of repeating the api_key/project_id/location mapping,
so the expected BYOK fields live in one place and can be reused consistently.
- Line 224: Add explicit return type annotations to the new test methods in the
test_registry test class:
`test_aistudio_caller_overridden_to_vertex_when_env_vertex`,
`test_google_caller_overridden_to_aistudio_when_env_aistudio`, and
`test_env_unset_respects_caller_choice` should each be declared with `-> None`
to match the project’s typing guideline and the rest of the file’s test
signatures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1efc2240-7b83-4cfd-ab6f-730b3dca73db
📒 Files selected for processing (4)
backend/app/core/config.pybackend/app/services/llm/providers/google_ai.pybackend/app/services/llm/providers/registry.pybackend/app/tests/services/llm/providers/test_registry.py
🚧 Files skipped from review as they are similar to previous changes (3)
- backend/app/services/llm/providers/google_ai.py
- backend/app/core/config.py
- backend/app/services/llm/providers/registry.py
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/app/tests/services/llm/providers/test_registry.py (1)
262-283: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winAistudio fallback test doesn't verify the actual API key used.
Unlike
test_google_vertex_route_falls_back_to_platform_defaults(which assertsprovider.client.api_key == "platform-key"), this test only checksisinstance(provider, GoogleAIProvider)and never asserts thatprovider.clientwas built with"platform-gemini-key". This leaves a gap: a regression that silently drops the settings fallback wouldn't be caught here. This is also tied to the fallback-behavior contradiction flagged ingoogle_aistudio.py'screate_client.✅ Suggested assertion
provider = get_llm_provider( session=db, provider_type="google", project_id=project.id, organization_id=project.organization_id, ) assert isinstance(provider, GoogleAIProvider) + assert provider.client._api_client.api_key == "platform-gemini-key"(exact attribute path depends on the
genai.Clientinternals used elsewhere to verify key propagation.)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/tests/services/llm/providers/test_registry.py` around lines 262 - 283, The Google AI Studio fallback test only checks the provider type and misses verifying that the platform default API key is actually passed through. Update test_google_aistudio_route_falls_back_to_platform_defaults to assert the created GoogleAIProvider’s client was initialized with the expected platform key, following the same pattern used in test_google_vertex_route_falls_back_to_platform_defaults and targeting the client created by get_llm_provider/create_client.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/services/llm/providers/google_aistudio.py`:
- Around line 59-62: The Google Aistudio client creation flow currently falls
back to settings.GEMINI_API_KEY in create_client, which conflicts with the
intended no-fallback behavior for the aistudio route. Update create_client so it
only uses the stored credentials["api_key"] and raises when that is missing, and
adjust or remove the test_google_aistudio_route_falls_back_to_platform_defaults
expectation so platform-routed Google traffic does not silently use a shared
platform key.
---
Nitpick comments:
In `@backend/app/tests/services/llm/providers/test_registry.py`:
- Around line 262-283: The Google AI Studio fallback test only checks the
provider type and misses verifying that the platform default API key is actually
passed through. Update
test_google_aistudio_route_falls_back_to_platform_defaults to assert the created
GoogleAIProvider’s client was initialized with the expected platform key,
following the same pattern used in
test_google_vertex_route_falls_back_to_platform_defaults and targeting the
client created by get_llm_provider/create_client.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cad84434-4d84-495b-a706-17937aff6391
📒 Files selected for processing (6)
backend/app/core/config.pybackend/app/models/llm/constants.pybackend/app/services/llm/providers/google_aistudio.pybackend/app/services/llm/providers/registry.pybackend/app/tests/services/llm/providers/test_registry.pybackend/docs/gemini-routing-flows.md
✅ Files skipped from review due to trivial changes (1)
- backend/docs/gemini-routing-flows.md
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/app/core/config.py
- backend/app/services/llm/providers/registry.py
| api_key = credentials.get("api_key") or settings.GEMINI_API_KEY | ||
| if not api_key: | ||
| raise ValueError("API Key for Google Gemini Not Set") | ||
| return genai.Client(api_key=credentials["api_key"]) | ||
| return genai.Client(api_key=api_key) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major
Aistudio platform fallback contradicts PR's stated "no fallback" behavior.
The PR objectives state that for the aistudio route, missing user credentials should raise an error "and there are no platform-level fallbacks." But this create_client falls back to settings.GEMINI_API_KEY whenever credentials["api_key"] is absent, and test_google_aistudio_route_falls_back_to_platform_defaults explicitly asserts this fallback succeeds for platform-routed "google" traffic. Please confirm which behavior is intended — if the PR description is authoritative, this fallback (and its supporting test) should be removed/gated so platform-routed aistudio without stored credentials raises instead of silently using a shared platform key.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/services/llm/providers/google_aistudio.py` around lines 59 - 62,
The Google Aistudio client creation flow currently falls back to
settings.GEMINI_API_KEY in create_client, which conflicts with the intended
no-fallback behavior for the aistudio route. Update create_client so it only
uses the stored credentials["api_key"] and raises when that is missing, and
adjust or remove the test_google_aistudio_route_falls_back_to_platform_defaults
expectation so platform-routed Google traffic does not silently use a shared
platform key.
Issue #982
Earlier there used to be no easy way to toggle between default Gemini Inference Provider i.e Google AIStudio and Google Vertex {Agent Platform) without making changes to
registry.py. This PR resolves the issue by introducing an env variableGEMINI_DEFAULT_INFERENCE_ROUTE=vertex/aistudioand route accordingly.Case 1
Set GEMINI_DEFAULT_INFERENCE_ROUTE=vertex
All STT/TTS calls are routed via GoogleVertexAIProvider
If the user has stored their own GCP keys
Use them to make the call (DB lookup by provider=”google”)
Else:
Use the Kaapi level defaults referenced in the .env
Note: All llm_calls/llm_chain/STS set by client in the config param “provider=”google-aistudio” will be overridden to “google”.
Case 2
Set GEMINI_DEFAULT_INFERENCE_ROUTE=aistudio
All STT/TTS calls are routed via GoogleAIProvider
If the user has not stored their own Google AIstudio API keys:
Raise error
Unlike vertex there are no platform level fallbacks here.
Note: All llm_calls/llm_chain/STS set by client config param “provider=”google”” would be overridden.
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Summary by CodeRabbit
New Features
-nativevariants.Bug Fixes
Documentation
Tests