Skip to content

Gemini: Toggleable Inference Routing #982

Open
Prajna1999 wants to merge 9 commits into
mainfrom
feat/gemini-default-routing-switch
Open

Gemini: Toggleable Inference Routing #982
Prajna1999 wants to merge 9 commits into
mainfrom
feat/gemini-default-routing-switch

Conversation

@Prajna1999

@Prajna1999 Prajna1999 commented Jun 29, 2026

Copy link
Copy Markdown
Collaborator

Issue #982

Earlier there used to be no easy way to toggle between default Gemini Inference Provider i.e Google AIStudio and Google Vertex {Agent Platform) without making changes to registry.py. This PR resolves the issue by introducing an env variable GEMINI_DEFAULT_INFERENCE_ROUTE=vertex/aistudio and route accordingly.

Case 1

  • Set GEMINI_DEFAULT_INFERENCE_ROUTE=vertex

  • All STT/TTS calls are routed via GoogleVertexAIProvider

  • If the user has stored their own GCP keys
    Use them to make the call (DB lookup by provider=”google”)
    Else:
    Use the Kaapi level defaults referenced in the .env

  • Note: All llm_calls/llm_chain/STS set by client in the config param “provider=”google-aistudio” will be overridden to “google”.

Case 2

  • Set GEMINI_DEFAULT_INFERENCE_ROUTE=aistudio

  • All STT/TTS calls are routed via GoogleAIProvider

  • If the user has not stored their own Google AIstudio API keys:
    Raise error

  • Unlike vertex there are no platform level fallbacks here.

  • Note: All llm_calls/llm_chain/STS set by client config param “provider=”google”” would be overridden.

Checklist

Before submitting a pull request, please ensure that you mark these task.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and test.
  • If you've fixed a bug or added code that is tested and has test cases.

Summary by CodeRabbit

  • New Features

    • Added a configurable default Gemini inference route (defaults to AI Studio) to control whether requests prefer Vertex AI or AI Studio.
    • Improved Google/Gemini provider selection and routing, including support for -native variants.
  • Bug Fixes

    • Corrected Vertex endpoint hostname formatting for regional/global requests.
    • Improved credential and API key fallback behavior to reduce misrouting and missing-configuration failures.
  • Documentation

    • Added a guide explaining Gemini routing flows and error cases.
  • Tests

    • Expanded routing tests for explicit providers, env-driven overrides, and credential fallback scenarios.

@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds Gemini route configuration, expands Google provider naming and routing, changes Vertex host construction, updates credential fallback handling, and adds tests plus routing documentation.

Changes

Gemini provider routing update

Layer / File(s) Summary
Config and provider types
backend/app/core/config.py, backend/app/models/llm/constants.py, backend/app/services/llm/providers/google_ai.py
Adds GEMINI_DEFAULT_INFERENCE_ROUTE, introduces google-vertex provider names, extends native provider naming, and changes Vertex endpoint host selection by location.
Provider routing and credential selection
backend/app/services/llm/providers/google_aistudio.py, backend/app/services/llm/providers/registry.py
Adds AI Studio API key fallback to settings, routes platform-selected Google providers by GEMINI_DEFAULT_INFERENCE_ROUTE, updates provider-key lookup, and changes missing-credential handling for routed Google providers.
Registry and routing tests
backend/app/tests/services/llm/providers/test_registry.py
Expands coverage for explicit Google Vertex and AI Studio providers, platform-driven routing, and fallback behavior when provider credentials are absent.
Gemini routing documentation
backend/docs/gemini-routing-flows.md
Documents provider naming, routing flows, edge cases, decision tables, and noted gaps for Gemini routing.

Estimated code review effort: 4 (Complex) | ~60 minutes

Possibly related issues

Possibly related PRs

Suggested labels: enhancement

Suggested reviewers: Ayush8923, vprashrex

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: making Gemini inference routing configurable.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/gemini-default-routing-switch

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@Prajna1999 Prajna1999 marked this pull request as ready for review June 29, 2026 13:27
@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown

OpenAPI changes   🟢 8 non-breaking changes

Tip

Safe to merge from an API-contract perspective.

Full changelog  ·  8
Method Path Change
🟢 POST /api/v1/configs added the new google-vertex-native enum value to the request property config_blob/completion/anyOf[subschema #1: NativeCompletionConfig]/provider
🟢 POST /api/v1/llm/call added the new google-vertex-native enum value to the request property config/blob/anyOf[subschema #1: ConfigBlob]/completion/anyOf[subschema #1: NativeCompletionConfig]/provider
🟢 POST /api/v1/llm/chain added the new google-vertex-native enum value to the request property blocks/items/config/blob/anyOf[subschema #1: ConfigBlob]/completion/anyOf[subschema #1: NativeCompletionConfig]/provider
🟢 GET /api/v1/models added the enum value google-vertex to the property anyOf[subschema #1: Provider]/ of the query request parameter provider
🟢 PATCH /api/v1/models added the new google-vertex enum value to the request property items/provider
🟢 DELETE /api/v1/models/{provider}/{model_name} added the new enum value google-vertex to the path request parameter provider
🟢 GET /api/v1/models/{provider}/{model_name} added the new enum value google-vertex to the path request parameter provider
🟢 PATCH /api/v1/models/{provider}/{model_name} added the new enum value google-vertex to the path request parameter provider

main16639921 · generated by oasdiff

@codecov

codecov Bot commented Jun 29, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.00000% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
backend/app/services/llm/providers/google_ai.py 0.00% 4 Missing ⚠️

📢 Thoughts on this report? Let us know!

@Prajna1999 Prajna1999 changed the title feat: default google credes Gemini: Toggleable Inference Routing Jun 30, 2026
@Prajna1999 Prajna1999 self-assigned this Jun 30, 2026
@Prajna1999 Prajna1999 linked an issue Jun 30, 2026 that may be closed by this pull request

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/services/llm/providers/google_ai.py (1)

96-102: 🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win

Critical: global location produces an invalid host and a 404.

This unconditionally prefixes the location, so for location="global" the host becomes global-aiplatform.googleapis.com, which does not resolve and returns 404. The accompanying test test_global_location_uses_bare_host in backend/app/tests/services/llm/providers/test_google_ai.py explicitly asserts the global location must use the bare aiplatform.googleapis.com host (and that "global-aiplatform" must not appear), noting it caught a real outage. The current code will fail that test and break global-configured projects.

🐛 Proposed fix
-        host = f"{self.location}-aiplatform.googleapis.com"
+        host = (
+            "aiplatform.googleapis.com"
+            if self.location == "global"
+            else f"{self.location}-aiplatform.googleapis.com"
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/services/llm/providers/google_ai.py` around lines 96 - 102, The
Google AI provider endpoint builder in endpoint currently always prefixes
self.location into the host, which breaks the global case. Update
endpoint(model) so that when location is "global" it uses the bare
aiplatform.googleapis.com host, and only prefixes non-global locations; keep the
rest of the URL construction unchanged. This should align with
test_google_ai.py’s test_global_location_uses_bare_host and avoid generating any
host containing "global-aiplatform".
🧹 Nitpick comments (1)
backend/app/core/config.py (1)

118-118: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Consider constraining the type to a Literal to validate at config-load time.

GEMINI_DEFAULT_INFERENCE_ROUTE is a free-form str, so an invalid value is only caught later in registry.get_provider_class(), and the allowed values "vertex"/"aistudio" are magic strings duplicated across config.py and registry.py. Mirroring the ENVIRONMENT field (Line 42) with a Literal (or shared Enum) lets pydantic reject bad values at startup and removes the duplicated literals.

♻️ Proposed change
-    GEMINI_DEFAULT_INFERENCE_ROUTE: str = "aistudio"
+    GEMINI_DEFAULT_INFERENCE_ROUTE: Literal["vertex", "aistudio"] = "aistudio"

As per coding guidelines: "Do not use magic values in code; extract repeated literals into constants, Enum members, or settings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/core/config.py` at line 118, The GEMINI_DEFAULT_INFERENCE_ROUTE
setting is currently a free-form string, so invalid routes are only discovered
later in registry.get_provider_class and the allowed values are duplicated as
magic strings. Update the configuration field in config.py to use a Literal or
shared Enum, mirroring the ENVIRONMENT setting style, and make
registry.get_provider_class consume the same shared source of truth so startup
validation rejects bad values and the route names are defined only once.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/services/llm/providers/registry.py`:
- Around line 54-61: The provider routing check in the registry branch for
Google providers uses duplicated magic strings and a malformed error message.
Update the validation in the registry logic to rely on the shared setting
type/constant from config.py (for example the Literal/Enum-backed definition)
instead of hardcoding "vertex" and "aistudio", and rewrite the ValueError text
so it reads cleanly with proper spacing and a single source of truth for the
allowed values.

---

Outside diff comments:
In `@backend/app/services/llm/providers/google_ai.py`:
- Around line 96-102: The Google AI provider endpoint builder in endpoint
currently always prefixes self.location into the host, which breaks the global
case. Update endpoint(model) so that when location is "global" it uses the bare
aiplatform.googleapis.com host, and only prefixes non-global locations; keep the
rest of the URL construction unchanged. This should align with
test_google_ai.py’s test_global_location_uses_bare_host and avoid generating any
host containing "global-aiplatform".

---

Nitpick comments:
In `@backend/app/core/config.py`:
- Line 118: The GEMINI_DEFAULT_INFERENCE_ROUTE setting is currently a free-form
string, so invalid routes are only discovered later in
registry.get_provider_class and the allowed values are duplicated as magic
strings. Update the configuration field in config.py to use a Literal or shared
Enum, mirroring the ENVIRONMENT setting style, and make
registry.get_provider_class consume the same shared source of truth so startup
validation rejects bad values and the route names are defined only once.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b6148bdf-69df-4c97-a1c2-2cdf50cf9298

📥 Commits

Reviewing files that changed from the base of the PR and between d7a6e2f and fa54a02.

📒 Files selected for processing (4)
  • backend/app/core/config.py
  • backend/app/services/llm/providers/google_ai.py
  • backend/app/services/llm/providers/registry.py
  • backend/app/tests/services/llm/providers/test_registry.py

Comment thread backend/app/services/llm/providers/registry.py Outdated
Comment on lines +39 to +42
# same as above
GOOGLE: GoogleAIProvider,
GOOGLE_NATIVE: GoogleAIProvider,
# same as above

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have better comments than this or remove them if not needed

"""Return the provider class for a given name."""
if provider_type in (cls.GOOGLE, cls.GOOGLE_NATIVE):
route = settings.GEMINI_DEFAULT_INFERENCE_ROUTE
if route not in ("vertex", "aistudio"):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you check this case
Default now vertex. Path:

  • provider_type="google" → GoogleVertexAIProvider → lookup key "google".
  • Migration 069 moved old AI-Studio rows INTO "google" key. Those rows hold AI-Studio api_key only — no project_id/location.
  • create_client (google_ai.py:121): api_key = credentials.get("api_key"). Grabs AI-Studio key, feeds to Vertex as Vertex key.
  • Vertex reject AI-Studio key → 401/403 at call time.

Break = existing project with per-project google row. Runtime fail, not startup. Need migration or confirm "all use platform defaults now".

Overall, google -> vertex -> google seems confusing, can we have consistent name if possible

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
backend/app/tests/services/llm/providers/test_registry.py (2)

237-241: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Duplicated BYOK credential fixture literal.

The same {"api_key": ..., "project_id": ..., "location": "us-central1"}-shaped dict is repeated across multiple tests (also at line 152-156). Consider extracting a small helper/fixture (e.g., _byok_vertex_credentials()) to reduce duplication and keep field expectations in one place.

Also applies to: 310-314

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/tests/services/llm/providers/test_registry.py` around lines 237 -
241, The BYOK credential dict literal is duplicated in multiple test cases, so
centralize it in a shared helper/fixture such as _byok_vertex_credentials()
inside test_registry.py. Update the affected tests to call that helper instead
of repeating the api_key/project_id/location mapping, so the expected BYOK
fields live in one place and can be reused consistently.

224-224: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Missing return type hints on new test methods.

The new test methods (test_aistudio_caller_overridden_to_vertex_when_env_vertex, test_google_caller_overridden_to_aistudio_when_env_aistudio, test_env_unset_respects_caller_choice) omit a -> None return annotation, consistent with the pre-existing file pattern but not compliant with the stated guideline.

🔧 Proposed fix
-    def test_aistudio_caller_overridden_to_vertex_when_env_vertex(self, db: Session):
+    def test_aistudio_caller_overridden_to_vertex_when_env_vertex(self, db: Session) -> None:

(similarly for the other two new methods)

As per coding guidelines, "Use type hints on every function parameter and return value; -> Any is not acceptable unless the type can be narrowed."

Also applies to: 258-258, 297-297

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/tests/services/llm/providers/test_registry.py` at line 224, Add
explicit return type annotations to the new test methods in the test_registry
test class: `test_aistudio_caller_overridden_to_vertex_when_env_vertex`,
`test_google_caller_overridden_to_aistudio_when_env_aistudio`, and
`test_env_unset_respects_caller_choice` should each be declared with `-> None`
to match the project’s typing guideline and the rest of the file’s test
signatures.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@backend/app/tests/services/llm/providers/test_registry.py`:
- Around line 237-241: The BYOK credential dict literal is duplicated in
multiple test cases, so centralize it in a shared helper/fixture such as
_byok_vertex_credentials() inside test_registry.py. Update the affected tests to
call that helper instead of repeating the api_key/project_id/location mapping,
so the expected BYOK fields live in one place and can be reused consistently.
- Line 224: Add explicit return type annotations to the new test methods in the
test_registry test class:
`test_aistudio_caller_overridden_to_vertex_when_env_vertex`,
`test_google_caller_overridden_to_aistudio_when_env_aistudio`, and
`test_env_unset_respects_caller_choice` should each be declared with `-> None`
to match the project’s typing guideline and the rest of the file’s test
signatures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1efc2240-7b83-4cfd-ab6f-730b3dca73db

📥 Commits

Reviewing files that changed from the base of the PR and between fa54a02 and c490b46.

📒 Files selected for processing (4)
  • backend/app/core/config.py
  • backend/app/services/llm/providers/google_ai.py
  • backend/app/services/llm/providers/registry.py
  • backend/app/tests/services/llm/providers/test_registry.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/app/services/llm/providers/google_ai.py
  • backend/app/core/config.py
  • backend/app/services/llm/providers/registry.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/app/tests/services/llm/providers/test_registry.py (1)

262-283: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Aistudio fallback test doesn't verify the actual API key used.

Unlike test_google_vertex_route_falls_back_to_platform_defaults (which asserts provider.client.api_key == "platform-key"), this test only checks isinstance(provider, GoogleAIProvider) and never asserts that provider.client was built with "platform-gemini-key". This leaves a gap: a regression that silently drops the settings fallback wouldn't be caught here. This is also tied to the fallback-behavior contradiction flagged in google_aistudio.py's create_client.

✅ Suggested assertion
             provider = get_llm_provider(
                 session=db,
                 provider_type="google",
                 project_id=project.id,
                 organization_id=project.organization_id,
             )

             assert isinstance(provider, GoogleAIProvider)
+            assert provider.client._api_client.api_key == "platform-gemini-key"

(exact attribute path depends on the genai.Client internals used elsewhere to verify key propagation.)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/tests/services/llm/providers/test_registry.py` around lines 262 -
283, The Google AI Studio fallback test only checks the provider type and misses
verifying that the platform default API key is actually passed through. Update
test_google_aistudio_route_falls_back_to_platform_defaults to assert the created
GoogleAIProvider’s client was initialized with the expected platform key,
following the same pattern used in
test_google_vertex_route_falls_back_to_platform_defaults and targeting the
client created by get_llm_provider/create_client.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/services/llm/providers/google_aistudio.py`:
- Around line 59-62: The Google Aistudio client creation flow currently falls
back to settings.GEMINI_API_KEY in create_client, which conflicts with the
intended no-fallback behavior for the aistudio route. Update create_client so it
only uses the stored credentials["api_key"] and raises when that is missing, and
adjust or remove the test_google_aistudio_route_falls_back_to_platform_defaults
expectation so platform-routed Google traffic does not silently use a shared
platform key.

---

Nitpick comments:
In `@backend/app/tests/services/llm/providers/test_registry.py`:
- Around line 262-283: The Google AI Studio fallback test only checks the
provider type and misses verifying that the platform default API key is actually
passed through. Update
test_google_aistudio_route_falls_back_to_platform_defaults to assert the created
GoogleAIProvider’s client was initialized with the expected platform key,
following the same pattern used in
test_google_vertex_route_falls_back_to_platform_defaults and targeting the
client created by get_llm_provider/create_client.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cad84434-4d84-495b-a706-17937aff6391

📥 Commits

Reviewing files that changed from the base of the PR and between 2adb86a and dc11e7b.

📒 Files selected for processing (6)
  • backend/app/core/config.py
  • backend/app/models/llm/constants.py
  • backend/app/services/llm/providers/google_aistudio.py
  • backend/app/services/llm/providers/registry.py
  • backend/app/tests/services/llm/providers/test_registry.py
  • backend/docs/gemini-routing-flows.md
✅ Files skipped from review due to trivial changes (1)
  • backend/docs/gemini-routing-flows.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/app/core/config.py
  • backend/app/services/llm/providers/registry.py

Comment on lines +59 to +62
api_key = credentials.get("api_key") or settings.GEMINI_API_KEY
if not api_key:
raise ValueError("API Key for Google Gemini Not Set")
return genai.Client(api_key=credentials["api_key"])
return genai.Client(api_key=api_key)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major

Aistudio platform fallback contradicts PR's stated "no fallback" behavior.

The PR objectives state that for the aistudio route, missing user credentials should raise an error "and there are no platform-level fallbacks." But this create_client falls back to settings.GEMINI_API_KEY whenever credentials["api_key"] is absent, and test_google_aistudio_route_falls_back_to_platform_defaults explicitly asserts this fallback succeeds for platform-routed "google" traffic. Please confirm which behavior is intended — if the PR description is authoritative, this fallback (and its supporting test) should be removed/gated so platform-routed aistudio without stored credentials raises instead of silently using a shared platform key.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/services/llm/providers/google_aistudio.py` around lines 59 - 62,
The Google Aistudio client creation flow currently falls back to
settings.GEMINI_API_KEY in create_client, which conflicts with the intended
no-fallback behavior for the aistudio route. Update create_client so it only
uses the stored credentials["api_key"] and raises when that is missing, and
adjust or remove the test_google_aistudio_route_falls_back_to_platform_defaults
expectation so platform-routed Google traffic does not silently use a shared
platform key.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Routing: Environment variable for inference for Gemini

3 participants