fix(vscode-lm): reliable auto context condensing#710

Open
simurg79 wants to merge 3 commits into Zoo-Code-Org:main from simurg79:fix/vscode-lm-condense

Conversation

@simurg79 simurg79 commented Jun 24, 2026

Contributor

Related GitHub Issue

Closes #714

Description

Fixes unreliable automatic context condensing on the VS Code LM (vscode-lm) provider. The provider reports maxTokens: -1 (unlimited) and an inflated live context window, so the auto-condense gate computed usage against the wrong denominator and effectively never fired even when the UI context gauge showed the window as full.

  • Treat maxTokens: -1 (unlimited) as the default output reserve in both willManageContext and manageContext instead of letting a negative value distort the window math.
  • Measure usage against the available input space (contextWindow - reservedForOutput), matching the UI gauge, with a safe fallback to the full window when the reserve is unknown/unlimited.
  • Add an optional getCondenseContextWindow() ApiHandler seam; only the vscode-lm provider overrides it to drive the gate from the curated static-table maxInputTokens. Every other provider falls back to modelInfo.contextWindow (behavior unchanged).
  • Refresh the VS Code LM model catalog and default model (claude-sonnet-4.5).
  • UI guards so the context bar treats a negative maxTokens as zero reserve and resolves an unlisted family to the default-model window.
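
A minimal sketch of the gate math described in the bullets above, assuming the same shape as the snippet quoted later in review. The function name `contextPercentFor` and the sample numbers are hypothetical, not taken from the PR's code.

```typescript
// Hypothetical helper illustrating the available-input denominator.
function contextPercentFor(prevContextTokens: number, contextWindow: number, maxTokens?: number): number {
  // A missing or non-positive maxTokens (for example -1, "unlimited") reserves nothing here.
  const reservedForOutput = maxTokens !== undefined && maxTokens > 0 ? maxTokens : 0
  const availableInputTokens = contextWindow - reservedForOutput
  // If the reserve swallows the whole window, treat the context as full.
  return availableInputTokens > 0 ? (100 * prevContextTokens) / availableInputTokens : 100
}

// Illustrative values only.
const unlimitedOutput = contextPercentFor(100_000, 128_000, -1) // 78.125
const reserveTooLarge = contextPercentFor(1_000, 8_000, 8_000) // 100
```

With `maxTokens: -1` the negative value never enters the subtraction, so the denominator stays at the full window instead of being widened by the sentinel.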

Test Procedure

  • packages/types/src/__tests__/vscode-llm.spec.ts: model catalog invariants.
  • src/core/context-management/__tests__/context-management.spec.ts: reserve guard + available-input denominator, including the availableInputTokens <= 0 → 100% fallback and end-to-end manageContext summarization.
  • src/api/providers/__tests__/vscode-lm.spec.ts: getCondenseContextWindow() resolution (static family, live fallback, non-positive guard).
  • webview-ui/src/components/chat/__tests__/TaskHeader.spec.tsx, webview-ui/src/components/ui/hooks/__tests__/useSelectedModel.spec.ts: UI reserve/window guards.
  • Full CI: all checks green; codecov/patch at 88.88% (≥ 80% target).

Pre-Submission Checklist

  • Issue Linked: Closes #714, "vscode-lm: automatic context condensing never triggers (maxTokens -1 + inflated window)" (see "Related GitHub Issue" above).
  • Scope: Changes are scoped to the vscode-lm auto-condense fix (one fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: Added/updated unit tests for the changed logic.
  • Documentation Impact: Considered; no user-facing docs required (behavior aligns auto-condense with the existing context gauge).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.
  • No extension version bump (Zoo-Code release versioning is maintainer-managed); no agent-created changeset.
  • All CI checks pass.

Documentation Updates

  • No documentation updates are required; behavior aligns auto-condense with the existing context gauge.

Additional Notes

Port of simurg79/Roo-Code#11 into Zoo-Code. Applied by context (paths map 1:1); the upstream version bump was intentionally omitted.

Summary by CodeRabbit

  • New Features

    • Improved support for the VS Code LLM provider, including better model selection and more accurate context-window handling.
  • Bug Fixes

    • Fixed context management so token limits and auto-summarization use the correct available input space, including models with unlimited output settings.
    • Improved the chat header’s displayed context usage to avoid negative token counts affecting the percentage shown.
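
As a rough illustration of the display guard summarized above, the sketch below clamps a negative output limit to a zero reserve before it reaches any percentage math. The helper name `reservedForOutputDisplay` is hypothetical; the PR's TaskHeader code may be structured differently.

```typescript
// Hypothetical helper: a negative or missing maxTokens contributes no output reserve to the gauge.
function reservedForOutputDisplay(maxTokens: number | undefined): number {
  return Math.max(0, maxTokens ?? 0)
}

const unlimitedReserve = reservedForOutputDisplay(-1) // 0
const explicitReserve = reservedForOutputDisplay(8192) // 8192
```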

coderabbitai Bot commented Jun 24, 2026

Contributor

Review Change Stack

📝 Walkthrough

Updates the VS Code LLM provider model catalog (new Claude 4.x, GPT 5.x, Gemini 3.x entries) and introduces a getCondenseContextWindow() seam on ApiHandler/VsCodeLmHandler to separate the advertised contextWindow from the effective maxInputTokens used for condense-gate decisions. Context-management threshold math is changed to compute against available input space, with maxTokens: -1 treated as unlimited throughout.

Changes

VS Code LM condense gate fix

| Layer / File(s) | Summary |
| --- | --- |
| Catalog and API contract: packages/types/src/providers/vscode-llm.ts, src/api/index.ts, packages/types/src/__tests__/vscode-llm.spec.ts | vscodeLlmDefaultModelId changed to "claude-sonnet-4.5", vscodeLlmModels replaced with a curated catalog distinguishing contextWindow from maxInputTokens, and getCondenseContextWindow?() added as an optional method on ApiHandler. Catalog tests validate field positivity, exclusions, and default-model presence. |
| VsCodeLmHandler condense window: src/api/providers/vscode-lm.ts, src/api/providers/__tests__/vscode-lm.spec.ts | getCondenseContextWindow() added to VsCodeLmHandler, looking up vscodeLlmModels[family].maxInputTokens with fallback to getModel().info.contextWindow. Tests cover large advertised values, small values, non-numeric fallback, static-table hits, unknown-family fallback, no-family fallback, and non-positive static entry guard. |
| Context management math: src/core/context-management/index.ts, src/core/context-management/__tests__/context-management.spec.ts | willManageContext and manageContext now treat non-positive maxTokens as unlimited and compute contextPercent against available input space (contextWindow − reservedForOutput) rather than the raw window. Regression tests cover the updated threshold math, maxTokens: -1, and reserve-≥-window edge cases. |
| Task condense-window wiring: src/core/task/Task.ts | Both the truncation path and the context-management pre-check in attemptApiRequest now read getCondenseContextWindow?.() when available, falling back to modelInfo.contextWindow. Abort-signal listeners are reformatted to multiline addEventListener with { once: true }. |
| Webview model selection and display: webview-ui/src/components/ui/hooks/useSelectedModel.ts, webview-ui/src/components/chat/TaskHeader.tsx, webview-ui/src/components/chat/__tests__/TaskHeader.spec.tsx, webview-ui/src/components/ui/hooks/__tests__/useSelectedModel.spec.ts | TaskHeader clamps negative maxTokens to zero for reserved output math. The vscode-lm branch of useSelectedModel introduces a listedModel fallback to the default model and sets contextWindow explicitly from listedModel.maxInputTokens. Tests cover the unlimited-output display case and listed vs. unlisted family resolution. |
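
The optional-seam pattern summarized in the walkthrough can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `HandlerLike`, `staticTable`, `resolveCondenseContextWindow`, `condenseWindowFor`, the family names, and every number are hypothetical stand-ins for ApiHandler, vscodeLlmModels, and the real catalog rows.

```typescript
interface ModelRow {
  contextWindow: number
  maxInputTokens: number
}

// Hypothetical two-row table; the real catalog lives in packages/types/src/providers/vscode-llm.ts.
const staticTable: Record<string, ModelRow> = {
  "family-a": { contextWindow: 200_000, maxInputTokens: 128_000 },
  "family-zeroed": { contextWindow: 200_000, maxInputTokens: 0 },
}

// Prefer the curated maxInputTokens; fall back to the live window on a miss or a non-positive entry.
function resolveCondenseContextWindow(family: string | undefined, liveContextWindow: number): number {
  const row = family !== undefined ? staticTable[family] : undefined
  if (row && typeof row.maxInputTokens === "number" && row.maxInputTokens > 0) {
    return row.maxInputTokens
  }
  return liveContextWindow
}

// Stand-in for the handler contract: the seam is optional, so most handlers omit it.
interface HandlerLike {
  getModel(): { info: { contextWindow: number } }
  getCondenseContextWindow?(): number
}

// Caller side: use the seam when present, otherwise the advertised context window.
function condenseWindowFor(handler: HandlerLike): number {
  return handler.getCondenseContextWindow?.() ?? handler.getModel().info.contextWindow
}

const plainHandler: HandlerLike = { getModel: () => ({ info: { contextWindow: 200_000 } }) }
const seamHandler: HandlerLike = {
  getModel: () => ({ info: { contextWindow: 500_000 } }),
  getCondenseContextWindow: () => resolveCondenseContextWindow("family-a", 500_000),
}
```

Because the method is optional and the caller falls back with `??`, handlers that do not define it keep their previous window, which matches the "behavior unchanged" claim for other providers.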

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

awaiting-review

Suggested reviewers

  • taltas
  • hannesrudolph
  • edelauna
  • navedmerchant

Poem

🐇 A window's a window, or so went the lore,
But maxInputTokens said "not quite — there's more!"
The condense gate now measures available space,
Negative reserves get a firm zero-embrace.
Claude Sonnet 4.5 hops in as the new face! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing unreliable auto context condensing in the vscode-lm provider. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description covers the required issue link, summary, tests, checklist, docs, and notes; only optional sections like screenshots/contact are omitted. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.changeset/vscode-lm-condense-fix.md:
- Around line 1-5: The changeset file `.changeset/vscode-lm-condense-fix.md` was
created but should not be included in this PR since changesets are managed by
maintainers outside the normal development workflow. Delete the entire
`.changeset/vscode-lm-condense-fix.md` file and allow maintainers to create the
proper changeset entry separately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 011fdda6-5a92-4a6e-a80c-e1b4d20c91ab

📥 Commits

Reviewing files that changed from the base of the PR and between e8acc6a and 1eadaea.

📒 Files selected for processing (13)
  • .changeset/vscode-lm-condense-fix.md
  • packages/types/src/__tests__/vscode-llm.spec.ts
  • packages/types/src/providers/vscode-llm.ts
  • src/api/index.ts
  • src/api/providers/__tests__/vscode-lm.spec.ts
  • src/api/providers/vscode-lm.ts
  • src/core/context-management/__tests__/context-management.spec.ts
  • src/core/context-management/index.ts
  • src/core/task/Task.ts
  • webview-ui/src/components/chat/TaskHeader.tsx
  • webview-ui/src/components/chat/__tests__/TaskHeader.spec.tsx
  • webview-ui/src/components/ui/hooks/__tests__/useSelectedModel.spec.ts
  • webview-ui/src/components/ui/hooks/useSelectedModel.ts

Comment thread .changeset/vscode-lm-condense-fix.md Outdated
codecov Bot commented Jun 24, 2026

Codecov Report

❌ Patch coverage is 88.88889% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/core/task/Task.ts | 77.77% | 2 Missing ⚠️ |
| src/core/context-management/index.ts | 88.88% | 0 Missing and 1 partial ⚠️ |

Add targeted tests for the previously-uncovered ported branches: the availableInputTokens<=0 fallback to 100% in willManageContext/manageContext, getCondenseContextWindow() guard fallbacks, and the vscode-lm UI family-miss window resolution. Raises patch coverage to satisfy the codecov/patch 80% gate.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/api/providers/__tests__/vscode-lm.spec.ts`:
- Around line 504-521: The test is mutating the static model row for the
selector family, but the mocked client currently uses a different family so
VsCodeLmHandler never consults that row. Update the test setup in the
VsCodeLmHandler/getCondenseContextWindow case so the mockLanguageModelChat
family matches "claude-opus-4.8", or avoid assigning client so the selector
family path is used; this ensures the zeroed maxInputTokens row is actually
exercised and the live-window fallback assertion is valid.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7cff98c0-cb87-4d62-9e6d-29e09bf5e9b0

📥 Commits

Reviewing files that changed from the base of the PR and between 1eadaea and 62a556c.

📒 Files selected for processing (2)
  • src/api/providers/__tests__/vscode-lm.spec.ts
  • src/core/context-management/__tests__/context-management.spec.ts

Comment thread src/api/providers/__tests__/vscode-lm.spec.ts Outdated
…w guard test

- Remove .changeset/vscode-lm-condense-fix.md (changesets are maintainer-managed per AGENTS.md; CodeRabbit flagged).

- Fix getCondenseContextWindow() non-positive-guard test so the selector family (claude-opus-4.8) drives the lookup and the zeroed static row actually exercises the maxInputTokens > 0 guard before falling back.

@edelauna edelauna left a comment

Contributor

Thanks for addressing this, had some comments around the implementation.

Comment on lines +318 to +320
// contextWindow MUST equal maxInputTokens: that is the exact value the gate consumes via
// getModel().info.contextWindow = Math.max(0, client.maxInputTokens) in src/api/providers/vscode-lm.ts,
// so the UI bar and the condense gate share a single source of truth.

Contributor

This comment says the gate consumes getModel().info.contextWindow, but Task.ts now calls getCondenseContextWindow?.() as the primary path (which returns the static table's maxInputTokens). getModel() is the fallback, not the primary. Worth updating the comment so it doesn't mislead future readers?

expect(result.current.provider).toBe("vscode-lm")
expect(result.current.id).toBe(`copilot/${family}`)
// The bar and the condense gate share one source of truth: contextWindow === maxInputTokens.
expect(result.current.info?.contextWindow).toBe(vscodeLlmModels[family].maxInputTokens)

Contributor

This test uses vscodeLlmDefaultModelId (claude-sonnet-4.5), where contextWindow === maxInputTokens === 167790. If someone accidentally swapped listedModel.maxInputTokens for listedModel.contextWindow on line 324 of useSelectedModel.ts, this test would still pass. Adding one test with claude-opus-4.8 (the only row where the two differ, contextWindow: 679560 ≠ maxInputTokens: 197897) would catch that mutation.
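
A small sketch of why the choice of row matters, using the two pairs of values quoted in this comment. The `rows` object, `correctWindow`, `mutatedWindow`, and `detectsMutation` are hypothetical stand-ins for the real catalog and hook, written without a test framework so the snippet stays self-contained.

```typescript
interface Row {
  contextWindow: number
  maxInputTokens: number
}

// Only the two rows mentioned in the review comment; values as quoted there.
const rows: Record<string, Row> = {
  "claude-sonnet-4.5": { contextWindow: 167790, maxInputTokens: 167790 },
  "claude-opus-4.8": { contextWindow: 679560, maxInputTokens: 197897 },
}

// Intended behavior: the displayed window comes from maxInputTokens.
const correctWindow = (row: Row): number => row.maxInputTokens
// The accidental swap the reviewer describes.
const mutatedWindow = (row: Row): number => row.contextWindow

// An equality assertion on a given family catches the swap only if the two fields differ.
function detectsMutation(family: string): boolean {
  const row = rows[family]
  return correctWindow(row) !== mutatedWindow(row)
}

const sonnetCatches = detectsMutation("claude-sonnet-4.5") // false
const opusCatches = detectsMutation("claude-opus-4.8") // true
```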

Comment on lines +202 to +204
const reservedForOutput = maxTokens && maxTokens > 0 ? maxTokens : 0
const availableInputTokens = contextWindow - reservedForOutput
const contextPercent = availableInputTokens > 0 ? (100 * prevContextTokens) / availableInputTokens : 100

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the contextPercent denominator from contextWindow to contextWindow - maxTokens for every provider where maxTokens > 0, not just vscode-lm. For example, Anthropic with a 200K window and maxTokens=8192 will see condense fire ~4% earlier than before. Was that intentional? If so, might be worth a note in the PR description since it's a behavioral change for all providers.
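
The arithmetic behind the "~4% earlier" estimate can be checked with a short sketch. The 200K window and maxTokens=8192 come from the comment above; the 75% threshold and the name `triggerTokens` are assumptions chosen only to make the shift concrete.

```typescript
const contextWindow = 200_000
const maxTokens = 8_192

const oldDenominator = contextWindow // before: the raw window
const newDenominator = contextWindow - maxTokens // after: available input space (191808)

// Relative shrink of the denominator, which is how much sooner any percentage threshold is reached.
const shiftPercent = 100 * (1 - newDenominator / oldDenominator) // about 4.1

// Token count at which a given threshold percentage is reached (threshold value is illustrative).
function triggerTokens(thresholdPercent: number, denominator: number): number {
  return (thresholdPercent / 100) * denominator
}

const oldTrigger = triggerTokens(75, oldDenominator) // 150000
const newTrigger = triggerTokens(75, newDenominator) // 143856
```

Under the assumed 75% threshold, condensing would start 6,144 tokens sooner than with the raw-window denominator.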

@github-actions github-actions Bot added the awaiting-author label Jun 25, 2026
Labels

awaiting-author PR is waiting for the author to address requested changes

Development

Successfully merging this pull request may close these issues.

vscode-lm: automatic context condensing never triggers (maxTokens -1 + inflated window)

2 participants