vscode-lm: automatic context condensing never triggers (maxTokens -1 + inflated window) #714

## Problem (one or two sentences)

On the **VS Code LM (`vscode-lm`) provider**, automatic context condensing never triggers. Even when the context gauge in the UI shows the window as effectively full, auto-condense does not fire, so the conversation eventually overflows the model's real input limit.

## Context (who is affected and when)

Affects users on the VS Code LM (Copilot) provider who rely on automatic context condensing. It surfaces during long conversations with large-window models (e.g., a Claude/GPT-5 family entry): the UI context gauge climbs to full, but auto-condense never kicks in.

## Root cause

The `vscode-lm` provider reports `maxTokens: -1` (unlimited) and an inflated **live** context window (Copilot's advertised window, far larger than the realistic usable input). Two problems follow:

1. The condense gate computed `contextPercent` against the **full** context window instead of the **available input space** (`contextWindow - reservedForOutput`), so usage was under-reported and the threshold was never reached.
2. A negative `maxTokens` (`-1`) was used directly as the reserved-output value, distorting the window math: `-1` is truthy, so `maxTokens || DEFAULT` kept `-1` instead of falling back to the default.
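
Both effects can be reproduced in isolation. The following is a minimal sketch, not the project's code: `DEFAULT_RESERVE`, `reserveBuggy`, `percentOf`, and all token counts are hypothetical values chosen only to make the arithmetic visible.

```typescript
// Hypothetical numbers and names for illustration; none are taken from the codebase.
const DEFAULT_RESERVE = 8_192

// Buggy pattern: `||` only falls back on falsy values, and -1 is truthy.
function reserveBuggy(maxTokens: number | undefined): number {
  return maxTokens || DEFAULT_RESERVE
}

// Usage as a percentage of whichever denominator the gate picks.
function percentOf(tokensUsed: number, denominator: number): number {
  return (100 * tokensUsed) / denominator
}

const liveWindow = 1_000_000 // inflated window reported by the provider (assumed)
const realInputLimit = 128_000 // realistic usable input (assumed)
const tokensUsed = 120_000

// The "unlimited" sentinel survives the fallback and stays negative.
const reserve = reserveBuggy(-1)

// Measured against the inflated window, usage looks tiny.
const percentAgainstLive = percentOf(tokensUsed, liveWindow)

// Measured against the usable input space, the same usage is already over the limit.
const percentAgainstReal = percentOf(tokensUsed, realInputLimit - DEFAULT_RESERVE)

console.log({ reserve, percentAgainstLive, percentAgainstReal })
```

With these assumed numbers, a 70–80% threshold is far from reached against the live window even though the conversation has already exceeded the usable input space.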

## Reproduction steps

1. Select the VS Code LM (Copilot) provider with a large-window model (e.g., a Claude/GPT-5 family entry).
2. Enable automatic context condensing with a normal threshold (e.g., 70–80%).
3. Drive a long conversation until the UI context gauge shows the window near/over full.
4. Observe that auto-condense does not trigger, despite the gauge indicating the context is effectively full.

## Expected result

Auto-condense should fire in line with the context gauge: usage should be measured against the usable input space, and the gate should use the model's real input ceiling (the curated `maxInputTokens`) rather than the inflated live window. A negative/unlimited `maxTokens` should fall back to a sane default reserve.
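
As a rough sketch of the expected behavior, the gate could normalize the reserve first and then measure usage against the remaining input space. The names `DEFAULT_OUTPUT_RESERVE`, `normalizeReserve`, `contextPercent`, and `shouldCondense` are hypothetical, and the fallback rule shown here (use the full window when the reserve would leave no input space) is an assumption rather than the project's exact logic.

```typescript
// Hypothetical default; the real constant in the codebase may differ.
const DEFAULT_OUTPUT_RESERVE = 8_192

// Treat undefined, non-finite, zero, or negative values (-1 = unlimited) as "unknown".
function normalizeReserve(maxTokens: number | undefined): number {
  if (maxTokens === undefined || !Number.isFinite(maxTokens) || maxTokens <= 0) {
    return DEFAULT_OUTPUT_RESERVE
  }
  return maxTokens
}

// Percentage of the usable input space currently in use.
function contextPercent(tokensUsed: number, contextWindow: number, maxTokens?: number): number {
  const reserve = normalizeReserve(maxTokens)
  const available = contextWindow - reserve
  // Fall back to the full window if the reserve would leave no input space.
  const denominator = available > 0 ? available : contextWindow
  return (100 * tokensUsed) / denominator
}

// The gate fires once usage reaches the configured threshold.
function shouldCondense(
  tokensUsed: number,
  contextWindow: number,
  thresholdPercent: number,
  maxTokens?: number,
): boolean {
  return contextPercent(tokensUsed, contextWindow, maxTokens) >= thresholdPercent
}

// Example: 100k tokens in a 128k window with maxTokens = -1 and a 75% threshold.
console.log(shouldCondense(100_000, 128_000, 75, -1))
```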

## Actual result

Auto-condense never triggers; usage is measured against the inflated full window, so the threshold is never reached and the conversation eventually overflows the model's real input limit.

## Variations tried

Reproduces across large-window vscode-lm models regardless of the configured condense threshold, because the fault lies in the denominator (the full live window), not in the threshold.

## App Version

N/A (provider-level context-management behavior; not tied to a specific release).

## API Provider

VS Code Language Model API (`vscode-lm` / Copilot).

## Model Used

Large-window vscode-lm models (e.g., a Claude/GPT-5 family entry).

## Fix

Addressed by #710:
- Treat `maxTokens: -1` (unlimited) as the default output reserve in `willManageContext`/`manageContext`.
- Measure `contextPercent` against available input space (`contextWindow - reservedForOutput`), with a safe fallback to the full window when the reserve is unknown.
- Add an optional `getCondenseContextWindow()` `ApiHandler` seam; `vscode-lm` overrides it to use the curated static `maxInputTokens`.
- Refresh the vscode-lm model catalog/default and add UI guards so the context bar and the gate share one source of truth.
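
The optional seam can be pictured roughly as follows. This is a sketch under stated assumptions: the trimmed `ApiHandler` and `ModelInfo` shapes, the `getModel()` return type, and the helpers `resolveCondenseWindow`, `makeCuratedHandler`, and `makePlainHandler` are hypothetical; only `getCondenseContextWindow()` and `maxInputTokens` are named in this issue, and the actual change in #710 may be structured differently.

```typescript
// Hypothetical, trimmed-down shapes; the real interfaces have more members.
interface ModelInfo {
  contextWindow: number
  maxInputTokens?: number
}

interface ApiHandler {
  getModel(): { id: string; info: ModelInfo }
  // Optional seam: a provider may override the window used by the condense gate.
  getCondenseContextWindow?(): number | undefined
}

// Prefer the provider's override when present and positive; otherwise use the model's window.
function resolveCondenseWindow(handler: ApiHandler): number {
  const override = handler.getCondenseContextWindow?.()
  if (override !== undefined && override > 0) {
    return override
  }
  return handler.getModel().info.contextWindow
}

// A handler in the style of vscode-lm: live window is inflated, curated limit is smaller.
function makeCuratedHandler(liveWindow: number, curatedMaxInput?: number): ApiHandler {
  return {
    getModel: () => ({
      id: "example-model",
      info: { contextWindow: liveWindow, maxInputTokens: curatedMaxInput },
    }),
    getCondenseContextWindow: () => curatedMaxInput,
  }
}

// A handler that does not implement the seam at all.
function makePlainHandler(contextWindow: number): ApiHandler {
  return { getModel: () => ({ id: "plain-model", info: { contextWindow } }) }
}

console.log(resolveCondenseWindow(makeCuratedHandler(1_000_000, 128_000)))
console.log(resolveCondenseWindow(makePlainHandler(200_000)))
```

Keeping the method optional means providers that report an accurate window need no changes; only a provider with a curated limit has to implement it.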

