Skip to content

[BUG] Condensing context fails with Error 400 on Local Openai compatible endpoint #718

Description

@huntyh

Describe the bug

Sometimes when using Zoo code with a local openai provider (SGLang endpoint with a local model), prompt condensing will fail with 400 Bad request on the chat completions endpoint.

To Reproduce
Steps to reproduce the behavior:

  1. Open a workspace
  2. Open Zoo code, send a request that requires the AI to read a large amount of code
  3. Context fills up, condensing context beings
  4. Error 400 appears
  5. The Agent can't do the task because the chat history will be continuously truncated

Expected behavior
Condensing context runs successfully

Screenshots

Image

What version of zoo are you running

Version: 3.62.0 (40660f1)

Additional context

Logs from SGLang only indicate a bad request:

[2026-06-25 10:10:33] Decode batch, #running-req: 1, #full token: 51338, full token usage: 0.66, mamba num: 2, mamba usage: 0.07, cuda graph: True, gen throughput (token/s): 35.49, #queue-req: 0

[2026-06-25 10:10:34] INFO:     10.1.0.105:50783 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request

Note: this started like a week ago, before then it worked perfectly and I did not change anything on the sglang side.

Note 2: When this happens, starting a new chat (task) with the exact same prompt can fix the issue, and condensing runs as normal.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions