Skip to content

fix: streaming tool-call argument loss in NativeToolCallParser (#695)#700

Open
awschmeder wants to merge 3 commits into
Zoo-Code-Org:mainfrom
awschmeder:fix/695-toolcall-dropped-leading-deltas
Open

fix: streaming tool-call argument loss in NativeToolCallParser (#695)#700
awschmeder wants to merge 3 commits into
Zoo-Code-Org:mainfrom
awschmeder:fix/695-toolcall-dropped-leading-deltas

Conversation

@awschmeder

@awschmeder awschmeder commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Related GitHub Issue

Closes: #695

Description

When a provider streams a tool call whose first delta(s) arrive before the tool-call id is known, those leading argument bytes are silently discarded by NativeToolCallParser.processRawChunk. This causes downstream "missing required parameter" and other spurious provider-dependent errors even when the model supplied the correct tool use syntax.

This PR fixes the issue by centralizing the tracking of streaming tool calls in NativeToolCallParser. The rawChunkTracker is now initialized on the first sight of a stream index, independent of whether an id is present. All arguments deltas are buffered until both id and name are known, ensuring no data loss during streaming reassembly.

Test Procedure

  1. Ran the newly added unit test in src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts which verifies that leading argument bytes arriving before the id are correctly preserved and finalized.
  2. Verified that existing provider tests in the same test file pass.

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue.
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes.
  • Documentation Impact: I have considered if my changes require documentation updates.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

N/A

Documentation Updates

  • No documentation updates are required.

Additional Notes

N/A

Get in Touch

@awschmeder

Summary by CodeRabbit

  • Bug Fixes

    • Resolved streaming tool calls dropping argument bytes when they arrive before tool IDs, preventing "missing parameter" errors in streaming scenarios.
  • Changes

    • Updated follow-up suggestion tool parameter: now accepts 1–4 suggestions (was 2–4), requires array format even for single items.

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

NativeToolCallParser is fixed to initialize a tracker on first index observation (rather than waiting for id), buffer argument deltas until both id and name are known, then flush on start and guard end events to prevent phantom emissions. New regression tests cover the edge cases. The ask_followup_question tool description changes "2-4 suggestions" to "1-4 array."

Changes

NativeToolCallParser Streaming Reassembly Fix

Layer / File(s) Summary
Tracker type, initialization, and event buffering logic
src/core/assistant-message/NativeToolCallParser.ts, prs/fix-toolcall-dropped-leading-deltas.md
Tracker id is made optional; tracker is now created on first index observation regardless of whether id has arrived. processRawChunk records id and name incrementally and emits tool_call_start only when both are present, flushing any buffered deltas immediately after. processFinishReason and finalizeRawChunks both add a guard requiring a non-empty id in addition to hasStarted before emitting tool_call_end. PR doc describes the root cause and fix.
Regression tests for reassembly edge cases and finalizeRawChunks
src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts
Adds a processRawChunk streaming reassembly suite with a drive helper that feeds raw chunks through processRawChunk, routes events to startStreamingToolCall/processStreamingChunk, and finalizes via finalizeStreamingToolCall. Tests cover: arguments arriving before id, id and name in separate chunks, parallel-index isolation, single-chunk event sequence, no phantom tool_call_end without an id, and finalizeRawChunks behavior with and without a tracker id.

ask_followup_question Description Wording Update

Layer / File(s) Summary
follow_up description wording: 2-4 list → 1-4 array
src/core/prompts/tools/native-tools/ask_followup_question.ts
ASK_FOLLOWUP_QUESTION_DESCRIPTION and FOLLOW_UP_PARAMETER_DESCRIPTION are updated to require an array of 1–4 suggested answers, explicitly including the single-suggestion case.

Sequence Diagram(s)

sequenceDiagram
  participant Provider as Provider Stream
  participant PRC as processRawChunk
  participant T as rawChunkTracker[index]
  participant E as ToolCallStreamEvents
  participant FST as finalizeStreamingToolCall

  rect rgba(255, 165, 0, 0.5)
    note over Provider,T: Before fix — leading args dropped
    Provider->>PRC: {index:0, arguments:"{\"q"}
    PRC-->>E: (no tracker → early return, args lost)
    Provider->>PRC: {index:0, id:"call_1", name:"search"}
    PRC->>T: create tracker (id="call_1")
    PRC->>E: tool_call_start
  end

  rect rgba(0, 128, 0, 0.5)
    note over Provider,FST: After fix — leading args buffered
    Provider->>PRC: {index:0, arguments:"{\"q"}
    PRC->>T: create tracker (id=undefined), buffer args
    Provider->>PRC: {index:0, id:"call_1", name:"search"}
    PRC->>T: record id and name
    PRC->>E: tool_call_start
    PRC->>E: tool_call_delta (flush buffer: "{\"q")
    Provider->>PRC: {index:0, arguments:"uery\"}"}
    PRC->>E: tool_call_delta immediately
    Provider->>PRC: finish_reason="tool_calls"
    PRC->>E: tool_call_end (hasStarted AND id present)
    PRC->>FST: finalizeStreamingToolCall("call_1")
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • taltas
  • JamesRobert20
  • hannesrudolph
  • navedmerchant
  • edelauna

Poem

🐇 A delta came early, before the id,
Lost in the stream — how rude and unkind!
Now I buffer each byte with a patient small paw,
And flush the whole queue without missing a claw.
No phantom "end" when no name was assigned —
The tool calls arrive with all params aligned! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main fix: streaming tool-call argument loss in NativeToolCallParser, directly addressing the core issue #695.
Linked Issues check ✅ Passed All code changes directly implement the fix requirements from #695: initializing tracker on first index sighting, buffering argument deltas before id/name arrival, and preventing data loss during streaming reassembly.
Out of Scope Changes check ✅ Passed The ask_followup_question.ts parameter description update refines the contract documentation (1-4 instead of 2-4) but appears to be documentation-only with no structural changes, making it a minor documentation clarification rather than a substantive code change.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The PR description matches the required template and includes the linked issue, change summary, testing steps, checklist, and notes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@codecov

codecov Bot commented Jun 23, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.changeset/fix-toolcall-dropped-leading-deltas.md (1)

1-6: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Remove this agent-generated changeset file from the PR.

This file conflicts with the repository’s changeset policy and should not be committed in this change.

As per coding guidelines: ".changeset/**: Do NOT create .changeset files for each commit or code change. Changesets are managed separately by maintainers and should not be generated by agents during normal development."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.changeset/fix-toolcall-dropped-leading-deltas.md around lines 1 - 6, The
.changeset/fix-toolcall-dropped-leading-deltas.md file was auto-generated and
conflicts with the repository's changeset policy which specifies that changeset
files should not be created by agents during normal development. Remove this
file from the PR entirely as changesets are managed separately by maintainers
only.

Source: Coding guidelines

🧹 Nitpick comments (1)
src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts (1)

373-380: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add one explicit test that exercises finalizeRawChunks() directly.

The new suite validates the processFinishReason end path, but this helper clears raw state instead of asserting the finalizeRawChunks guard that was also changed in this PR. A focused case for that path would harden regression coverage.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts` around
lines 373 - 380, Add a new focused test case within the test suite that directly
invokes the NativeToolCallParser.finalizeRawChunks() method to validate its
behavior. This test should verify that the guard logic in finalizeRawChunks
works correctly, since the current test validates the processFinishReason path
which uses clearRawChunkState instead. Include assertions that confirm the
expected end events are produced by finalizeRawChunks() and that raw state is
properly finalized, ensuring regression coverage for the changes made to this
method in this PR.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In @.changeset/fix-toolcall-dropped-leading-deltas.md:
- Around line 1-6: The .changeset/fix-toolcall-dropped-leading-deltas.md file
was auto-generated and conflicts with the repository's changeset policy which
specifies that changeset files should not be created by agents during normal
development. Remove this file from the PR entirely as changesets are managed
separately by maintainers only.

---

Nitpick comments:
In `@src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts`:
- Around line 373-380: Add a new focused test case within the test suite that
directly invokes the NativeToolCallParser.finalizeRawChunks() method to validate
its behavior. This test should verify that the guard logic in finalizeRawChunks
works correctly, since the current test validates the processFinishReason path
which uses clearRawChunkState instead. Include assertions that confirm the
expected end events are produced by finalizeRawChunks() and that raw state is
properly finalized, ensuring regression coverage for the changes made to this
method in this PR.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 583ebfd1-e91f-4526-89e4-69d1726d9c0a

📥 Commits

Reviewing files that changed from the base of the PR and between e8acc6a and 95439a0.

📒 Files selected for processing (5)
  • .changeset/fix-toolcall-dropped-leading-deltas.md
  • prs/fix-toolcall-dropped-leading-deltas.md
  • src/core/assistant-message/NativeToolCallParser.ts
  • src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts
  • src/core/prompts/tools/native-tools/ask_followup_question.ts

@awschmeder

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@edelauna edelauna left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Had a couple comments on testing some edge cases.


const resultA = finalized.get("call_a")
const resultB = finalized.get("call_b")
if (resultA?.type === "tool_use") {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If finalizeStreamingToolCall returned null here (the exact failure mode this PR fixes), this guard silently skips the payload assertion and the test passes green. Could you add expect(resultA).not.toBeNull() before this if? Same for resultB on line 460.

expect(events.every((e) => e.id === "call_single")).toBe(true)

const result = finalized.get("call_single")
if (result?.type === "tool_use") {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same pattern — if finalizeStreamingToolCall returns null, the path assertion is silently skipped.

Suggested change
if (result?.type === "tool_use") {
const result = finalized.get("call_single")
expect(result).not.toBeNull()
expect(result?.type).toBe("tool_use")
if (result?.type === "tool_use") {
expect((result.nativeArgs as { path: string }).path).toBe("src/single.ts")
}

}
})

it("handles id and name arriving in separate chunks (issue #218)", () => {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also cover the reverse ordering — name in chunk 1, id in chunk 2, with buffered args in between? That path is valid under the new logic but nothing in the suite currently catches a mutation that drops tracked.id from the start-gate condition (!tracked.hasStarted && tracked.id && tracked.name at line 132 of the source).


// Update name if present in chunk and not yet set
if (name) {
tracked.name = name

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a provider sends name: "" (empty string, permitted by the type), the falsy if (name) check silently discards it and tracked.name stays "" permanently — hasStarted would never fire for that tracker. Is if (name !== undefined) closer to the intended semantics?

// Only emit an end for trackers that actually started. A tracker that
// never received an id/name (malformed stream) must not emit a phantom
// end; since start requires an id, hasStarted implies tracked.id is set.
if (tracked.hasStarted && tracked.id) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

processFinishReason emits tool_call_end for all started trackers but doesn't clear rawChunkTracker. Task.ts calls finalizeRawChunks() unconditionally afterward, which iterates the same map and emits another round of ends. Today the second wave is silently dropped (the id is already deleted from streamingToolCallIndices on the first pass), but it's a latent double-fire risk if that dispatch path changes. Would adding this.rawChunkTracker.clear() at the end of the if (finishReason === "tool_calls") block make finalizeRawChunks() a safe no-op for this path?

const QUESTION_PARAMETER_DESCRIPTION = `Clear, specific question that captures the missing information you need`

const FOLLOW_UP_PARAMETER_DESCRIPTION = `Required list of 2-4 suggested responses; each suggestion must be a complete, actionable answer and may include a mode switch`
const FOLLOW_UP_PARAMETER_DESCRIPTION = `Required array of 1-4 suggested responses, always an array even for a single suggestion; each suggestion must be a complete, actionable answer and may include a mode switch`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording change (2→1 minimum, "list"→"array") looks like a follow-up to the prior minItems schema fix rather than something related to the streaming argument buffering in this PR. The PR description doesn't mention it — was this intentional to bundle here, or should it live in a separate commit so it's independently bisectable?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets remove this from diff

@github-actions github-actions Bot added the awaiting-author PR is waiting for the author to address requested changes label Jun 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting-author PR is waiting for the author to address requested changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Streaming tool-call argument deltas are silently dropped when arguments arrive before the tool-call id

2 participants