fix: streaming tool-call argument loss in NativeToolCallParser (#695)#700
fix: streaming tool-call argument loss in NativeToolCallParser (#695)#700awschmeder wants to merge 3 commits into
Conversation
📝 WalkthroughWalkthrough
ChangesNativeToolCallParser Streaming Reassembly Fix
ask_followup_question Description Wording Update
Sequence Diagram(s)sequenceDiagram
participant Provider as Provider Stream
participant PRC as processRawChunk
participant T as rawChunkTracker[index]
participant E as ToolCallStreamEvents
participant FST as finalizeStreamingToolCall
rect rgba(255, 165, 0, 0.5)
note over Provider,T: Before fix — leading args dropped
Provider->>PRC: {index:0, arguments:"{\"q"}
PRC-->>E: (no tracker → early return, args lost)
Provider->>PRC: {index:0, id:"call_1", name:"search"}
PRC->>T: create tracker (id="call_1")
PRC->>E: tool_call_start
end
rect rgba(0, 128, 0, 0.5)
note over Provider,FST: After fix — leading args buffered
Provider->>PRC: {index:0, arguments:"{\"q"}
PRC->>T: create tracker (id=undefined), buffer args
Provider->>PRC: {index:0, id:"call_1", name:"search"}
PRC->>T: record id and name
PRC->>E: tool_call_start
PRC->>E: tool_call_delta (flush buffer: "{\"q")
Provider->>PRC: {index:0, arguments:"uery\"}"}
PRC->>E: tool_call_delta immediately
Provider->>PRC: finish_reason="tool_calls"
PRC->>E: tool_call_end (hasStarted AND id present)
PRC->>FST: finalizeStreamingToolCall("call_1")
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.changeset/fix-toolcall-dropped-leading-deltas.md (1)
1-6: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick winRemove this agent-generated changeset file from the PR.
This file conflicts with the repository’s changeset policy and should not be committed in this change.
As per coding guidelines: ".changeset/**: Do NOT create
.changesetfiles for each commit or code change. Changesets are managed separately by maintainers and should not be generated by agents during normal development."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.changeset/fix-toolcall-dropped-leading-deltas.md around lines 1 - 6, The .changeset/fix-toolcall-dropped-leading-deltas.md file was auto-generated and conflicts with the repository's changeset policy which specifies that changeset files should not be created by agents during normal development. Remove this file from the PR entirely as changesets are managed separately by maintainers only.Source: Coding guidelines
🧹 Nitpick comments (1)
src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts (1)
373-380: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd one explicit test that exercises
finalizeRawChunks()directly.The new suite validates the
processFinishReasonend path, but this helper clears raw state instead of asserting thefinalizeRawChunksguard that was also changed in this PR. A focused case for that path would harden regression coverage.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts` around lines 373 - 380, Add a new focused test case within the test suite that directly invokes the NativeToolCallParser.finalizeRawChunks() method to validate its behavior. This test should verify that the guard logic in finalizeRawChunks works correctly, since the current test validates the processFinishReason path which uses clearRawChunkState instead. Include assertions that confirm the expected end events are produced by finalizeRawChunks() and that raw state is properly finalized, ensuring regression coverage for the changes made to this method in this PR.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In @.changeset/fix-toolcall-dropped-leading-deltas.md:
- Around line 1-6: The .changeset/fix-toolcall-dropped-leading-deltas.md file
was auto-generated and conflicts with the repository's changeset policy which
specifies that changeset files should not be created by agents during normal
development. Remove this file from the PR entirely as changesets are managed
separately by maintainers only.
---
Nitpick comments:
In `@src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts`:
- Around line 373-380: Add a new focused test case within the test suite that
directly invokes the NativeToolCallParser.finalizeRawChunks() method to validate
its behavior. This test should verify that the guard logic in finalizeRawChunks
works correctly, since the current test validates the processFinishReason path
which uses clearRawChunkState instead. Include assertions that confirm the
expected end events are produced by finalizeRawChunks() and that raw state is
properly finalized, ensuring regression coverage for the changes made to this
method in this PR.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 583ebfd1-e91f-4526-89e4-69d1726d9c0a
📒 Files selected for processing (5)
.changeset/fix-toolcall-dropped-leading-deltas.mdprs/fix-toolcall-dropped-leading-deltas.mdsrc/core/assistant-message/NativeToolCallParser.tssrc/core/assistant-message/__tests__/NativeToolCallParser.spec.tssrc/core/prompts/tools/native-tools/ask_followup_question.ts
|
@coderabbitai review |
✅ Action performedReview finished.
|
edelauna
left a comment
There was a problem hiding this comment.
Thanks! Had a couple comments on testing some edge cases.
|
|
||
| const resultA = finalized.get("call_a") | ||
| const resultB = finalized.get("call_b") | ||
| if (resultA?.type === "tool_use") { |
There was a problem hiding this comment.
If finalizeStreamingToolCall returned null here (the exact failure mode this PR fixes), this guard silently skips the payload assertion and the test passes green. Could you add expect(resultA).not.toBeNull() before this if? Same for resultB on line 460.
| expect(events.every((e) => e.id === "call_single")).toBe(true) | ||
|
|
||
| const result = finalized.get("call_single") | ||
| if (result?.type === "tool_use") { |
There was a problem hiding this comment.
Same pattern — if finalizeStreamingToolCall returns null, the path assertion is silently skipped.
| if (result?.type === "tool_use") { | |
| const result = finalized.get("call_single") | |
| expect(result).not.toBeNull() | |
| expect(result?.type).toBe("tool_use") | |
| if (result?.type === "tool_use") { | |
| expect((result.nativeArgs as { path: string }).path).toBe("src/single.ts") | |
| } |
| } | ||
| }) | ||
|
|
||
| it("handles id and name arriving in separate chunks (issue #218)", () => { |
There was a problem hiding this comment.
Could we also cover the reverse ordering — name in chunk 1, id in chunk 2, with buffered args in between? That path is valid under the new logic but nothing in the suite currently catches a mutation that drops tracked.id from the start-gate condition (!tracked.hasStarted && tracked.id && tracked.name at line 132 of the source).
|
|
||
| // Update name if present in chunk and not yet set | ||
| if (name) { | ||
| tracked.name = name |
There was a problem hiding this comment.
If a provider sends name: "" (empty string, permitted by the type), the falsy if (name) check silently discards it and tracked.name stays "" permanently — hasStarted would never fire for that tracker. Is if (name !== undefined) closer to the intended semantics?
| // Only emit an end for trackers that actually started. A tracker that | ||
| // never received an id/name (malformed stream) must not emit a phantom | ||
| // end; since start requires an id, hasStarted implies tracked.id is set. | ||
| if (tracked.hasStarted && tracked.id) { |
There was a problem hiding this comment.
processFinishReason emits tool_call_end for all started trackers but doesn't clear rawChunkTracker. Task.ts calls finalizeRawChunks() unconditionally afterward, which iterates the same map and emits another round of ends. Today the second wave is silently dropped (the id is already deleted from streamingToolCallIndices on the first pass), but it's a latent double-fire risk if that dispatch path changes. Would adding this.rawChunkTracker.clear() at the end of the if (finishReason === "tool_calls") block make finalizeRawChunks() a safe no-op for this path?
| const QUESTION_PARAMETER_DESCRIPTION = `Clear, specific question that captures the missing information you need` | ||
|
|
||
| const FOLLOW_UP_PARAMETER_DESCRIPTION = `Required list of 2-4 suggested responses; each suggestion must be a complete, actionable answer and may include a mode switch` | ||
| const FOLLOW_UP_PARAMETER_DESCRIPTION = `Required array of 1-4 suggested responses, always an array even for a single suggestion; each suggestion must be a complete, actionable answer and may include a mode switch` |
There was a problem hiding this comment.
This wording change (2→1 minimum, "list"→"array") looks like a follow-up to the prior minItems schema fix rather than something related to the streaming argument buffering in this PR. The PR description doesn't mention it — was this intentional to bundle here, or should it live in a separate commit so it's independently bisectable?
There was a problem hiding this comment.
lets remove this from diff
Related GitHub Issue
Closes: #695
Description
When a provider streams a tool call whose first delta(s) arrive before the tool-call
idis known, those leading argument bytes are silently discarded byNativeToolCallParser.processRawChunk. This causes downstream "missing required parameter" and other spurious provider-dependent errors even when the model supplied the correct tool use syntax.This PR fixes the issue by centralizing the tracking of streaming tool calls in
NativeToolCallParser. TherawChunkTrackeris now initialized on the first sight of a streamindex, independent of whether anidis present. Allargumentsdeltas are buffered until bothidandnameare known, ensuring no data loss during streaming reassembly.Test Procedure
src/core/assistant-message/__tests__/NativeToolCallParser.spec.tswhich verifies that leading argument bytes arriving before theidare correctly preserved and finalized.Pre-Submission Checklist
Screenshots / Videos
N/A
Documentation Updates
Additional Notes
N/A
Get in Touch
@awschmeder
Summary by CodeRabbit
Bug Fixes
Changes