Normalize model rate limits into AG-UI run errors#128

Open
zoeshawwang wants to merge 5 commits into
mainfrom
fix/83566999-rate-limited-run-error

@zoeshawwang

Collaborator

Keep the change inside the SDK error-to-AG-UI boundary: detect structured 429/rate-limit signals, preserve RATE_LIMITED retry metadata on RUN_ERROR, and keep non-rate errors on their existing codes.

Constraint: Aone 83566999 requires model limit errors to end the AG-UI stream with RUN_ERROR rather than client-delivered text.

Rejected: funagent-core/front-end rewrite | downstream AG-UI frames are already pass-through and the SDK owns event semantics.

Rejected: broad generic error framework | current need is a small rate-limit normalizer.

Confidence: high

Scope-risk: narrow

Directive: Keep RUN_ERROR payload extensions whitelisted and preserve SSE event framing when adding fields.

Tested: uv run --extra server pytest tests/unittests/server/test_invoker.py tests/unittests/server/test_agui_protocol.py tests/unittests/integration/test_langgraph_events.py -q => 101 passed, 1 warning

Tested: uv run --extra server pytest tests/unittests/integration/test_langgraph_to_agent_event.py -q => 32 passed

Tested: git diff --check && uv run ruff check agentrun/server/error_utils.py agentrun/server/invoker.py agentrun/server/agui_protocol.py agentrun/integration/langgraph/agent_converter.py => passed

Tested: UltraQA dynamic AG-UI harness UQA-1..UQA-4 => passed

Change-Id: Ice926e2c21201071713ac39faeed736078fc5823
Co-developed-by: Codex <noreply@openai.com>
Not-tested: full repository test suite and remote CI pending

@zoeshawwang zoeshawwang requested review from OhYee and Copilot and removed request for OhYee June 25, 2026 11:20

Copilot AI left a comment

Pull request overview

This PR normalizes model rate-limit (HTTP 429 / throttling) errors into structured AG-UI RUN_ERROR events, ensuring rate-limit failures terminate the AG-UI stream with RUN_ERROR (and include retry metadata) while non-rate errors keep their existing error codes/messages.

Changes:

  • Introduces agentrun/server/error_utils.py to detect rate-limit signals and build normalized ERROR payloads (RATE_LIMITED, retryable, retryAfterMs, optional traceId).
  • Routes invoker and LangGraph conversion error handling through the shared normalizer to keep semantics consistent across boundaries.
  • Updates AG-UI encoding to whitelist/preserve specific RUN_ERROR extension fields, with added unit/integration tests covering positives and false-positives.
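
The detection and normalization flow listed above might look roughly like the following. This is a minimal sketch and not the contents of agentrun/server/error_utils.py: the function names `is_rate_limited`, `retry_after_ms`, and `build_error_payload`, the `status_code` and `headers` attribute names, and the fallback `AGENT_ERROR` code are assumptions for illustration. Only the `RATE_LIMITED` code and the `retryable`, `retryAfterMs`, and `traceId` fields are taken from the PR.

```python
from typing import Any, Dict, Optional


def is_rate_limited(exc: BaseException) -> bool:
    """Return True when the exception carries a structured HTTP 429 signal."""
    # Assumption: provider errors expose the HTTP status as `status_code`.
    return getattr(exc, "status_code", None) == 429


def retry_after_ms(exc: BaseException) -> Optional[int]:
    """Convert a numeric Retry-After header (seconds) into milliseconds."""
    headers = getattr(exc, "headers", None)
    items = headers.items() if hasattr(headers, "items") else []
    for key, value in items:
        if str(key).lower() == "retry-after":
            try:
                return int(float(value) * 1000)
            except (TypeError, ValueError):
                return None  # the HTTP-date form is not handled in this sketch
    return None


def build_error_payload(
    exc: BaseException,
    default_code: str = "AGENT_ERROR",  # hypothetical non-rate-limit code
    trace_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an ERROR payload; only rate-limit errors get retry metadata."""
    payload: Dict[str, Any] = {"message": str(exc), "code": default_code}
    if is_rate_limited(exc):
        payload["code"] = "RATE_LIMITED"
        payload["retryable"] = True
        delay = retry_after_ms(exc)
        if delay is not None:
            payload["retryAfterMs"] = delay
    if trace_id:
        payload["traceId"] = trace_id
    return payload
```

Errors without a structured 429 signal fall through with the default code and no retry fields, which mirrors the stated goal of leaving non-rate errors unchanged.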

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| agentrun/server/error_utils.py | New shared helper to detect rate-limit errors and build normalized ERROR payloads. |
| agentrun/server/invoker.py | Uses the shared helper to normalize ERROR events emitted by the invoker. |
| agentrun/server/agui_protocol.py | Whitelists and preserves selected extra fields when encoding RUN_ERROR SSE payloads. |
| agentrun/integration/langgraph/agent_converter.py | Reuses the shared helper to normalize LangGraph-derived error events. |
| tests/unittests/server/test_invoker.py | Adds invoker-level tests for structured rate-limit detection and false-positive avoidance. |
| tests/unittests/server/test_agui_protocol.py | Adds AG-UI stream tests ensuring RUN_ERROR payload shape/fields and no RUN_FINISHED on rate limits. |
| tests/unittests/integration/test_langgraph_to_agent_event.py | Adds integration coverage for rate-limited vs non-rate LLM errors in conversion. |
| tests/unittests/integration/test_langgraph_events.py | Mirrors integration coverage for LangGraph event conversion behavior. |
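
The whitelisting described for agentrun/server/agui_protocol.py could be sketched as below. The names `RUN_ERROR_EXTRA_FIELDS` and `to_run_error_event` are hypothetical, and the exact set of allowed keys is an assumption based on the fields named in this PR. The sketch only builds the event dictionary and leaves SSE framing to whatever encoder the SDK already uses.

```python
from typing import Any, Dict

# Assumption: the extension fields allowed through to RUN_ERROR.
RUN_ERROR_EXTRA_FIELDS = ("retryable", "retryAfterMs", "traceId")


def to_run_error_event(error: Dict[str, Any]) -> Dict[str, Any]:
    """Build a RUN_ERROR event, copying only whitelisted extension fields."""
    event: Dict[str, Any] = {
        "type": "RUN_ERROR",
        "message": str(error.get("message", "")),
        "code": error.get("code"),
    }
    for field in RUN_ERROR_EXTRA_FIELDS:
        # Keep explicit False/0 values; drop only absent or None fields.
        if error.get(field) is not None:
            event[field] = error[field]
    return event
```

Any key outside the whitelist, such as a raw provider stack trace, is dropped rather than forwarded to the client.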

Comment thread agentrun/server/error_utils.py Outdated
Comment on lines +133 to +142
def _get_header(headers: Any, name: str) -> Optional[Any]:
    if isinstance(headers, dict):
        for key, value in headers.items():
            if str(key).lower() == name:
                return value
        return None
    get = getattr(headers, "get", None)
    if callable(get):
        return get(name)
    return None
zoeshawwang pushed a commit that referenced this pull request Jun 25, 2026
Normalize the requested header name before comparing so mixed-case callers still match HTTP headers case-insensitively.

Constraint: Copilot review comment on PR #128 flagged _get_header as unexpectedly case-sensitive for mixed-case lookup names.

Rejected: broader header abstraction | a one-line normalization preserves the current helper boundary.

Confidence: high

Scope-risk: narrow

Tested: uv run --extra server pytest tests/unittests/server/test_error_utils.py tests/unittests/server/test_invoker.py tests/unittests/server/test_agui_protocol.py tests/unittests/integration/test_langgraph_events.py -q => 102 passed, 1 warning

Tested: uv run --extra server pytest tests/unittests/integration/test_langgraph_to_agent_event.py -q => 32 passed

Tested: git diff --check && uv run ruff check agentrun/server/error_utils.py tests/unittests/server/test_error_utils.py => passed

Change-Id: I23e62a3b01e88c1b39f8669f89490b1c7f5e9ddf
Co-developed-by: Codex <noreply@openai.com>
Not-tested: full repository test suite and remote CI pending
Signed-off-by: congxiao.wxx <congxiao.wxx@alibaba-inc.com>
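
The one-line normalization described in this commit can be sketched as follows. This is an illustration of the idea and not the committed code: the public name `get_header` and the local variable `wanted` are assumptions, and the non-dict branch still relies on the header object's own case handling.

```python
from typing import Any, Optional


def get_header(headers: Any, name: str) -> Optional[Any]:
    """Look up a header value, matching the name case-insensitively."""
    wanted = name.lower()  # normalize the requested name once
    if isinstance(headers, dict):
        for key, value in headers.items():
            if str(key).lower() == wanted:
                return value
        return None
    # Mapping-like header objects: defer to their own `get` implementation.
    get = getattr(headers, "get", None)
    if callable(get):
        return get(name)
    return None
```

With the name lowered before the comparison, a caller asking for `Retry-After` matches a stored `retry-after` key and the reverse.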
Close the review gaps by moving the helper to a shared utils layer, preserving AG-UI encoder framing, limiting LangGraph normalization to LLM errors, and adding false-positive plus throttling regression coverage.

Constraint: Aone 83566999 requires model rate limits to surface as stable AG-UI RUN_ERROR without leaking raw provider errors.

Rejected: Matching generic HTTP/status/code 429 text | it misclassified explanatory non-rate-limit errors.

Rejected: Hand-written RUN_ERROR SSE framing | it forked the AG-UI encoder contract.

Confidence: high

Scope-risk: narrow

Directive: Keep future rate-limit text matching limited to explicit provider throttle/rate-limit semantics.

Tested: uv run --extra server pytest tests/unittests/server/test_error_utils.py tests/unittests/server/test_invoker.py tests/unittests/server/test_agui_protocol.py tests/unittests/integration/test_langgraph_events.py tests/unittests/integration/test_langgraph_to_agent_event.py -q (143 passed, 1 warning); git diff --check; focused ruff; local uvicorn /ag-ui/agent harness for structured 429, throttling text, and false-positive code 429.

Change-Id: Ie114d646380623dc8546f082cf2b0776035ccade
Co-developed-by: Codex <noreply@openai.com>
Not-tested: GitHub CI status due local gh auth/API limitations.
Signed-off-by: congxiao.wxx <congxiao.wxx@alibaba-inc.com>
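
The directive in this commit, matching only explicit throttle or rate-limit wording and not a bare 429 in free text, might be sketched like this. The pattern and the name `looks_rate_limited` are assumptions for illustration; the phrases the SDK actually matches are not shown in this PR page.

```python
import re

# Assumption: phrases that signal throttling explicitly. A bare "429" is
# deliberately absent so explanatory messages are not misclassified.
_RATE_LIMIT_TEXT = re.compile(
    r"rate[\s_-]?limit(?:ed|s)?|too many requests|throttl(?:ed|ing)",
    re.IGNORECASE,
)


def looks_rate_limited(message: str) -> bool:
    """Return True only for explicit rate-limit or throttling wording."""
    return bool(_RATE_LIMIT_TEXT.search(message))
```

A message such as "Validation failed with code 429 in field mapping" stays on its existing error code under this rule, while a structured status of 429 would still be handled by the status-based check.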
Constraint: Users need the original provider error text in AG-UI RUN_ERROR.message while still normalizing retry metadata.

Rejected: Keep fixed Chinese rate-limit copy | It hides the actionable upstream model error from clients.

Confidence: high

Scope-risk: narrow

Directive: Keep RATE_LIMITED code and retry fields stable, but do not replace message with generic copy.

Tested: uv run --extra server pytest tests/unittests/server/test_error_utils.py tests/unittests/server/test_invoker.py tests/unittests/server/test_agui_protocol.py tests/unittests/integration/test_langgraph_events.py tests/unittests/integration/test_langgraph_to_agent_event.py -q; uv run ruff check targeted files; git diff --check; uvicorn local harness for status 429/throttling/false-positive 429.

Not-tested: GitHub CI not yet read back after this commit.

Change-Id: I5a1b893d34dd936bf172ca980d3e7d3a496a60d4
Co-developed-by: Codex <noreply@openai.com>
Signed-off-by: congxiao.wxx <congxiao.wxx@alibaba-inc.com>
Constraint: Aone 83566999 only needs model 429 output to terminate as AG-UI RUN_ERROR without hardcoded user-facing copy.

Rejected: fixed localized message or broad error framework | original provider error text is required and a small regex/status-code guard is enough.

Confidence: high

Scope-risk: narrow

Directive: Keep RUN_ERROR.message sourced from the original error text; do not introduce generic friendly copy.

Tested: uv run --extra server pytest targeted files -q; uv run ruff check targeted source/server-test files; git diff --check; uvicorn local harness for text 429, structured 429, and normal text.

Not-tested: GitHub CI final status and DCO fix; earlier pushed commits still require sign-off remediation.
Signed-off-by: congxiao.wxx <congxiao.wxx@alibaba-inc.com>

Change-Id: I248d3f29cfd95560edc48f22d2bd014cc5762cef
Co-developed-by: Codex <noreply@openai.com>
Signed-off-by: congxiao.wxx <congxiao.wxx@alibaba-inc.com>
@zoeshawwang zoeshawwang force-pushed the fix/83566999-rate-limited-run-error branch from d3d4a6d to e666ee7 Compare June 25, 2026 13:39