Close servlet streamable HTTP transports on async lifecycle events #1027
lxq19991111 wants to merge 1 commit
Conversation
Hi @lxq19991111, does your PR resolve the same issue as PR #726?
Hi @ShemTovYosef, thanks for linking #726.
This PR addresses the same production symptom, but not the same code path. #726 targets the older WebMvcSseServerTransportProvider / McpServerSession path. This PR targets the core HttpServletStreamableServerTransportProvider used by Streamable HTTP.
The issue being fixed here is that Streamable HTTP servlet SSE responses are not consistently wired to servlet async lifecycle cleanup. When the client disconnects, the current GET/POST SSE transport can remain open, leaving Tomcat resources stuck in CLOSE_WAIT.
This PR specifically covers:
So I agree it is related to #726 in terms of the observed CLOSE_WAIT / Tomcat resource leak symptom, but I don’t think it is a duplicate. It fixes the Streamable HTTP servlet transport path.
Hi @Kehrlann, can you take a look at this solution for a real production issue?
Summary
Fixes #1021.
Register servlet async lifecycle cleanup for Streamable HTTP SSE transports created by HttpServletStreamableServerTransportProvider.
This PR makes the current HTTP/SSE transport close when the servlet async context completes, times out, or errors. It also routes SSE write failures through the transport close() path instead of directly removing the logical MCP session from the session registry.
Motivation and Context
The servlet Streamable HTTP transport creates async SSE responses for multiple request paths:
When a client disconnects, the current HTTP/SSE response can become unusable while the SDK still keeps the async context and transport state alive. In production this can leave server-side sockets in CLOSE-WAIT and keep Tomcat resources tied up until an external timeout or kernel keepalive eventually reclaims them.
This issue was first observed in Spring AI WebMVC usage, but the lifecycle gap is in the MCP Java SDK core servlet transport.
Related Issues and PRs
HttpServletStreamableServerTransportProvider removes session on non-fatal failure #952 and fix: retain streamable HTTP session on response write failure #972: a request-specific SSE write failure should not immediately remove the logical MCP session.
CLOSE-WAIT / Tomcat resource growth). fix close_wait and tomcat connection keeps growing #726 targets the older WebMvcSseServerTransportProvider / McpServerSession path, while this PR targets the core HttpServletStreamableServerTransportProvider Streamable HTTP servlet path.
Changes
close() path on handling failures.
Rationale
A TCP/SSE stream lifecycle is not the same as an MCP logical session lifecycle.
A servlet async error, timeout, completion, or response write failure proves that the current HTTP/SSE transport is no longer usable. It does not necessarily prove that the logical MCP session should be removed from the session registry.
This distinction matters for Streamable HTTP because request-specific POST responses can fail or be closed after the response has already been delivered. Removing the logical session on that single transport failure causes the next request with the same mcp-session-id to fail with Session not found.
This PR therefore closes the current transport and completes the associated servlet async context, while leaving logical session eviction to existing protocol-level paths such as DELETE, server shutdown, or a follow-up liveness policy such as #1028.
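The separation between stream lifecycle and session lifecycle can be illustrated with a minimal sketch. It uses only the JDK, and every type in it (SketchStream, SketchSession, SketchRegistry) is a hypothetical stand-in invented for illustration, not a class from the MCP Java SDK or this PR. It assumes a stream failure closes only the failed stream, while the session stays in the registry until an explicit delete.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for one HTTP/SSE response stream (not an SDK type).
final class SketchStream {
    private volatile boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }
}

// Hypothetical stand-in for a logical session that can outlive any single stream.
final class SketchSession {
    private final String id;
    private final AtomicReference<SketchStream> current = new AtomicReference<>();

    SketchSession(String id) { this.id = id; }

    String id() { return id; }

    // Attach a fresh stream, as a new GET/POST SSE response would.
    SketchStream attach() {
        SketchStream stream = new SketchStream();
        current.set(stream);
        return stream;
    }

    // A write failure or async error closes only the failed stream.
    void onStreamFailure(SketchStream failed) {
        failed.close();
        current.compareAndSet(failed, null);
    }
}

// Hypothetical session registry keyed by session id.
final class SketchRegistry {
    private final Map<String, SketchSession> sessions = new ConcurrentHashMap<>();

    SketchSession create(String id) {
        SketchSession session = new SketchSession(id);
        sessions.put(id, session);
        return session;
    }

    // Returns null when the session is unknown (the "Session not found" case).
    SketchSession lookup(String id) { return sessions.get(id); }

    // Eviction happens only on an explicit protocol-level action such as DELETE.
    void delete(String id) { sessions.remove(id); }
}
```

In this sketch, a follow-up request carrying the same session id still resolves after onStreamFailure, because nothing in the failure path touches the registry.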
The async listener reuses the existing stream/transport close paths. These paths are guarded and idempotent, so cleanup can be triggered consistently from servlet async completion, timeout, error, and write-failure paths.
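A guarded, idempotent close of the kind described above might look like the following sketch. It is JDK-only and does not reproduce the PR's code: GuardedTransport and CleanupListener are hypothetical names, and the listener merely mirrors the onComplete, onTimeout, and onError callback names of the servlet AsyncListener interface without depending on the servlet API. The counter stands in for whatever real cleanup (such as completing the async context) the transport performs.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical transport; only the guarded close() pattern is the point here.
final class GuardedTransport {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final AtomicInteger releases = new AtomicInteger();

    void close() {
        // compareAndSet lets exactly one caller perform the cleanup,
        // even if several lifecycle callbacks race.
        if (closed.compareAndSet(false, true)) {
            releases.incrementAndGet(); // stand-in for releasing real resources
        }
    }

    boolean isClosed() { return closed.get(); }

    int releaseCount() { return releases.get(); }
}

// Hypothetical listener: every lifecycle event funnels into the same close().
final class CleanupListener {
    private final GuardedTransport transport;

    CleanupListener(GuardedTransport transport) { this.transport = transport; }

    void onComplete() { transport.close(); }

    void onTimeout() { transport.close(); }

    void onError(Throwable cause) { transport.close(); }
}
```

Because the guard makes repeated calls harmless, the listener does not need to know which event fired first or whether a write failure already closed the transport.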
Scope
This PR intentionally focuses on the MCP Java SDK core servlet Streamable HTTP transport:
HttpServletStreamableServerTransportProvider
Out of scope:
HttpServletSseServerTransportProvider
How Has This Been Tested?
The added tests cover:
Types of changes
Checklist