Skip to content

Fix HTTP/2-on-HTTP/1 mislabel on resumed TLS sessions and the recv_status spin#889

Merged
benoitc merged 4 commits into
masterfrom
fix/h2-alpn-resumption-mislabel
Jun 18, 2026
Merged

Fix HTTP/2-on-HTTP/1 mislabel on resumed TLS sessions and the recv_status spin#889
benoitc merged 4 commits into
masterfrom
fix/h2-alpn-resumption-mislabel

Conversation

@benoitc

@benoitc benoitc commented Jun 18, 2026

Copy link
Copy Markdown
Owner

Fixes intermittent, reuse-only hangs against HTTPS/h2 servers where one CPU is pegged.

hackney enables TLS 1.3 session resumption for the default config. On a resumed session ssl:negotiated_protocol/1 returns protocol_not_negotiated, so the connection was labeled http1 while the server spoke h2. The HTTP/1 parser could not parse the h2 SETTINGS frame and recv_status looped forever on the unconsumable buffer.

recv_status fail-fast

recv_status now reads from the socket on {more} and fails fast with {error, {bad_response, not_http}} when the buffered bytes cannot begin an HTTP/1 status line (after tolerating leading CRLF), instead of spinning.

Carry ALPN across resumption

The protocol learned on a full handshake is remembered and reused on resumed sessions:

  • A per {Host, advertised-ALPN} memo records the protocol from a full handshake (advertised ALPN kept in offered order).
  • session_tickets is gated on that memo: resumption is offered only once the protocol is known, and the cached value is snapshotted at the gate and carried into resolution, so a concurrent eviction cannot relabel a resumed connection.
  • Resolution recalls the cached protocol only when ssl:connection_information reports the session actually resumed. A full handshake that reports no ALPN is a real http1 and refreshes the memo (no stale http2).
  • Only connections tied to the resumable ticket source update the memo, i.e. hackney's automatic tickets ({session_tickets, auto}, added by effective_opts for the default config). A custom-ssl_options handshake, including one carrying {session_tickets, disabled} or manual, uses its own reported ALPN and never writes the shared entry, so it cannot poison a later default resumed session.
  • session_tickets is excluded from the pool key so the gate does not churn buckets.

When a memo entry is evicted while OTP still holds a ticket, the next connection finds no entry, so the gate does a full handshake (no resumption), which reports ALPN and re-learns it. Eviction is never a mislabel, only an extra handshake.

The 4.4.5 changelog entry also covers the previously-merged but unreleased keep-alive pooling fix (#888).

benoitc added 4 commits June 18, 2026 11:07
…atus spin

On a resumed TLS 1.3 session ssl:negotiated_protocol/1 reports nothing, so a
reused HTTP/2 connection was labeled HTTP/1, fed h2 frames to the h1 parser, and
recv_status looped forever on an unconsumable buffer (one CPU pegged).

Carry the ALPN protocol across resumption: a per host and advertised-ALPN memo
records the protocol learned on a full handshake; session_tickets is gated on
that memo so resumption is offered only once the protocol is known, and a resumed
session resolves against the gate-time snapshot only when it actually resumed.
session_tickets is excluded from the pool key so the gate does not churn buckets.

Guard recv_status: read from the socket on {more} and fail fast with
{error, {bad_response, not_http}} when the bytes cannot start an HTTP/1 status
line, instead of spinning.
A custom-ssl_options handshake (not resumption-eligible, does not seed the
ticket store) could overwrite the shared {Host, AlpnProtos} ALPN entry, so a
later default resumed session read a poisoned snapshot and spoke the wrong
protocol.

Only handshakes hackney enabled resumption for (session_tickets in the effective
opts, i.e. the default config, the single ticket source per host) now update the
memo; non-resumable handshakes use their own reported ALPN and leave the entry
untouched.
The previous check treated any session_tickets entry as resumable, so a custom
ssl_options handshake carrying {session_tickets, disabled} or manual could still
overwrite the shared {Host, AlpnProtos} memo. Gate on {session_tickets, auto}
(what effective_opts adds for the resumable default config) via the new
hackney_ssl:auto_tickets/1, so only that ticket source updates the memo.
@benoitc benoitc merged commit 73570d4 into master Jun 18, 2026
6 checks passed
@benoitc benoitc mentioned this pull request Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant