Shared-host JS channel fix + reliability hardening + regression tests #9
Merged
…drop addChannel before browser create)

A page->host JS channel (window.<name>.postMessage) was never injected on a SHARED-host session (a named profile where N tiles share one cef_host), so the page's postMessage silently no-op'd.

Root cause: on a shared host a session's createBrowser is QUEUED (pendingCreates), so the addChannel op arrives with browserId=0 (sent before the session's attach()), and `case kOpAddChannel: if (!slot) break;` SILENTLY DROPPED it -> the name never entered the process-global g_channels -> OnLoadStart injected no shim -> window.<name> undefined. This is the Campus "a peer's agent_ui edit never reaches the host" bug: the peer tile rides a shared cef_host, so window.campusHost was never injected and campus.emit died before the wire (host->page still worked, so the tile rendered and received state, which masked it).

Fix: don't require `slot` for kOpAddChannel. DoAddChannel now registers the name in g_channels regardless (OnLoadStart injects it on each future load) AND injects the shim into every already-loaded browser's main frame (order-independent), and is null-safe.

Reproduced + verified in isolation via example/lib/channel_probe_shared.dart (two controllers on one shared cef_host): before, both sessions host=N (shim absent, postMessage dead); after, host=Y and the shim postMessage reaches each session's own handler. The single-controller case (channel_probe.dart) always passed, isolating the bug to the shared/multi-session path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… race at the root

Follow-up to 5aae358. Rather than relying on cef_host accepting a browserId=0 addChannel op, the Swift CefWebSession now BUFFERS channel names and flushes them (with the real wire id) in attach(), and re-sends on a re-home. So kOpAddChannel always carries a valid browserId + slot. The cef_host side is simplified back to injecting the registering session's OWN frame (+ g_channels for the OnLoadStart re-load path), dropping the broad inject-into-all-browsers loop (the O(N*M) / cross-session-injection smell). The cef_host null-safe path stays as defense.

Verified on a real shared host (FLUTTER_CEF_ALLOW_INSECURE_PROFILE=1, two controllers on one cef_host) via example/lib/channel_probe_shared.dart: host=Y on both and the shim postMessage reaches each session's own handler.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
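The buffer-until-attach idea in this commit can be sketched as follows. This is only an illustration: the real CefWebSession is Swift, and the class name `ChannelBufferingSession`, the `SendFn` callback (standing in for writing a kOpAddChannel op to the wire), and the "0 means not attached" convention are assumptions made here for the sketch, not code from the PR.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the session; not the project's actual class.
class ChannelBufferingSession {
 public:
  // send(browserId, name) stands in for emitting a kOpAddChannel op.
  using SendFn = std::function<void(std::uint32_t, const std::string&)>;

  explicit ChannelBufferingSession(SendFn send) : send_(std::move(send)) {}

  // Always remember the name; only send once a real wire id exists.
  void addChannel(const std::string& name) {
    for (const auto& known : channels_) {
      if (known == name) return;  // already registered, nothing to do
    }
    channels_.push_back(name);
    if (browser_id_ != 0) send_(browser_id_, name);
  }

  // Called on first attach and again on a re-home: replay every buffered
  // name with the current id, so the op never carries browserId=0.
  void attach(std::uint32_t browser_id) {
    browser_id_ = browser_id;
    for (const auto& name : channels_) send_(browser_id_, name);
  }

 private:
  SendFn send_;
  std::uint32_t browser_id_ = 0;  // assumption: 0 means "not attached yet"
  std::vector<std::string> channels_;
};
```

With this shape, the order of `addChannel` and `attach` no longer matters: a name registered early is held back and flushed on attach, and a name registered late is sent immediately.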
…ath CDP cleanup

Two confirmed shared-host correctness bugs from the pre-pin audit:
- readAll/writeAll treated an EINTR-interrupted syscall as a dead pipe and tore down the WHOLE shared host (all N browsers wedge). Now retry on EINTR, matching the C++ ReadAll/WriteAll on the host side.
- handleHostDeath (unexpected cef_host death) cleared the create queues but leaked live CdpRelay listeners and stranded in-flight resolveTargetId waiters (enableAgentControl callers hung forever). Now mirror shutdown()'s teardown: stop all relays + fail all pending targetId waiters with nil.

Verified: example builds clean; multiview_probe (agent-control on a shared host) passes all 13 isolation/lifecycle checks.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…; never emit empty browserContextId

Two confirmed shared-host CDP bugs from the pre-pin audit:
- sendRawToClient ran on the SHARED CDP reader thread, and writeFrameLocked blocks up to SO_SNDTIMEO (~2s) on a stuck client, so one wedged agent client stalled CDP delivery to EVERY sibling agent-controlled tile on the same host. Now hop onto a per-relay SERIAL queue (preserves this client's frame order; isolates the wedged client); clientFd is re-validated under clientLock inside, so a concurrent stop()/handler-close is a graceful no-op.
- synthesizeAttachedToTarget could emit browserContextId:"" on reconnect (when the real id was never captured for the relay), crashing Playwright's CRBrowser assertion. Now skip the synthesized event; the real attachedToTarget carries it.

Verified: multiview_probe passes all 13 checks (CDP delivery + isolation intact).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
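The per-relay serial queue is presumably a dispatch queue on the Swift side; the same isolation pattern can be sketched in C++ with one worker thread per queue. The `SerialQueue` class below is a hypothetical illustration, not code from this PR: tasks posted to one queue run in order on that queue's own thread, so a task that blocks (for example a slow socket write) delays only its own queue.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical per-relay serial queue: FIFO tasks on a dedicated thread.
class SerialQueue {
 public:
  SerialQueue() : worker_([this] { Run(); }) {}

  // Drains any queued tasks, then joins the worker.
  ~SerialQueue() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stopping_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  SerialQueue(const SerialQueue&) = delete;
  SerialQueue& operator=(const SerialQueue&) = delete;

  // Never blocks on task execution; the caller (e.g. a shared reader
  // thread) returns immediately after enqueueing.
  void Post(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push_back(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stopping and fully drained
        task = std::move(tasks_.front());
        tasks_.pop_front();
      }
      task();  // run outside the lock so Post() stays cheap
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> tasks_;
  bool stopping_ = false;
  std::thread worker_;  // declared last so the other members exist first
};
```

In this sketch each relay would own one `SerialQueue` and the reader thread would only call `Post`, which preserves per-client frame order while keeping a wedged client from stalling its siblings.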
A named (shared) profile is one cef_host process with one cookie jar, so sessions sharing it are NOT mutually isolated: shared cookie jar + process-global JS channels (per-message routing stays per-session). Spell out the rule — co-locate only mutually-trusting content on one profile; give distrusting content its own profile or the ephemeral default. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Dart integration_test mocks the host method channel, so it cannot catch native channel-delivery regressions, which is how the shared-host page->host channel bug shipped. Add:
- a fast unit test (cef_web_controller_test.dart) for call-order independence: a channel registered before create() MUST be re-sent on create();
- test/run_channel_integration.sh: builds cef_host + the example and runs the channel / shared-host-channel / agent-control(CDP) probes against a REAL host, asserting each /tmp result. This is the layer that would have caught the B->A Campus regression.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Makes the shared-host (named-profile, multi-tile) path ready for the Campus pin bump. Motivated by a Campus bug where a peer's edit to an `agent_ui` tile never reached the owner: the root cause was a page→host JS channel silently dropped on a shared `cef_host`. A pre-pin readiness audit then surfaced a cluster of shared-host reliability bugs, fixed here.

The channel bug + root fix
- `fix(cef_host)`: on a shared host, `kOpAddChannel` arrived with `browserId=0` (createBrowser is queued, so the session hasn't `attach()`ed yet) and was dropped → the `window.<name>` shim was never injected → page→host messages died.
- `refactor(cef)`: fixed at the root. The Swift session now buffers `addChannel` until `attach()` and flushes with the real `browserId` (and re-sends on re-home), so the op always carries a valid session. cef_host is simplified to inject the registering session's own frame.

Shared-host reliability (from the audit)
- `readAll`/`writeAll`: a signal mid-IO (EINTR) is now retried instead of being treated as a dead pipe that tears down the whole host.
- `handleHostDeath` now stops live CDP relays + fails pending `targetId` waiters (no leaked listener ports / hung `enableAgentControl`).
- `sendRawToClient` moved off the shared reader thread (per-relay serial queue): one stuck client can't starve sibling agent-controlled tiles (was up to 2s SO_SNDTIMEO per frame).
- Never emit an empty `browserContextId`: it crashed Playwright's `CRBrowser` assertion on reconnect.

Docs + tests
- A unit test for call-order independence (a channel registered before `create()` must be re-sent on `create()`), plus `test/run_channel_integration.sh`, which runs the channel / shared-host-channel / agent-control (CDP) probes against a real `cef_host`. The mocked `integration_test` can't catch native delivery; this is the layer that would have caught the regression.

Verification
- `flutter test` green (52 controller tests incl. the new one).
- `run_channel_integration.sh` passes (shared-host channel routes per-session); `multiview_probe` passes all 13 agent-control isolation/lifecycle checks.

Scope notes
- `dpr` in `opResize`, CDP epoch validation, a CI assert that release builds are `CEF_HOST_ADHOC=OFF`.

🤖 Generated with Claude Code