perf: many-webviews stress harness + Metal dst-texture cache + leak/fd robustness #11
Merged
…pler example/lib/stress_probe.dart mounts a grid of N animating CefWebViews (shared or ephemeral profile via --dart-define=CEF_EPHEMERAL, optional CEF_CHURN create/dispose loop), reports Flutter frame timing, and offers +/- view controls. test/perf_sample.sh samples cef_host process count + RSS + CPU + the host-app fd count alongside it. Used to empirically verify smoothness + leak-freedom under many concurrent webviews. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CompositeMetalLocked wrapped slot->surface as a fresh MTLTexture every paint (up to 60fps per visible browser). The dest surface is stable except on resize, so cache the wrap on the Slot and recreate only when the wrapped IOSurface id changes, released wherever surface is (OnBeforeClose / create-teardown / resize), all under surface_mutex. Removes one of the two per-frame texture allocations. Verified leak-free under create/dispose churn (procs/RSS return to baseline every cycle). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
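The caching rule in this commit (wrap once, rewrap only when the surface id changes, drop the wrap wherever the surface is released) can be sketched without any Metal dependency. This is a minimal illustration, not the code in this PR: `WrappedTexture`, `DstTextureCache`, and `WrapFn` are hypothetical names, the wrap function stands in for the real IOSurface-to-MTLTexture call, and the caller is assumed to hold the slot's `surface_mutex` around every call.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for an MTLTexture that wraps one IOSurface.
struct WrappedTexture {
  uint32_t surface_id;
};

// Per-slot cache of the destination wrap. Not thread-safe on its own:
// the caller is assumed to hold the slot's surface mutex.
class DstTextureCache {
 public:
  using WrapFn = std::function<std::unique_ptr<WrappedTexture>(uint32_t)>;

  explicit DstTextureCache(WrapFn wrap) : wrap_(std::move(wrap)) {
    assert(wrap_ && "a wrap function is required");
  }

  // Returns the cached wrap, creating it on first use or after the
  // surface id changed (for example when a resize reallocated it).
  WrappedTexture* Get(uint32_t surface_id) {
    if (!cached_ || cached_->surface_id != surface_id) {
      cached_ = wrap_(surface_id);
      ++wrap_count_;
    }
    return cached_.get();
  }

  // Drops the wrap; call this at every site that releases the surface.
  void Release() { cached_.reset(); }

  // Number of wraps performed so far (useful for checking the cache works).
  int wrap_count() const { return wrap_count_; }

 private:
  WrapFn wrap_;
  std::unique_ptr<WrappedTexture> cached_;
  int wrap_count_ = 0;
};
```

With this shape, 60 paints against an unchanged surface cost one wrap instead of 60, and a resize costs exactly one more.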
… scale
- handleHostDeath's reaper now SIGKILLs + reaps a child still wedged after the grace window, instead of handing the pid back for a later terminateProcess() the clean-shutdown path may never make, closing a zombie/orphan-host leak. Removes the now-unused restoreSpawnedPid.
- Raise the soft RLIMIT_NOFILE toward the hard cap at plugin registration: each cef_host costs several fds, so many agent-controlled tiles could approach a GUI app's default soft limit (often 256) and fail spawns with EMFILE.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
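The fd-limit change boils down to one `getrlimit`/`setrlimit` pair. The following is a sketch of that idea using only POSIX calls, not the plugin's actual registration code: `RaiseSoftFdLimit` and its `wanted` parameter are hypothetical, and the `OPEN_MAX` clamp reflects the documented macOS behavior that a soft `RLIMIT_NOFILE` above `OPEN_MAX` is rejected.

```cpp
#include <sys/resource.h>

#include <algorithm>
#include <cassert>
#include <climits>

// Raises the soft RLIMIT_NOFILE toward the hard cap, never above `wanted`
// and never lowering it. Returns the soft limit in effect afterwards,
// or 0 if the current limit could not be read.
rlim_t RaiseSoftFdLimit(rlim_t wanted) {
  struct rlimit lim;
  if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return 0;

  rlim_t target = std::min(wanted, lim.rlim_max);
#ifdef OPEN_MAX
  // macOS rejects a soft limit above OPEN_MAX even if the hard limit is
  // RLIM_INFINITY, so clamp to it where the constant exists.
  target = std::min<rlim_t>(target, OPEN_MAX);
#endif
  if (target <= lim.rlim_cur) return lim.rlim_cur;  // already high enough

  const rlim_t previous = lim.rlim_cur;
  lim.rlim_cur = target;
  assert(lim.rlim_cur <= lim.rlim_max);
  if (setrlimit(RLIMIT_NOFILE, &lim) != 0) return previous;  // keep old limit
  return target;
}
```

A call such as `RaiseSoftFdLimit(4096)` early in process startup would lift a 256 soft limit without touching the hard cap; the right target depends on how many fds each cef_host actually costs, which this sketch does not measure.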
…om crispness
Render many OSR webviews on one shared cef_host reliably and fast, and keep them crisp when the canvas is zoomed in.

NEVER-BLANK. Creating N CEF browsers concurrently raced the first-frame GPU shared-image allocation on the one shared GPU/Viz process; losers silently Stop() -> permanent blank tile. Fix: the create-pacer now gates on first PAINT (not the bind ack) and is a sliding-window semaphore (createInFlight + an establishment window K, env FLUTTER_CEF_ESTAB_WINDOW, default 3). A patient watchdog (firstPaintGrace ~10s) reports onPaintStalled as a REPEATING signal so a consumer can do a bounded recreate; the per-slot begin-frame pump is liveness, so a blank tile is merely slow and paints when resources free. removeBrowser/hide and shutdown/host-death all free the establishment slot (no zombie-slot throttle).

FASTER CASCADE. The K window overlaps establishments (~3x faster median AND last-tile first-paint for 20 real sites vs strict serial). Renderer-priority flags (--disable-renderer-backgrounding + --disable-backgrounding-occluded-windows, opt out FLUTTER_CEF_KEEP_BG_THROTTLE) keep visible OSR renderers full-priority (OSR has no OS window, so Chromium throttles them); ~halves first-paint. about:blank-first is opt-in (FLUTTER_CEF_BLANK_FIRST).

CRISPNESS. dpr is now plumbed through the resize path so a canvas zoom re-renders the OSR surface at the on-screen density (was: dropped at every layer below the widget, so a zoomed-in tile upscaled a 1x texture -> blurry). CefWebView gains a renderScale prop + resizes on dpr change; CefWebSession.dpr is mutable and reallocates the IOSurface at logical*dpr; opResize carries dpr; cef_host updates slot->dpr + NotifyScreenInfoChanged. Clamped to <=8 on every layer.

CLEANUP. Removed the refuted experiments (kOpRecover/DoRecover resize-recovery, the coordinated-pump A/B + its knobs, born-hidden, gpu-mem/watchdog/verbose switches, dead Slot fields). Per-slot pump + the diag counters (FLUTTER_CEF_DEBUG) are kept.

TESTS. Dart: renderScale->dpr override, dpr-only change re-resizes (crispness regression), <=8 clamp. New asserting real-host probe test/run_cascade_probe.sh: N concurrent tiles all reach a first frame (never-blank). Pre-merge adversarial audit done; the create-pacer slot-accounting findings (M1/M2) are fixed here. Design notes in specs/osr-many-views.md + specs/osr-ecosystem-survey.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
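The create-pacer described under NEVER-BLANK is essentially a bounded window over a FIFO of pending creates, where a slot is released on first paint or on removal. Below is a minimal sketch of that accounting, assuming integer browser ids; `CreatePacer` and its method names are hypothetical and do not mirror the plugin's real types, and the watchdog and env-var parsing are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <unordered_set>
#include <vector>

// Sliding-window create pacer: at most `window` browsers may sit between
// "create started" and "first paint" at any time.
class CreatePacer {
 public:
  explicit CreatePacer(std::size_t window) : window_(window) {
    assert(window_ > 0 && "window must admit at least one create");
  }

  // Queues a create request; returns the ids allowed to start right now.
  std::vector<int> Request(int id) {
    pending_.push_back(id);
    return Admit();
  }

  // First paint arrived: the establishment slot is free again.
  std::vector<int> OnFirstPaint(int id) { return Free(id); }

  // Removal, hide, shutdown, or host death must free the slot as well,
  // otherwise a dead browser would throttle every later create.
  std::vector<int> OnRemoved(int id) { return Free(id); }

  std::size_t in_flight() const { return in_flight_.size(); }

 private:
  std::vector<int> Free(int id) {
    if (in_flight_.erase(id) == 0) {
      // Not in flight: drop it from the queue if it is still waiting.
      for (auto it = pending_.begin(); it != pending_.end(); ++it) {
        if (*it == id) {
          pending_.erase(it);
          break;
        }
      }
    }
    return Admit();
  }

  // Starts queued creates, oldest first, until the window is full.
  std::vector<int> Admit() {
    std::vector<int> started;
    while (in_flight_.size() < window_ && !pending_.empty()) {
      const int next = pending_.front();
      pending_.pop_front();
      in_flight_.insert(next);
      started.push_back(next);
    }
    return started;
  }

  std::size_t window_;
  std::deque<int> pending_;
  std::unordered_set<int> in_flight_;
};
```

With a window of 3 and five requests, ids 0 to 2 start immediately; each first paint or removal admits exactly one queued id, which is the overlap that the FASTER CASCADE numbers rely on.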
# Conflicts: # packages/flutter_cef_macos/macos/Classes/CefWebSession.swift
Summary
Performance hardening for the many-concurrent-webviews scenario, after an empirical stress audit. Key finding (verified): the engine is ALREADY smooth (0% jank) and leak-free at 12 concurrent webviews — so this PR is headroom + robustness, not a fix for observed jank. The #1 perf+leak risk is consumer wiring (Campus's ephemeral-host-per-tile), handed off separately.
Verified empirically (stress probe + sampler, added here)
Changes
- example/lib/stress_probe.dart (N animating CefWebViews; shared/ephemeral/churn modes via --dart-define) + test/perf_sample.sh (cef_host procs/RSS/CPU + host-app fd sampler).
- Cache the wrapped MTLTexture per-Slot instead of wrapping it every frame; recreate only when the wrapped surface id changes; released at every surface site under surface_mutex. Verified leak-free under churn. (Marginal CPU at current scale: the bottleneck is the GPU waitUntilCompleted sync + uncoordinated pumps; see Deferred.)
- Host-death reaper SIGKILLs + reaps a wedged child (removes the now-unused restoreSpawnedPid); raise soft RLIMIT_NOFILE at plugin registration (avoids EMFILE spawn failures with many agent-controlled tiles).
Deferred to a follow-up (by design)
The real render bottleneck is the per-frame GPU waitUntilCompleted sync + the N uncoordinated per-view 16 ms begin-frame pumps. The fix (a single CVDisplayLink driving all visible slots in phase + batching blits onto one command buffer/tick + idle-page pump backoff) is a delicate hot-path rework. Since the engine is already smooth, it's headroom for higher counts / 120 Hz and is best done as a dedicated, heavily-measured effort.
Verification
flutter analyze clean (package + sub-packages + example); example builds + renders; stress probe smooth + leak-free; CDP filter suite unaffected.
🤖 Generated with Claude Code