feat: N=1 cooperative fiber scheduler replacing per-thread host threads #137
smmathews wants to merge 1 commit into
Leaving this in draft while I give it a more thorough review, but overall I think it should be a more robust solution than #134.
Still leaving this in draft: I don't like that Vita is broken and I can't test it. Working on it.
Updated and out of draft. Since the last revision:
Happy to re-merge over #140 if that lands first.
Replace the per-guest-thread std::thread model (g_hostThreads) with an N=1
cooperative fiber scheduler: exactly one host executor thread runs all guest
EE threads as fibers, highest priority first, FIFO within a priority, with
cooperative yield points sampled every 128 back-edges. This matches the real
EE's single-core execution model that games' kernel-object usage assumes
(semaphores as ownership tokens, priority scheduling, RotateThreadReadyQueue
rotation) and removes the cross-thread races and lock-order deadlocks of the
previous model.

Blocking syscalls park through a publish -> arm_park -> block_current
protocol whose wake_pending handshake cannot lose wakeups; all wait sites
re-check their predicates in Mesa-style loops. Host workers (IRQ, vsync,
alarm, RPC) either wake fibers through gated entry points (with
{generation,tid} tokens rejecting stale wakeups after tid reuse) or borrow
the guest token via AsyncGuestScope. Host threads carrying a guest tid get
full ThreadInfo wait bookkeeping (wait lists, ReleaseWaitThread targeting)
but never park on the fiber scheduler; they poll with bounded backoff.

Four fiber backends: ucontext (POSIX default), Win32 Fibers (_WIN32),
SceFiber (PLATFORM_VITA; compiles against a real VitaSDK, untested on
hardware, no guard page - documented), and pthread (PS2X_FIBER_PTHREAD=ON,
exists so ThreadSanitizer can instrument the scheduler). PS2X_SANITIZE wires
TSan/ASan through every target from the top-level CMakeLists.

Merges upstream main through ran-j#143 and preserves its guest-visible semantics
(ran-j#136 sid-on-success semaphore returns throughout, including older tests).

Bugs found and fixed while hardening: lost ReleaseWaitThread during Mesa
retries; alarm-worker use-after-free vs CancelAlarm; non-fiber WaitEventFlag
and WaitForNextVSyncTick returning without waiting; INTC/DMAC enable-mask
races; a terminate or resume wake dropped in the publish->park window; a
tid-recycling ABA in join_fiber; main-thread guest identity (tid 1)
restored to upstream semantics.

Tests: 351/351 across ucontext and pthread backends (deterministic across
repeated runs), TSan-clean except two documented intentional reports on the
guest-visible vsync words plus one pre-existing upstream report
(updateGsCsrFieldForVSync), ASan/LSan clean. Scheduler suites cover the
wakeup-window protocol, suspend/resume gating, tid reuse, terminate windows,
priority rotation and join floors, shutdown ordering, and borrowed workers,
with regression tests that fail against the unfixed code.
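A minimal sketch of how {generation,tid} tokens can reject a stale wakeup after tid reuse, as described in the commit message above. This is not the PR's code: `ThreadTable`, `WakeToken`, `create`, `destroy`, and `wake_validated` are hypothetical names chosen for illustration, and the real scheduler's data structures are not shown in this page.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical token handed to a host worker when it captures a wake target.
struct WakeToken {
    std::uint32_t generation;  // bumped every time a tid is (re)allocated
    int tid;                   // guest thread id, which may be recycled
};

// Hypothetical thread table: maps each live tid to its current generation.
class ThreadTable {
public:
    // Allocate (or recycle) a tid and return the token that identifies
    // this particular incarnation of it.
    WakeToken create(int tid) {
        const std::uint32_t gen = ++next_generation_;
        live_[tid] = gen;
        return WakeToken{gen, tid};
    }

    // The thread exits; its tid becomes available for reuse.
    void destroy(int tid) { live_.erase(tid); }

    // Deliver a wakeup only if the token still names the live incarnation.
    // A token captured before the tid was recycled fails the generation check.
    bool wake_validated(const WakeToken& token) {
        const auto it = live_.find(token.tid);
        if (it == live_.end() || it->second != token.generation) {
            return false;  // stale: thread gone, or tid reused by a new thread
        }
        ++delivered_;
        return true;
    }

    int delivered() const { return delivered_; }

private:
    std::unordered_map<int, std::uint32_t> live_;
    std::uint32_t next_generation_ = 0;
    int delivered_ = 0;
};
```

In this sketch a worker that captured a token for tid 5, then saw tid 5 destroyed and recreated, gets `false` from `wake_validated` for the old token, while a token from the new incarnation is accepted. Comparing the tid alone would have woken the wrong thread.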
Re-merged after #140/#141 landed — branch is again a single commit (93be65c) on current main, conflict-free, all CI green. Notes on the resolution:
Local validation: 354/354 (incl. the new dispatchGuestBranch tests) on ucontext and pthread backends, TSan and ASan/LSan clean apart from the documented intentional MMIO-polling reports. One observation from the merge, independent of this PR: #140 made […]
What this replaces and why
Upstream runs each guest EE thread on its own host std::thread (g_hostThreads in Helpers/State.h, spawned from Thread.cpp/ps2_runtime.cpp). That model allows two guest threads to execute truly concurrently, which the real EE (a single core) never does: games' kernel-object usage (semaphores as pure ownership tokens, priority-based cooperative scheduling, RotateThreadReadyQueue round-robin) assumes exactly one running thread. Under the std::thread model this shows up as (DQ8 examples observed while debugging): audio-thread starvation, missed wakeups between WaitSema/SignalSema pairs, and lock-order deadlocks between the run token and guest mutexes (earlier attempt: #134).

This PR replaces that model with an N=1 cooperative fiber scheduler:

- […] RotateThreadReadyQueue rotates the equal-priority group).
- […] ps2_fiber.{h,cpp}) with per-fiber R5900Context state; blocking syscalls park the fiber via a publish/arm/park protocol that cannot lose wakeups (see below).
- […] AsyncGuestScope) so handler code is serialized against fibers.
- […] yield_point() every 128 back-edges (terminate/suspend/priority checks); blocking syscalls yield cooperatively.

Wakeup protocol (the core correctness argument)
A blocking syscall does: publish to the object wait-list under the object mutex → arm_park() (marks Blocked under the scheduler mutex) → release object mutex → block_current(). A waker that fires in the publish/arm window finds the fiber still running and records wake_pending instead of enqueueing (wake_locked); block_current() consumes it and returns WokenInWindow without parking; all wait sites re-check their predicate in a Mesa-style loop. Wakeups gate on suspendCount (a suspended waiter stays THS_WAITSUSPEND until ResumeThread). Stale wakeups after tid reuse are rejected by {generation, tid} tokens (enqueue_external_wakeup_validated). Host threads that call blocking syscalls (no fiber) use bounded-backoff Mesa loops on the same predicates.

Backends
- ucontext (POSIX default): […] PROT_NONE guard page
- Win32 Fibers (_WIN32): […] FIBER_FLAG_FLOAT_SWITCH
- SceFiber (PLATFORM_VITA): sceKernelAllocMemBlock (no guard page; Vita has no per-subrange protection API; documented); […] _sceFiberInitializeImpl, <psp2/fiber.h>); untested on hardware
- pthread (-DPS2X_FIBER_PTHREAD=ON): pthread_attr_setstack; […] swapcontext); full suite green

-DPS2X_SANITIZE=thread|address (top-level) wires sanitizers through every target; TSan requires the pthread backend (enforced at configure time).

Testing
- […] updateGsCsrFieldForVSync (untouched by this PR; fixed separately in #145, "fix(runtime): make GS CSR atomic to fix vsync-worker/guest data race").
- […] ReleaseWaitThread during Mesa retries, alarm-worker use-after-free vs CancelAlarm, non-fiber WaitEventFlag/WaitForNextVSyncTick returning without waiting, INTC/DMAC enable-mask races, a terminate/suspend wake dropped in the publish→park window, and a tid-recycling ABA in join_fiber.

Guest-visible semantics notes (for review against ps2sdk)
- GetThreadId from the primary host thread returns 1 (upstream reserved tid 1 for main; the runtime claims it at construction).
- SetEventFlag/SignalSema arriving while suspendCount > 0 leaves the thread THS_WAITSUSPEND; it completes the wait after ResumeThread. (Divergence from literal EE kernel timing, where the wake transitions WAITSUSPEND→SUSPEND immediately; the end state after resume is identical. Called out in the adapted kernel test.)
- […] Exiting state so a guest exit handler can make blocking syscalls without the scheduler freeing the live trampoline frame.

Vita status (honest labeling)
The SceFiber backend is compile-verified against the real VitaSDK toolchain (reproducible container harness; zero warnings) but has not run on hardware, because I don't have a Vita to test with. The previous "vita is broken" state is resolved at the compile level: <psp2/fiber.h>, _sceFiberInitializeImpl (vita-headers exposes no sceFiberInitialize wrapper), and sceKernelAllocMemBlock stacks (VitaSDK has no <sys/mman.h>). If someone with hardware can boot-test, I'll support fixes.

Merge/rebase note
This branch is a single commit on top of current main (through #143), so it applies cleanly. If #140 lands first I'm happy to rebase over it.
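
For reviewers, the publish → arm_park → block_current sequence from the wakeup protocol section can be modelled roughly as below. This is a sketch under stated assumptions, not the code in ps2_fiber.{h,cpp}: `ParkSlot`, `Sema`, `wait_sema`, and `signal_sema` are hypothetical names, a condition variable stands in for the fiber switch, and the wait list holds a single waiter for brevity. Only `arm_park`, `block_current`, `wake_pending`, and `WokenInWindow` are names taken from the description.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

enum class ParkResult { Parked, WokenInWindow };

// Hypothetical per-fiber parking state (field layout is an assumption).
struct ParkSlot {
    std::mutex sched_mutex;        // stands in for the scheduler mutex
    std::condition_variable cv;    // stands in for switching away from the fiber
    bool armed = false;            // arm_park() ran, block_current() not finished
    bool parked = false;           // really blocked inside block_current()
    bool wake_pending = false;     // wake that landed in the publish/arm window
    bool woken = false;            // wake delivered to a parked waiter

    void arm_park() {
        std::lock_guard<std::mutex> lk(sched_mutex);
        armed = true;
    }

    ParkResult block_current() {
        std::unique_lock<std::mutex> lk(sched_mutex);
        if (wake_pending) {        // the waker got there first: do not park
            wake_pending = false;
            armed = false;
            return ParkResult::WokenInWindow;
        }
        parked = true;
        cv.wait(lk, [this] { return woken; });
        parked = woken = armed = false;
        return ParkResult::Parked;
    }

    void wake() {
        std::lock_guard<std::mutex> lk(sched_mutex);
        if (!armed) return;        // target is not waiting: nothing to do
        if (parked) {
            woken = true;          // parked: hand over the wake
            cv.notify_one();
        } else {
            wake_pending = true;   // still running: record instead of enqueueing
        }
    }
};

struct Sema {
    std::mutex m;                  // object mutex
    int count = 0;
    ParkSlot* waiter = nullptr;    // single-entry wait list, for brevity
};

void wait_sema(Sema& s, ParkSlot& self) {
    for (;;) {                     // Mesa-style: re-check the predicate after every wake
        std::unique_lock<std::mutex> lk(s.m);
        if (s.count > 0) { --s.count; return; }
        s.waiter = &self;          // 1. publish under the object mutex
        self.arm_park();           // 2. arm under the scheduler mutex
        lk.unlock();               // 3. release the object mutex
        self.block_current();      // 4. park, or consume wake_pending
    }
}

void signal_sema(Sema& s) {
    ParkSlot* w = nullptr;
    {
        std::lock_guard<std::mutex> lk(s.m);
        ++s.count;
        w = s.waiter;
        s.waiter = nullptr;
    }
    if (w) w->wake();              // wake outside the object mutex
}
```

Because the waiter is published and armed before the object mutex is released, a signaller that sees the waiter can only find it either armed and still running (the wake is recorded in `wake_pending`) or parked (the wake is delivered). In neither interleaving is the wake lost, and the loop in `wait_sema` re-checks the count whichever path `block_current` took.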