feat(vcr-ra): flip stack-reload-forwarding + frame-slot DCE to default-on (#242) #516
Merged
…t-on (#242)

The two paired frame-traffic passes shipped flag-off in #514/#515 go DEFAULT-ON:

- `forward_stack_reloads`: a `local.set; local.get` reload becomes `mov rY,rX` when rX still holds the value.
- `eliminate_dead_frame_stores`: the now-dead `str rX,[sp,#N]` whose slot is overwritten before it is read is removed.

Escape hatch: `SYNTH_NO_STACK_FWD=1` restores the frame-resident bytes. Same gated path as the cmp→select (v0.13.0) and local-promotion (v0.14.0) flips.

The win lands on the SHIPPED `--relocatable` path (the post-passes run on the direct selector's output, which is what gale ships): flight_seam 774→738, flight_seam_flat 910→878; control_step unchanged (no spurious slot reuse); signed_div_const + all RV32 unchanged (ARM-only: verified m4 and m7 identically, RISC-V byte-identical).

RESULTS bit-identical, proven on every frozen anchor: control_step 0x00210A55 (control_step_differential.py 13/13), flat AND inlined flight_algo 0x07FDF307 (flight_seam_differential.py MATCH both). The execution differential now runs BOTH the default and the opt-out against wasmtime (a default flip is only safe if the shipped path AND its rollback both match) and asserts the two emit different bytes (flip engaged). Broad oracle: `cargo test --workspace` green under the new default (the wast/spec suite is compile-only, so gale's G474RE silicon is the broad-execution check). Clean-room verified (6/6 claims, independent harness).

GATING:
- Frozen goldens re-frozen (flight_seam, flight_seam_flat); control_step + RV32 untouched. New `frozen_fixtures_stack_fwd_escape_hatch_restores_old_bytes` asserts `SYNTH_NO_STACK_FWD=1` restores the pre-flip bytes byte-for-byte, serving as both the rollback proof and a tripwire.
- CI oracle updated to test the default + the opt-out.

This is an instruction/memory-op proxy win (flight_algo sp-traffic 20→7, 139→135 insns); the measured CYCLE number is gale's G474RE, confirmed post-ship per the cmp→select silicon-gate-waiver precedent.

const-CSE (SYNTH_CONST_CSE) stays flag-off: its alias-eviction prerequisite is open and it is inert on flat_flight.

VCR-RA / epic #242.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
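To make the reload-forwarding rule in the commit message concrete, here is a minimal Python sketch over a toy instruction list. The tuple encoding, the name `forward_reloads_sketch`, and the conservative handling of control flow are assumptions for illustration; this is not synth's actual `forward_stack_reloads` implementation.

```python
# Toy instruction model (hypothetical, not synth's IR):
#   ("str", reg, slot)   store reg to [sp, #slot]
#   ("ldr", reg, slot)   load reg from [sp, #slot]
#   ("label", name), ("b", target), ("bl", target)   control flow
#   anything else: insn[1] is assumed to be the destination register
CONTROL = {"label", "b", "bl"}


def _kill(holder, reg):
    """Forget every slot whose value was mirrored by `reg`."""
    for slot in [s for s, r in holder.items() if r == reg]:
        del holder[slot]


def forward_reloads_sketch(insns):
    holder = {}  # slot -> register known to still hold that slot's value
    out = []
    for insn in insns:
        op = insn[0]
        if op == "str":
            _, src, slot = insn
            holder[slot] = src
            out.append(insn)
        elif op == "ldr":
            _, dst, slot = insn
            src = holder.get(slot)
            if src is None:
                out.append(insn)           # value not in a register: keep the load
                _kill(holder, dst)
                holder[slot] = dst         # dst now mirrors the slot
            elif src != dst:
                out.append(("mov", dst, src))  # reload becomes a register move
                _kill(holder, dst)
            # src == dst: the register already holds the value, drop the reload
        elif op in CONTROL:
            holder.clear()                 # be conservative across control flow
            out.append(insn)
        else:
            _kill(holder, insn[1])         # destination register is overwritten
            out.append(insn)
    return out
```

In this model a reload is forwarded only while the stored register is untouched; any write to that register makes the next `ldr` stay a real load.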
Codecov Report: ✅ All modified and coverable lines are covered by tests.
avrabe added a commit that referenced this pull request on Jun 26, 2026
#517) Pin sweep (workspace + all intra-workspace path-deps + MODULE.bazel 0.16.0→0.17.0) + CHANGELOG for the SYNTH_STACK_FWD flip to default-on (PR #516, b043101). The shipped --relocatable path now gets stack-reload forwarding + frame-slot dead-store elimination by default: flight_seam 774→738 B, flight_seam_flat 910→878 B; control_step + all RISC-V unchanged. RESULTS bit-identical (control_step 0x00210A55, flat+inlined flight_algo 0x07FDF307). Escape hatch SYNTH_NO_STACK_FWD=1. Measured cycle win confirmed on G474RE post-ship. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
What
Flips the two paired frame-traffic passes from #514/#515 (`forward_stack_reloads` and `eliminate_dead_frame_stores`) from opt-in to DEFAULT-ON. Escape hatch: `SYNTH_NO_STACK_FWD=1`. This is the feature-loop step that turns the flag-off levers into an actual delivered win, following the same gated path as cmp→select (v0.13.0) and local-promotion (v0.14.0).
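A default-on pass with an environment escape hatch can be pictured as the small gate below. This is a sketch only: the helper name `stack_fwd_enabled` is hypothetical, and treating exactly the value `1` as the opt-out is an assumption, since the PR does not say how synth parses the variable.

```python
import os


def stack_fwd_enabled(env=None):
    """Return True unless the opt-out variable is set to "1".

    Assumption: only the exact string "1" disables the passes.
    """
    env = os.environ if env is None else env
    return env.get("SYNTH_NO_STACK_FWD") != "1"
```

Passing the environment as a parameter keeps the gate testable without mutating the process environment.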
The win lands on the SHIPPED `--relocatable` path (the post-passes run on the direct selector's output, which is what gale ships):
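| Fixture (`--relocatable`, cortex-m4) | Before | After |
| --- | --- | --- |
| flight_seam | 774 B | 738 B |
| flight_seam_flat | 910 B | 878 B |
| control_step | unchanged | unchanged |
| signed_div_const, all RV32 | unchanged | unchanged |

Correctness: results bit-identical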
- control_step: `0x00210A55`, control_step_differential.py 13/13
- flat and inlined flight_algo: `0x07FDF307`, flight_seam_differential.py MATCH both

The execution differential now runs both the default and the `SYNTH_NO_STACK_FWD` opt-out against wasmtime (a default flip is only safe if the shipped path and its rollback both match) and asserts the two emit different bytes (flip engaged). Broad oracle: `cargo test --workspace` green under the new default (the wast/spec suite is compile-only, so gale's G474RE silicon is the broad-execution check). Clean-room verified: 6/6 falsifiable claims reproduced by an independent agent with its own harness (incl. confirming the flip is ARM-wide m4+m7 and RISC-V byte-identical).
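The three conditions the differential checks (default matches the reference, opt-out matches the reference, and the two builds differ) can be sketched as follows. The function name and the `compile_fn`/`run_fn` callables are hypothetical stand-ins, not the interface of the real differential scripts.

```python
def check_default_flip(compile_fn, run_fn, reference):
    """Evaluate the three conditions for a safe default flip.

    compile_fn(stack_fwd: bool) -> bytes   (hypothetical compiler hook)
    run_fn(code: bytes) -> int             (hypothetical executor hook)
    reference: result from the reference runtime for the same input
    """
    default_code = compile_fn(stack_fwd=True)
    optout_code = compile_fn(stack_fwd=False)
    return {
        "default_matches": run_fn(default_code) == reference,
        "optout_matches": run_fn(optout_code) == reference,
        # If both builds are byte-identical, the flip never engaged.
        "flip_engaged": default_code != optout_code,
    }
```

The third condition guards against a vacuous pass: two matching results prove nothing if the flag had no effect on the emitted bytes.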
Soundness
Overwrite-only DCE: a `str [sp,#N]` is dead only when a later store to the same immediate slot overwrites it with no intervening read (reaching the function end does not count). Sub-word sp accesses (`ldrb`/`ldrh`/…) act as blockers; this blocker rule covers the advisor-caught #483-class hole, with a test verified failing pre-fix. 8 unit tests pin the boundaries.
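The overwrite-only rule can be illustrated with a short forward scan over a toy instruction list. The tuple encoding and the name `dce_frame_stores_sketch` are assumptions, and the sketch is deliberately conservative: any sub-word sp access or control-flow instruction is treated as a potential read of every slot. It is not synth's `eliminate_dead_frame_stores`.

```python
# Toy instruction model (hypothetical, not synth's IR):
#   ("str", reg, slot) / ("ldr", reg, slot)       word-sized sp-relative access
#   ("ldrb" | "ldrh" | "strb" | "strh", reg, off) sub-word sp-relative access
#   ("label", name), ("b", target), ("bl", target) control flow
SUBWORD = {"ldrb", "ldrh", "strb", "strh"}
CONTROL = {"label", "b", "bl"}


def dce_frame_stores_sketch(insns):
    pending = {}  # slot -> index of the latest store that has not been read yet
    dead = set()  # indices of stores proven dead by a later overwrite
    for i, insn in enumerate(insns):
        op = insn[0]
        if op == "str":
            slot = insn[2]
            if slot in pending:
                dead.add(pending[slot])  # overwritten with no intervening read
            pending[slot] = i
        elif op == "ldr":
            pending.pop(insn[2], None)   # the slot was read: its store is live
        elif op in SUBWORD or op in CONTROL:
            pending.clear()              # blocker: assume every slot may be read
    # Stores still pending at the end are kept: reaching the function end
    # is not treated as proof that a store is dead.
    return [insn for i, insn in enumerate(insns) if i not in dead]
```

Only a store that is provably overwritten is removed, so the scan never needs to reason about what happens after the function returns.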
Gating
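- Frozen goldens re-frozen (flight_seam, flight_seam_flat); control_step + RV32 untouched. New `frozen_fixtures_stack_fwd_escape_hatch_restores_old_bytes` asserts `SYNTH_NO_STACK_FWD=1` restores the pre-flip bytes byte-for-byte, serving as both the rollback proof and a tripwire.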
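A byte-for-byte golden comparison is easiest to debug when it reports where the first difference occurs. The helper below is a sketch under that assumption; the name `first_mismatch` is hypothetical and is not taken from synth's test suite.

```python
def first_mismatch(actual: bytes, golden: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, g) in enumerate(zip(actual, golden)):
        if a != g:
            return offset
    if len(actual) != len(golden):
        # One buffer is a strict prefix of the other.
        return min(len(actual), len(golden))
    return None
```

A rollback test in this style would compile with the opt-out set and assert that `first_mismatch(output, pre_flip_golden)` is `None`.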
Honest framing
This is an instruction/memory-op proxy win (flight_algo sp-traffic 20→7,
139→135 insns). The measured cycle number is gale's G474RE, confirmed
post-ship per the cmp→select silicon-gate-waiver precedent — the merge is not
gated on silicon, the perf claim is. const-CSE stays flag-off (its
alias-eviction prerequisite is open and it is inert on flat_flight).
VCR-RA / epic #242.