feat(dynamic): streamline dynamic workflows + anchor hooks to git root #246
Merged
Fixes the five highest-leverage causes of dynamic-* run failures (~50% of runs failed in skim/skim-search) plus a stray nested .devflow/ bug.

- Fix 1 (stall, load-bearing): new build_execution_doctrine. Long builds/tests run via background Bash + a Monitor heartbeat poll so they survive the workflow's 180s inactivity watchdog (spike-verified at 253s). Encoded in the Validator/Tester/Coder agents and rendered in dynamic-build; prefer package-scoped commands, leave full-workspace regression to the human.
- Fix 2 (cadence): Gate 1 (Validator->Simplifier->Scrutinizer) now runs exactly TWICE per ticket (post-implementation + one final post-fix gate) instead of after every mutation. Between review cycles and Gate-2 fixes the Coder self-verifies its own build; one final Gate 1 is the build gate.
- Fix 3 (robustness): mandatory pre-flight self-check + `node --check` in the authoring preamble, targeting the pipeline()-non-array and undefined-field (SEAL.num) crash classes.
- Fix 4 (decisions): auto-resolved decisions written as decision -> resolution -> source and surfaced for audit; mid-build escalations + decisions batched into ONE AskUserQuestion at the command boundary; preference-profile-absent note in the report.
- Fix 5 (wave clarity): escalation report names the specific blocking dependency/decision per ticket and cites the resume runId/journal.
- Fix 6 (stray nested .devflow/): new resolve-project-root helper anchors the .devflow/ path to the git root in 8 shell hooks + ensure-devflow-init, preventing a stray nested .devflow/ when a hook runs with a CWD inside .devflow/. Mirrors the TS CLI's getGitRoot().

Source-only (compiled recipes + distributed agents are gitignored, regenerated by `npm run build`). Adds 10 tests (df_resolve_root cases + nested-.devflow integration + compiled Gate-1-cadence assertions). 1880 tests green.
Context

Real `/devflow:dynamic-*` runs in the sibling `skim`/`skim-search` projects underperformed: ~50% of workflow runs failed and driver sessions sprawled across 1–5 days. Root-cause analysis (memory `dynamic-workflow-skim-postmortem`) found the time sinks were structural. This PR fixes the five highest-leverage causes plus one bundled core bug (stray nested `.devflow/`).

Source of truth is `shared/recipes/*.mds` (→ `plugins/devflow-dynamic/commands/` at build) and `shared/agents/*.md`. Compiled recipes + distributed agents are gitignored.

Fix 1: Long-build stall guard (the dominant failure) ⚠️ load-bearing
The Workflow runtime kills any sub-agent that stays silent for 180s; cold `cargo build`/`cargo test` blew past it (agent stalled on all 6 attempts).

- Spike first (load-bearing assumption). Ran a minimal workflow whose agent runs a >200s background command + a `Monitor` heartbeat poll. Result: `survived: true`, `elapsedSeconds: 253`, read final output, 8 heartbeats. The doctrine (the plan's primary path) is settled; no fallback needed.
- `_engine.mds` `build_execution_doctrine()` (exported, rendered in `dynamic-build`): any build/test that may run silent for more than ~120s MUST run via `Bash(run_in_background)` + a `Monitor` emitting sub-watchdog heartbeats (~25s) with `timeout_ms` above the job; prefer crate/package-scoped commands, with full-workspace regression left to the human.
- `validator.md`, `tester.md`, `coder.md` gain a mechanical "Long-running commands" subsection (also fixes the plain-Bash 120s default timeout that bites `/implement`).

Fix 2: Gate-1 cadence, full V-S-S exactly twice per ticket
Gate 1 (Validator→Simplifier→Scrutinizer) previously fired after every mutation (≈6 passes/ticket). Now it runs twice: post-implementation (#1) and one final post-fix gate (#2). Between review cycles and Gate-2 fixes the Coder self-verifies its own build; nothing else runs. Verified via compiled-output grep: Simplifier/Scrutinizer appear exactly 2× each.
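To make the new cadence concrete, here is a minimal TypeScript sketch of the per-ticket ordering described above. It is illustrative only: the real cadence lives in the rendered recipe, not in code like this, and `planTicket`, the `Step` names, and the `coder-self-verify` label are hypothetical names chosen for this example.

```typescript
// Hypothetical model of one ticket's step order under the new cadence.
// None of these names come from the actual recipes.
type Fix = "review-fix" | "gate2-fix";
type Step = "implement" | Fix | "gate1" | "coder-self-verify";

function planTicket(fixes: Fix[]): Step[] {
  // Gate 1 pass #1 runs once, right after implementation.
  const plan: Step[] = ["implement", "gate1"];
  for (const fix of fixes) {
    // After each review-cycle or Gate-2 fix, only the Coder's own
    // build check runs; the full V-S-S gate is NOT repeated here.
    plan.push(fix, "coder-self-verify");
  }
  // Gate 1 pass #2: the single final post-fix gate.
  plan.push("gate1");
  return plan;
}

const plan = planTicket(["review-fix", "review-fix", "gate2-fix"]);
const gate1Count = plan.filter((step) => step === "gate1").length;
console.log(plan.join(" -> "));
console.log(gate1Count); // 2, regardless of how many fixes were applied
```

The point of the sketch is that the Gate 1 count is independent of the number of mutations, which is what the compiled-output grep (Simplifier/Scrutinizer 2× each) checks.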
Fix 3: Generated-script robustness

`_preamble.mds` `authoring_preamble()` gains a mandatory pre-flight self-check targeting the two observed crash classes (`pipeline()` non-array first arg; `undefined is not an object (SEAL.num)`), plus a cheap `node --check` syntax gate (noted: catches syntax only, the checklist is the real lever) and a default-dry-run recommendation for large scripts.

Fix 4: Decisions, batch + auto-resolve transparency
- `dynamic-plan` writes Auto-Resolved Decisions as `decision → resolution → source` (auditable/reversible) and surfaces them at the command boundary.
- `dynamic-build` gains a command-boundary section: after the workflow returns, batch the wave report's escalations + any `DECISIONS-NEEDED` into one `AskUserQuestion`.

Fix 5: Wave deadlock report clarity
`_wave.mds`: each blocked/quarantined ticket now states why (the specific failed dependency / unresolved decision, never a bare "blocked") and cites the resume `runId`/journal. Pure reporting clarity; no control-flow change.

Fix 6: Stray nested `.devflow/` (anchor hooks to git root)

Every shell hook computed `"$CWD/.devflow"` with no git-root anchoring; a worker spawned with a CWD inside `.devflow/docs/.../tickets/` scaffolded a stray nested `.devflow/`.

- New helper `scripts/hooks/resolve-project-root` defining `df_resolve_root` (git top-level, with a `.devflow/`-strip fallback for non-git dirs), mirroring the TS CLI's `getGitRoot()`.
- `.devflow/` derivation anchored to the git root in 8 hooks + `ensure-devflow-init` (session-cwd-dependent things like the transcript lookup deliberately stay on `$CWD`).

Verification
- Fix 1 spike: the agent survived the watchdog (`survived: true`, 253s).
- `npm run build` clean (CLI + agent distribution + recipes, 0 errors).
- `npm test` → 1880 passed (64 files). +10 new tests: `df_resolve_root` cases (a/b/c/c2), a nested-`.devflow/` integration test (Stop hook with CWD inside `.devflow/` writes the queue at the repo root, no stray), and compiled-`dynamic-build.md` assertions (build doctrine present, final Gate 1 present, per-cycle Gate-1 absent, Simplifier/Scrutinizer 2× each).

Deferred (flagged, not blocked)
- `json-helper.cjs` git-root alignment (plan's optional secondary): the 4 `process.cwd()` sites (assign-anchor/retire-anchor/render) would need a new `child_process`/git dependency in a pure-JSON utility, and those ops only ever run from the Dream agent at the repo root. Left as-is per "flag, don't block."

Notes
- `wc` returned `wc 1\n 0` instead of `0` (a stray `wc 1` token prepended). Surfacing per the skim-reporting instruction.

Out of scope
No wave-size cap; no auto-deletion of existing strays; no change to the 180s watchdog itself (Fix 1 works with it); no Gate-2 placement change.
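Returning to Fix 6 for reviewers who want the resolution rule spelled out: the following TypeScript sketch models the behavior described for `df_resolve_root` (git top-level first, then a `.devflow/`-strip fallback). It is not the shipped helper, which is a shell function in `scripts/hooks/resolve-project-root`. The name `resolveProjectRoot`, the injected `gitTopLevel` argument (standing in for a git top-level lookup), and the choice to strip at the first `.devflow` segment are assumptions made for this example.

```typescript
// Hypothetical model of the root-resolution rule; POSIX-style paths only.
// `gitTopLevel` stands in for the result of a git top-level lookup and is
// null (or empty) when `cwd` is not inside a git work tree.
function resolveProjectRoot(cwd: string, gitTopLevel: string | null): string {
  // Preferred path: anchor to the git root whenever one is known.
  if (gitTopLevel !== null && gitTopLevel !== "") {
    return gitTopLevel;
  }
  // Fallback for non-git dirs: cut the path at the first `.devflow`
  // segment so a CWD inside .devflow/ never becomes the project root.
  const segments = cwd.split("/");
  const idx = segments.indexOf(".devflow");
  if (idx === -1) {
    return cwd; // not inside .devflow/, use the directory as-is
  }
  const root = segments.slice(0, idx).join("/");
  return root === "" ? "/" : root;
}

// A worker spawned inside the tickets directory still resolves to the repo root.
console.log(resolveProjectRoot("/repo/.devflow/docs/x/tickets", "/repo"));
// Non-git directory: the .devflow/ suffix is stripped instead.
console.log(resolveProjectRoot("/tmp/proj/.devflow/docs/tickets", null));
```

Deriving the `.devflow/` path from this root rather than from the raw CWD is what prevents the nested scaffold; session-dependent lookups would keep using the raw CWD, as noted under Fix 6.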