
feat(dynamic): streamline dynamic workflows + anchor hooks to git root #246

Merged: dean0x merged 1 commit into main from feat/streamline-dynamic-workflows on Jun 21, 2026
@dean0x dean0x commented Jun 21, 2026


Context

Real /devflow:dynamic-* runs in the sibling skim / skim-search projects underperformed: ~50% of workflow runs failed and driver sessions sprawled across 1–5 days. Root-cause analysis (memory dynamic-workflow-skim-postmortem) found the time sinks were structural. This PR fixes the five highest-leverage causes plus one bundled core bug (stray nested .devflow/).

Source of truth is shared/recipes/*.mds (→ plugins/devflow-dynamic/commands/ at build) and shared/agents/*.md. Compiled recipes + distributed agents are gitignored.

Fix 1 — Long-build stall guard (the dominant failure) ⚠️ load-bearing

The Workflow runtime kills any sub-agent silent for 180s; cold cargo build/cargo test blew past it (agent stalled on all 6 attempts).

Spike first (load-bearing assumption). Ran a minimal workflow whose agent runs a >200s background command + a Monitor heartbeat poll. Result: survived: true, elapsedSeconds: 253, read final output, 8 heartbeats. The doctrine (the plan's primary path) is settled — no fallback needed.

  • New _engine.mds build_execution_doctrine() (exported, rendered in dynamic-build): build/test that may run silent >~120s MUST run via Bash(run_in_background) + a Monitor emitting sub-watchdog heartbeats (~25s) with timeout_ms above the job; prefer crate/package-scoped commands, full-workspace regression left to the human.
  • validator.md, tester.md, coder.md gain a mechanical "Long-running commands" subsection (also fixes the plain-Bash 120s default-timeout that bites /implement).
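As a rough plain-shell analogy for the doctrine above (not the actual `Bash(run_in_background)` or `Monitor` tool calls, whose interfaces are not shown in this PR), the pattern is: start the long job in the background, send its output to a log, and print a short heartbeat at an interval well under the watchdog limit until the job exits. The function name `run_with_heartbeat` and its argument order are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: keep a session "noisy" while a long, silent job runs.
# Usage: run_with_heartbeat INTERVAL_SECONDS LOG_FILE COMMAND [ARGS...]
run_with_heartbeat() {
  local interval="$1" log="$2"
  shift 2

  # Start the job in the background; all of its output goes to the log file.
  "$@" >"$log" 2>&1 &
  local pid=$! elapsed=0

  # Poll until the job is gone, emitting one heartbeat line per interval.
  # One last heartbeat may be printed just after the job has finished.
  while kill -0 "$pid" 2>/dev/null; do
    sleep "$interval"
    elapsed=$((elapsed + interval))
    echo "heartbeat: pid=$pid elapsed=${elapsed}s"
  done

  # Propagate the job's exit status to the caller.
  wait "$pid"
}
```

A call such as `run_with_heartbeat 25 build.log cargo test -p some-crate` would match the ~25s cadence and the package-scoped preference described above; `some-crate` is a placeholder.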

Fix 2 — Gate-1 cadence: full V-S-S exactly twice per ticket

Gate 1 (Validator→Simplifier→Scrutinizer) previously fired after every mutation (≈6 passes/ticket). Now it runs twice: post-implementation (#1) and one final post-fix gate (#2). Between review cycles and Gate-2 fixes, the Coder self-verifies its own build; nothing else runs. Verified via compiled-output grep: Simplifier/Scrutinizer appear exactly 2× each.
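A grep-based check of this kind could look roughly like the sketch below. The helper name `count_mentions` is hypothetical, and the exact pattern and file used for the real verification are not stated here, so both are passed in as arguments.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: count every occurrence of a fixed string in a file.
# Usage: count_mentions STRING FILE
count_mentions() {
  # -o prints each match on its own line, -F treats STRING literally;
  # wc -l counts the matches and tr strips padding some wc builds add.
  grep -oF -- "$1" "$2" | wc -l | tr -d ' '
}
```

A cadence assertion would then compare `count_mentions Simplifier <compiled recipe>` and `count_mentions Scrutinizer <compiled recipe>` against the expected count.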

Fix 3 — Generated-script robustness

_preamble.mds authoring_preamble() gains a mandatory pre-flight self-check targeting the two observed crash classes (pipeline() non-array first arg; undefined is not an object (SEAL.num)), plus a cheap node --check syntax gate (noted: catches syntax only, the checklist is the real lever) and a default-dry-run recommendation for large scripts.

Fix 4 — Decisions: batch + auto-resolve transparency

  • dynamic-plan writes Auto-Resolved Decisions as decision → resolution → source (auditable/reversible) and surfaces them at the command boundary.
  • dynamic-build gains a command-boundary section: after the workflow returns, batch the wave report's escalations + any DECISIONS-NEEDED into one AskUserQuestion.
  • Preference-profile-absent note in the report.

Fix 5 — Wave deadlock report clarity

_wave.mds: each blocked/quarantined ticket now states why (the specific failed dependency / unresolved decision, never a bare "blocked") and cites the resume runId/journal. Pure reporting clarity — no control-flow change.

Fix 6 — Stray nested .devflow/ (anchor hooks to git root)

Every shell hook computed "$CWD/.devflow" with no git-root anchoring; a worker spawned with a CWD inside .devflow/docs/.../tickets/ scaffolded a stray nested .devflow/.

  • New scripts/hooks/resolve-project-root defining df_resolve_root (git top-level, with a .devflow/-strip fallback for non-git dirs), mirroring the TS CLI's getGitRoot().
  • Anchored the .devflow/ derivation in 8 hooks + ensure-devflow-init (session-cwd-dependent things like the transcript lookup deliberately stay on $CWD).
  • Anchor-only per the plan — no auto-clean / migration for existing strays.
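To illustrate the anchoring described above, here is a minimal sketch of what a resolver like `df_resolve_root` might do. It is an assumption-laden reconstruction from this description, not the contents of `scripts/hooks/resolve-project-root`: the argument handling and the exact fallback rules are guesses.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a project-root resolver (not the PR's actual helper).
# Usage: df_resolve_root [DIR]   (DIR defaults to the current directory)
df_resolve_root() {
  local cwd="${1:-$PWD}" root

  # Primary path: ask git for the top level of the work tree containing DIR.
  if root="$(git -C "$cwd" rev-parse --show-toplevel 2>/dev/null)" && [ -n "$root" ]; then
    printf '%s\n' "$root"
    return 0
  fi

  # Fallback for non-git dirs: cut the path at the first ".devflow" component
  # so a CWD inside .devflow/ never becomes the base for a nested .devflow/.
  case "$cwd" in
    */.devflow/*) printf '%s\n' "${cwd%%/.devflow/*}" ;;
    */.devflow)   printf '%s\n' "${cwd%/.devflow}" ;;
    *)            printf '%s\n' "$cwd" ;;
  esac
}
```

A hook would then derive its state directory as `"$(df_resolve_root "$CWD")/.devflow"` instead of `"$CWD/.devflow"`, while session-dependent lookups keep using `$CWD` directly.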

Verification

  • Spike ✓ (survived: true, 253s).
  • npm run build clean (CLI + agent distribution + recipes, 0 errors).
  • npm test: 1880 passed (64 files). +10 new tests: df_resolve_root cases (a/b/c/c2), a nested-.devflow/ integration test (Stop hook with CWD inside .devflow/ writes the queue at the repo root, no stray), and compiled-dynamic-build.md assertions (build doctrine present, final Gate 1 present, per-cycle Gate-1 absent, Simplifier/Scrutinizer 2× each).

Deferred (flagged, not blocked)

  • json-helper.cjs git-root alignment (plan's optional secondary): the 4 process.cwd() sites (assign-anchor/retire-anchor/render) would need a new child_process/git dependency in a pure-JSON utility, and those ops only ever run from the Dream agent at the repo root. Left as-is per "flag, don't block."

Notes

  • During the Fix-1 spike, the spike agent flagged a skim rewrite-hook artifact: a cleanup command containing wc returned wc 1\n 0 instead of 0 (a stray wc 1 token prepended). Surfacing per the skim-reporting instruction.

Out of scope

No wave-size cap; no auto-deletion of existing strays; no change to the 180s watchdog itself (Fix 1 works with it); no Gate-2 placement change.

Fixes the five highest-leverage causes of dynamic-* run failures (~50% of
runs failed in skim/skim-search) plus a stray nested .devflow/ bug.

- Fix 1 (stall, load-bearing): new build_execution_doctrine — long
  builds/tests run via background Bash + a Monitor heartbeat poll so they
  survive the workflow's 180s inactivity watchdog (spike-verified at 253s).
  Encoded in the Validator/Tester/Coder agents and rendered in dynamic-build;
  prefer package-scoped commands, leave full-workspace regression to the human.
- Fix 2 (cadence): Gate 1 (Validator->Simplifier->Scrutinizer) now runs
  exactly TWICE per ticket (post-implementation + one final post-fix gate)
  instead of after every mutation. Between review cycles and Gate-2 fixes the
  Coder self-verifies its own build; one final Gate 1 is the build gate.
- Fix 3 (robustness): mandatory pre-flight self-check + `node --check` in the
  authoring preamble, targeting the pipeline()-non-array and undefined-field
  (SEAL.num) crash classes.
- Fix 4 (decisions): auto-resolved decisions written as decision->resolution
  ->source and surfaced for audit; mid-build escalations + decisions batched
  into ONE AskUserQuestion at the command boundary; preference-profile-absent
  note in the report.
- Fix 5 (wave clarity): escalation report names the specific blocking
  dependency/decision per ticket and cites the resume runId/journal.
- Fix 6 (stray nested .devflow/): new resolve-project-root helper anchors the
  .devflow/ path to the git root in 8 shell hooks + ensure-devflow-init,
  preventing a stray nested .devflow/ when a hook runs with a CWD inside
  .devflow/. Mirrors the TS CLI's getGitRoot().

Source-only (compiled recipes + distributed agents are gitignored, regenerated
by `npm run build`). Adds 10 tests (df_resolve_root cases + nested-.devflow
integration + compiled Gate-1-cadence assertions). 1880 tests green.
@dean0x dean0x merged commit 14c8612 into main Jun 21, 2026
2 checks passed
@dean0x dean0x deleted the feat/streamline-dynamic-workflows branch June 21, 2026 18:45