[FIX] correct dry-run counts via planned-remap chaining #20
Dry-run under-counted every phase downstream of a fresh create: a phase that would create a parent recorded no target id, so dependent phases saw an empty remap and no-op'd (tool_instance/endpoint/pipeline/api_deployment showed 0 on a fresh target). Dry-run also counted would-creates as `skipped`, so totals didn't match the real run it's meant to predict.

Fix centrally with a "reads run, writes stub" contract:

- RemapTable.record_planned() mints a deterministic synthetic target id so dependent phases resolve the FK and plan-count without writing; is_planned() flags them; snapshot(hide_planned=) masks them in the report.
- Every create-capable phase's dry-run branch now counts in the bucket a real run would (created/adopted) and records a planned remap.
- custom_tool runs its source-side validations (frictionless check, source registry lookup) in dry-run so the plan reflects real create-vs-skip.
- Phases doing live target lookups (tool_instance, workflow_endpoint, files) guard on is_planned to avoid querying a synthetic parent id.
- Report shows a DRY RUN banner; synthetic ids never reach the wire.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ja9H1rnSXmPUgQtHm8TNS
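The planned-remap contract described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `src/unstract/clone/context.py`: the namespace constant, the `kind:src_id` name layout, the `record()` helper, and the internal dict shape are all assumptions. Only the three method names, the uuid5 basis, and the `(planned)` label come from the PR.

```python
import uuid
from collections import defaultdict

# Hypothetical namespace: the PR only states that synthetic ids are uuid5-based.
_PLANNED_NS = uuid.uuid5(uuid.NAMESPACE_URL, "unstract-clone/planned")


class RemapTable:
    """Illustrative source-id -> target-id map with planned (synthetic) entries."""

    def __init__(self) -> None:
        self._maps: dict[str, dict[str, str]] = defaultdict(dict)
        self._planned: set[str] = set()  # holds synthetic ids only

    def record(self, kind: str, src_id: str, target_id: str) -> None:
        # Assumed helper for real target ids (adopt path or real create).
        self._maps[kind][src_id] = target_id

    def record_planned(self, kind: str, src_id: str) -> str:
        # uuid5 is a pure function of (namespace, name), so the same
        # (kind, src_id) pair always yields the same synthetic id.
        synthetic = str(uuid.uuid5(_PLANNED_NS, f"{kind}:{src_id}"))
        self._maps[kind][src_id] = synthetic
        self._planned.add(synthetic)
        return synthetic

    def is_planned(self, target_id: str) -> bool:
        return target_id in self._planned

    def snapshot(self, hide_planned: bool = False) -> dict[str, dict[str, str]]:
        # With hide_planned=True, synthetic ids are masked for the report.
        return {
            kind: {
                src: "(planned)" if hide_planned and tgt in self._planned else tgt
                for src, tgt in mapping.items()
            }
            for kind, mapping in self._maps.items()
        }
```

Keeping synthetic ids in a separate set means a real id can never be mistaken for a planned one, which is what lets phases with live target lookups guard on `is_planned()`.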
| Filename | Overview |
|---|---|
| src/unstract/clone/context.py | Adds record_planned() / is_planned() / snapshot(hide_planned=) to RemapTable; uuid5-based determinism is correct, _planned set properly isolated from real ids. |
| src/unstract/clone/phases/custom_tool.py | Refactors dry-run create/adopt to use record_planned; source-side validations run before the dry-run gate so frictionless-skip is accurately reflected. _record_planned_registry correctly mirrors the real-run registry-republish path as a read-only source lookup. |
| src/unstract/clone/phases/workflow_endpoint.py | Adds a planned-parent fast path to avoid querying a non-existent planned workflow's endpoints, but the fast path omits the connector-remap check that _patch_endpoint performs, causing overcounting for partial runs with excluded/failed connectors. |
| src/unstract/clone/phases/tool_instance.py | Planned-parent guard correctly short-circuits the live target lookup; dry-run adopt path now correctly counts adopted and records real remap. Logic is sound. |
| src/unstract/clone/report.py | Adds dry_run field to CloneReport; DRY RUN banner is emitted in both rich and plain renderers. as_dict() also exports the new field. |
| tests/clone/test_remap_table.py | New tests cover determinism, is_planned false-positive, and snapshot hide_planned masking. Good coverage of the new planned mechanics. |
| tests/clone/test_tool_instance_phase.py | New test_dry_run_planned_workflow_predicts_create covers the chained planned-parent scenario. All updated assertions are consistent with the new contract. |
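The overcounting flagged for `workflow_endpoint.py` could be avoided by having the planned-parent fast path repeat the connector-remap check before counting. The sketch below is hypothetical: `plan_endpoints`, `PhaseResult`, and the `connector_id` key are illustrative names, and treating an unmapped connector as `skipped` is an assumption about what a real run would report.

```python
from dataclasses import dataclass


@dataclass
class PhaseResult:
    created: int = 0
    skipped: int = 0


def plan_endpoints(endpoints, connector_remap, result):
    """Plan-count endpoints under a planned workflow without querying the target.

    endpoints: dicts with a hypothetical "connector_id" key (None if unset).
    connector_remap: source connector id -> target id (real or planned).
    """
    for ep in endpoints:
        src_connector = ep.get("connector_id")
        if src_connector is not None and src_connector not in connector_remap:
            # The connector was excluded or failed upstream, so a real run
            # could not patch this endpoint. Do not count it as created.
            result.skipped += 1
            continue
        result.created += 1
    return result


result = plan_endpoints(
    [{"connector_id": "c1"}, {"connector_id": "c2"}, {"connector_id": None}],
    {"c1": "t1"},  # c2 has no remap entry
    PhaseResult(),
)
print(result)  # PhaseResult(created=2, skipped=1)
```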
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant O as Orchestrator
participant WF as WorkflowPhase
participant TI as ToolInstancePhase
participant WE as WorkflowEndpointPhase
participant RT as RemapTable
Note over O: dry_run=True, fresh target
O->>WF: run(report)
WF->>RT: record_planned("workflow", src_wf_id)
RT-->>WF: synthetic_wf_id (uuid5)
WF-->>O: "result.created += 1"
O->>TI: run(report)
TI->>RT: "snapshot() → {workflow: {src_wf_id: synthetic_wf_id}}"
TI->>RT: is_planned(synthetic_wf_id) → True
Note over TI: skip live target lookup
TI->>RT: record_planned("tool_instance", src_ti_id)
TI-->>O: "result.created += 1"
O->>WE: run(report)
WE->>RT: "snapshot() → {workflow: {src_wf_id: synthetic_wf_id}}"
WE->>RT: is_planned(synthetic_wf_id) → True
Note over WE: skip live endpoint listing
loop each src_endpoint
WE->>RT: record_planned("workflow_endpoint", src_ep_id)
WE-->>O: "result.created += 1"
end
Note over O: remap_snapshot(hide_planned=True) → synthetics shown as "(planned)"
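The chain in the diagram can be reproduced with a small self-contained sketch. Everything here is a stand-in under stated assumptions: `Remap` is a minimal substitute for `RemapTable`, the phase functions and the `list_target_instances` callback are hypothetical, and the adopt-vs-create decision for real parents is omitted.

```python
import uuid

NS = uuid.uuid5(uuid.NAMESPACE_URL, "example/planned")  # hypothetical namespace


class Remap:
    """Minimal stand-in for RemapTable, just enough to show the chain."""

    def __init__(self):
        self.ids = {}         # (kind, src_id) -> target id
        self.planned = set()  # synthetic ids only

    def record_planned(self, kind, src_id):
        tid = str(uuid.uuid5(NS, f"{kind}:{src_id}"))
        self.ids[(kind, src_id)] = tid
        self.planned.add(tid)
        return tid


def workflow_phase(remap, workflows, counts):
    # Dry-run on a fresh target: nothing is written, but the parent is planned.
    for wf in workflows:
        remap.record_planned("workflow", wf["id"])
        counts["workflow"] += 1


def tool_instance_phase(remap, tool_instances, counts, list_target_instances):
    for ti in tool_instances:
        parent = remap.ids.get(("workflow", ti["workflow_id"]))
        if parent is None:
            continue  # parent neither planned nor created: nothing to count
        if parent not in remap.planned:
            list_target_instances(parent)  # only real parents are queried
        remap.record_planned("tool_instance", ti["id"])
        counts["tool_instance"] += 1


def never_called(_parent_id):
    raise AssertionError("live lookup attempted with a synthetic parent id")


counts = {"workflow": 0, "tool_instance": 0}
remap = Remap()
workflow_phase(remap, [{"id": "wf-1"}], counts)
tool_instance_phase(
    remap,
    [{"id": "ti-1", "workflow_id": "wf-1"}, {"id": "ti-2", "workflow_id": "wf-1"}],
    counts,
    never_called,
)
print(counts)  # {'workflow': 1, 'tool_instance': 2}
```

Without the planned remap, the `parent is None` branch would fire for every tool instance, which is the 0-count behaviour this PR fixes.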
Reviews (2): Last reviewed commit: "fix(clone): custom_tool sub-paths own th..."
Address review: _create_fresh recorded the planned remap and _clone_one re-recorded the same value, while the adopt path only recorded in _clone_one. Drop the generic record in _clone_one; each sub-path (adopt / fresh / fresh-dry-run) now records once, since only it knows whether the target id is real or a planned synthetic.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ja9H1rnSXmPUgQtHm8TNS
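The "each sub-path records once" arrangement from this follow-up commit might look like the sketch below. The function names echo the commit message, but the signatures, the `RecordingRemap` helper, the name-based adopt lookup, and the `create` callback are all assumptions made for illustration.

```python
import uuid


class RecordingRemap:
    """Hypothetical remap that logs every record so double-recording is visible."""

    def __init__(self):
        self.calls = []

    def record(self, kind, src_id, target_id):
        self.calls.append((kind, src_id, target_id))

    def record_planned(self, kind, src_id):
        tid = str(uuid.uuid5(uuid.NAMESPACE_URL, f"{kind}:{src_id}"))
        self.calls.append((kind, src_id, tid))
        return tid


def _adopt(remap, src, existing):
    remap.record("custom_tool", src["id"], existing["id"])  # real target id
    return "adopted"


def _create_fresh(remap, src, dry_run, create):
    if dry_run:
        remap.record_planned("custom_tool", src["id"])  # planned synthetic id
        return "created"
    remap.record("custom_tool", src["id"], create(src)["id"])  # real target id
    return "created"


def clone_one(remap, src, target_by_name, dry_run, create):
    # No generic record here: the sub-path that knows whether the id is
    # real or synthetic is the only place that records it.
    existing = target_by_name.get(src["name"])
    if existing is not None:
        return _adopt(remap, src, existing)
    return _create_fresh(remap, src, dry_run, create)
```

With this shape, each of the three paths (adopt, fresh, fresh dry-run) leaves exactly one entry in `RecordingRemap.calls`.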
What
- Fix `unstract clone --dry-run` so phase counts faithfully predict a real run. Adds a planned-remap mechanism so dependent phases plan-count without writing.

Why
- Dependent phases (`tool_instance`, `workflow_endpoint`, `pipeline`, `api_deployment`) reported 0 against a fresh target even though a real run creates them. The counts only looked right when the target already had the parents (the adopt path records a remap).
- Would-creates were counted as `skipped`, so totals never matched the real run a dry-run exists to predict.

How
- `RemapTable.record_planned()` mints a deterministic (uuid5) synthetic target id so dependent phases resolve the FK and plan-count; `is_planned()` flags them; `snapshot(hide_planned=)` masks them in the report.
- Every create-capable phase's dry-run branch now counts in the bucket a real run would (`created`/`adopted`) and records a planned remap (`group`, `adapter`, `connector`, `tag`, `custom_tool`, `files`, `workflow`, `tool_instance`, `workflow_endpoint`, `pipeline`, `api_deployment`).
- `custom_tool` runs its source-side validations (frictionless-adapter check, source-registry lookup → planned `prompt_studio_registry` remap) in dry-run so the plan reflects real create-vs-skip.
- Phases doing live target lookups (`tool_instance`, `workflow_endpoint`, `files`) guard on `is_planned`: a synthetic parent id has no row on target, so they short-circuit instead of querying it (otherwise `workflow_endpoint` would count every endpoint as `failed`).
- Report shows a `DRY RUN` banner; synthetic ids never reach the wire.

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
- No. The changes sit inside the `if dry_run` branch, except two safe reorders that leave the real-run sequence identical: `workflow_endpoint` resolves the connector before the dry-run gate (real run: resolve → patch, same as before), and `custom_tool._create_fresh` moves the dry-run gate after the source-side validations (real run: validate → import, same as before). New report/remap fields default off and only affect dry-run. Full suite (incl. all non-dry happy-path tests) is green.

Database Migrations

Env Config

Relevant Docs

Related Issues or PRs

Dependencies Versions
- No new dependencies (stdlib `uuid` only).

Notes on Testing
- `uv run pytest` → 192 passed.
- New tests cover `RemapTable` planned mechanics (determinism, `is_planned`, `hide_planned` masking) and planned-parent guards for `tool_instance`/`workflow_endpoint` (the end-to-end chain that previously regressed to 0).

Screenshots
Checklist
I have read and understood the Contribution Guidelines.
🤖 Generated with Claude Code
https://claude.ai/code/session_011ja9H1rnSXmPUgQtHm8TNS