feat(merge): make merge asynchronous via runway#247
Merged
mnoah1 approved these changes Jun 18, 2026
behinddwalls added a commit that referenced this pull request Jun 18, 2026
## Summary
### Why?
The message queue topic-binding proto option was named `topics`, which reads as a concrete wire topic name. It is not: it carries a stable **logical topic key**. Each implementer maps the key to whatever topic name its broker/queue requires (subject to that backend's naming constraints); on our Go side the keys are `consumer.TopicKey` values, mapped to concrete names through `TopicRegistry`. The name `topics` invited the wrong mental model.
### What?
Rename the `google.protobuf.MessageOptions` extension `topics` → `topic_keys` (field number `50001` unchanged, so the wire/extension layout is identical) and reframe its doc comment to say it carries a logical key, not a wire name. Regenerated `messagequeue.pb.go` (`E_Topics` → `E_TopicKeys`).

This is the base of a stack; the runway contract and the contract RFC that consume the option are updated in the branches stacked on top.
## Test Plan
- ✅ `make proto`: descriptor field renamed (`name=topics` → `name=topic_keys`), field number `50001` unchanged
- ✅ `./tool/bazel build //...`
## Issues
## Stack
1. @ #264
1. #259
1. #260
1. #245
1. #247
behinddwalls added a commit that referenced this pull request Jun 18, 2026
## Summary
### Why?
Queue payloads are Go structs serialized with `encoding/json`, so the
wire shape is defined only by Go source. There is no language-neutral
contract a non-Go client can compile against, no explicit
topic-to-payload binding, and no distinction between a domain's private
wiring and a published cross-domain contract.
### What?
Doc-only — the design of record for message queue contracts
(`doc/rfc/messagequeue-contract.md`):
- **Contract language: Protobuf**, serialized as **protobuf JSON**
(`protojson`) so payloads stay self-describing JSON on the wire. The
`.proto` is the authority and the Go binding is generated from it, so it
cannot drift (no hand-authored struct, no drift test to keep them in
sync).
- **Topic binding:** a custom `topics` proto option (defined in
`api/base/messagequeue`) carries the wire topic names on the message
itself, read back by reflection — not on the publish/consume hot path,
which still resolves topics from a `consumer.TopicKey` via the registry.
- **Location by audience:** external contracts (something outside the
owning domain depends on them) live in `api/{domain}/messagequeue/`;
internal ones in `{domain}/core/messagequeue/`, with the split enforced
by Bazel `visibility`.
- Documents the accepted protojson conventions (snake_case field names,
UPPER_SNAKE enums, int64-as-string) and why JSON Schema, binary proto,
and Avro were rejected.
## Test Plan
- ✅ `make lint` (doc-only)
## Issues
## Stack
1. #264
1. @ #259
1. #260
1. #245
1. #247
behinddwalls added a commit that referenced this pull request Jun 18, 2026
## Summary
### Why?
Runway's merge queues are the first cross-domain message queue contract: a client (potentially non-Go) publishes a merge request and consumes the result without access to Runway's Go types or storage. Per the RFC (#259) this needs a language-neutral, proto-defined contract.
### What?
Establishes the message queue contract pattern using Runway's merge queues as the reference:
- Adds `api/runway/messagequeue/proto/merge.proto` defining `MergeRequest`/`MergeResult`, reusing the shared `Change` and `Strategy` types and the `topics` option from `api/base`; generated into `protopb`.
- Payloads are serialized as **protobuf JSON** via thin `protojson` helpers (`MergeRequestToBytes`/`MergeRequestFromBytes` and the `MergeResult` counterparts); a `Topics()` reflection helper exposes the topic binding. Topic keys are co-located with the contract.
- Moves the merge bindings out of `runway/entity` into `api/runway/messagequeue`.
- A drift test round-trips the payloads and asserts every Runway topic key is bound to exactly one message via the `topics` option.
## Test Plan
- ✅ `./tool/bazel test //api/runway/messagequeue:messagequeue_test`
- ✅ `./tool/bazel build //...`
## Issues
## Stack
1. #264
1. #259
1. @ #260
1. #245
1. #247
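The "every topic key is bound to exactly one message" assertion from the drift test can be approximated as below. This is a sketch under assumptions: the real test reads bindings from the proto option by reflection, whereas here they are passed in as a plain map from message name to declared keys, and `CheckBindings` plus the key strings are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// CheckBindings reports every expected topic key that is not bound to exactly
// one message. bindings maps a message name to the keys its option declares.
// Hypothetical helper; the real test derives bindings by proto reflection.
func CheckBindings(expected []string, bindings map[string][]string) []string {
	// Invert the map: for each key, collect the messages that claim it.
	owners := make(map[string][]string)
	for msg, keys := range bindings {
		for _, k := range keys {
			owners[k] = append(owners[k], msg)
		}
	}

	var problems []string
	for _, k := range expected {
		switch n := len(owners[k]); {
		case n == 0:
			problems = append(problems, fmt.Sprintf("%s: not bound to any message", k))
		case n > 1:
			sort.Strings(owners[k]) // stable message order in the report
			problems = append(problems, fmt.Sprintf("%s: bound to %d messages %v", k, n, owners[k]))
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	// Illustrative keys and bindings only.
	expected := []string{"merge-conflict-checker", "merge-conflict-checker-signal"}
	bindings := map[string][]string{
		"MergeRequest": {"merge-conflict-checker"},
		"MergeResult":  {"merge-conflict-checker-signal"},
	}
	if problems := CheckBindings(expected, bindings); len(problems) > 0 {
		for _, p := range problems {
			fmt.Println("binding drift:", p)
		}
		return
	}
	fmt.Println("all topic keys bound exactly once")
}
```

Checking from the key side rather than the message side is what catches both failure modes: a key nobody declares and a key two messages declare.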
behinddwalls added a commit that referenced this pull request Jun 18, 2026
…#245)
## Summary
### Why?
The merge-conflict check ran synchronously inside the `validate` consumer by calling the `mergechecker` extension inline. A real merge attempt is slow and I/O-heavy, so doing it on the partition lease blocks the pipeline and couples SubmitQueue to the checker's latency. This moves the check to an asynchronous round-trip with runway, modelled on `build`/`buildsignal` but across a service boundary.
### What?
The pipeline gains `validate ⇢ (runway) ⇢ mergeconflictsignal → batch`:
- `validate` drops the inline `mergechecker` call and, after its existing dedup + change-metadata + claim work, publishes the full `MergeRequest` to the runway-owned `merge-conflict-checker` queue, keyed by the request id as the client-owned correlation id. The id round-trips, so the result correlates straight back to the request (unlike `build`, whose server-generated id needs a mapping store). The hand-off to runway is retryable.
- `mergeconflictsignal` (new) consumes runway's `MergeResult` off `merge-conflict-checker-signal`, advances the request to `batch` when mergeable, or fails it when conflicted.
- DLQ reconcilers drive the request to `Error` on dead-letter; the signal DLQ reads the request id straight off the result.

The check is triggered directly from `validate`, not a standalone `mergeconflict` trigger stage: that hop only forwarded the request id before publishing to runway, so folding it into `validate` saves a queue round-trip and removes a stage (its topic key, consumer, and DLQ entry) with no change to the cross-boundary contract.

Crossing the runway boundary is why these payloads carry full data rather than entity IDs; the queue-payload-boundary rule is documented in CLAUDE.md, with the pipeline diagram and stage table updated in workflow.md and the superseded `mergechecker` validate-path row noted in extension-contract.md. The `mergechecker` package is left in-tree (unused on the validate path); removing it is a follow-up.

Runway's service implementation is out of scope; only its contract (added in #260) is consumed here.
## Test Plan
- ✅ `bazel build //...`
- ✅ `bazel test //... --test_tag_filters=-integration,-e2e` (56 tests pass)
- ✅ `make gazelle` clean
## Issues
## Stack
1. #264
1. #259
1. #260
1. @ #245
1. #247
Rework the merge stage from a synchronous in-process pusher call into a runway round-trip, mirroring the merge-conflict check. The merge controller now builds a full runway MergeRequest from the batch's member requests (one MergeStep per request, in Contains order) and publishes it to the runway-owned merge queue, keyed by the batch id as the correlation id. A new mergesignal controller consumes the MergeResult off merge-signal, transitions the batch to Succeeded/Failed, and fans out to conclude and speculate; a mergesignal DLQ reconciler fails the batch on an unprocessable result. The in-process pusher extension is retired from the orchestrator wiring (left in-tree but unused, like mergechecker); removal is a follow-up. workflow.md and extension-contract.md updated to reflect both the check and the merge crossing into runway over the shared MergeRequest/MergeResult contract.
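Building the request from a batch, one step per member request in Contains order and keyed by the batch id, could be sketched like this. The struct shapes below are hypothetical simplifications of the generated contract types (the real `MergeRequest`, `MergeStep`, `Change`, and `Strategy` come from the proto), and `BuildMergeRequest` is an illustrative name, not the controller's actual function.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the contract and entity types.
type Change struct{ URI string }

type Strategy string

type MergeStep struct {
	Change   Change
	Strategy Strategy
}

type MergeRequest struct {
	ID    string // client-owned correlation id: the batch id
	Steps []MergeStep
}

type Request struct {
	ID       string
	Change   Change
	Strategy Strategy
}

type Batch struct {
	ID       string
	Contains []string // member request ids, in land order
}

// BuildMergeRequest emits one MergeStep per member request, preserving the
// order of b.Contains, and keys the request by the batch id.
func BuildMergeRequest(b Batch, requests map[string]Request) (MergeRequest, error) {
	steps := make([]MergeStep, 0, len(b.Contains))
	for _, id := range b.Contains {
		r, ok := requests[id]
		if !ok {
			return MergeRequest{}, fmt.Errorf("batch %q references unknown request %q", b.ID, id)
		}
		steps = append(steps, MergeStep{Change: r.Change, Strategy: r.Strategy})
	}
	return MergeRequest{ID: b.ID, Steps: steps}, nil
}

func main() {
	requests := map[string]Request{
		"req-1": {ID: "req-1", Change: Change{URI: "change/1"}, Strategy: "rebase"},
		"req-2": {ID: "req-2", Change: Change{URI: "change/2"}, Strategy: "squash"},
	}
	batch := Batch{ID: "batch-7", Contains: []string{"req-2", "req-1"}}

	mr, err := BuildMergeRequest(batch, requests)
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	fmt.Println(mr.ID, len(mr.Steps), mr.Steps[0].Change.URI)
}
```

Carrying each request's change and strategy in the step, rather than just its id, matches the point made for the check: the payload crosses a service boundary, so the receiver cannot resolve entity ids.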
## Summary
### Why?
The committing merge ran synchronously inside the orchestrator, blocking the partition lease on a slow, I/O-heavy operation and coupling SubmitQueue to the merger's latency. This moves the merge to an asynchronous round-trip with Runway, mirroring the merge-conflict check (#245), but for the committing merge.
### What?
Adds `batch → merge ⇢ (runway) ⇢ mergesignal → conclude/speculate`:
- `merge` (new) consumes a batch ready to land, builds the full `MergeRequest` from the batch's member requests (one `MergeStep` per request in Contains order, carrying each request's change and land strategy), and publishes it to Runway's `merger` queue keyed by the batch id as the client-owned correlation id.
- `mergesignal` (new) consumes Runway's `MergeResult` off `merger-signal`, correlates by the echoed id, and transitions the batch to `Succeeded` (merged) or `Failed` (could not), then fans out to `conclude` (so member requests pick up the outcome) and `speculate` (so dependents can re-plan). Purely result-driven; no poll loop.

Reuses the Runway merge contract from #260: the same `MergeRequest`/`MergeResult`, carried on the committing `merger`/`merger-signal` topics rather than the dry-run check topics. Runway's service implementation is out of scope.
## Test Plan
- `./tool/bazel build //...`
- `./tool/bazel test //... --test_tag_filters=-integration,-e2e` (57 tests pass)
## Issues
## Stack