Skip to content

Recursively visit payloads inside ScheduleNexusOperationCommandAttributes when using system nexus#290

Merged
dplyukhin merged 10 commits into
mainfrom
go-deferred-encoding
Jun 30, 2026
Merged

Recursively visit payloads inside ScheduleNexusOperationCommandAttributes when using system nexus#290
dplyukhin merged 10 commits into
mainfrom
go-deferred-encoding

Conversation

@dplyukhin

@dplyukhin dplyukhin commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

What

System Nexus messages shouldn't go through the codec because they're meant to be consumed by the Temporal service itself. But payloads inside the message should go through the codec.

How

Modify the payload visitor to treat ScheduleNexusOperationCommandAttributes commands in a special way if the endpoint is __temporal_system. Specifically, we deserialize them using proto type information in the metadata, and recursively visiting the payloads inside. This is similar to what we already do for anypb.Any messages with the defaultWellKnownAnyVisitor.

⚠️ This PR only works on the inputs for system nexus operations. The output of a system nexus operation doesn't have proto type metadata attached, so we'll have to do something else there.

Tests

  • make proxy && (cd cmd/proxygenerator && go run ./ -verifyOnly) (generated code is current)
  • go test -tags protolegacy ./...
  • Add tests to proxy/system_nexus_test.go that exercise the happy and sad paths for visiting system nexus payloads

@CLAassistant

CLAassistant commented Jun 23, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

Comment thread Makefile Outdated
Comment thread proxy/system_nexus.go Outdated
Comment thread proxy/system_nexus.go Outdated
@github-actions

Copy link
Copy Markdown

Opengrep — new findings

Severity Location Rule Message
ERROR .github/workflows/check-sdk-compat.yml:31 security.gha.missing-explicit-permissions-temporal No explicit GITHUB_TOKEN permissions found at the workflow or job level. Add a permissions: block at the workflow root (applies to all jobs) or per job with least privilege (e.g., contents: read and only specific writes like pull-requests: write if needed).
ERROR .github/workflows/ci.yml:10 security.gha.missing-explicit-permissions-temporal No explicit GITHUB_TOKEN permissions found at the workflow or job level. Add a permissions: block at the workflow root (applies to all jobs) or per job with least privilege (e.g., contents: read and only specific writes like pull-requests: write if needed).
ERROR .github/workflows/create-release.yml:30 security.gha.missing-explicit-permissions-temporal No explicit GITHUB_TOKEN permissions found at the workflow or job level. Add a permissions: block at the workflow root (applies to all jobs) or per job with least privilege (e.g., contents: read and only specific writes like pull-requests: write if needed).
ERROR .github/workflows/delete-release.yml:22 security.gha.missing-explicit-permissions-temporal No explicit GITHUB_TOKEN permissions found at the workflow or job level. Add a permissions: block at the workflow root (applies to all jobs) or per job with least privilege (e.g., contents: read and only specific writes like pull-requests: write if needed).
ERROR .github/workflows/publish-release.yml:22 security.gha.missing-explicit-permissions-temporal No explicit GITHUB_TOKEN permissions found at the workflow or job level. Add a permissions: block at the workflow root (applies to all jobs) or per job with least privilege (e.g., contents: read and only specific writes like pull-requests: write if needed).
ERROR .github/workflows/update-proto.yml:22 security.gha.missing-explicit-permissions-temporal No explicit GITHUB_TOKEN permissions found at the workflow or job level. Add a permissions: block at the workflow root (applies to all jobs) or per job with least privilege (e.g., contents: read and only specific writes like pull-requests: write if needed).

Suppress findings

Add a noopengrep comment on the line before the finding:

# noopengrep: <rule-id>

When the payload visitor encounters a ScheduleNexusOperationCommandAttributes
targeting the system Nexus endpoint (__temporal_system) and the WorkflowService,
its Input is a proto-binary-serialized request whose own fields carry the user
payloads rather than an opaque single payload. Decode the envelope, visit the
inner payloads recursively (so external storage and codecs apply to the inner
payloads, not the envelope), and re-serialize -- leaving the envelope itself
un-offloaded and un-codec'd. Ordinary Nexus operations continue to visit Input
as a single opaque payload.

The (service, operation) -> input proto type mapping comes from a new,
dependency-free `nexussystem` registry package generated by nex-gen from a
checked-in WIT (nexussystem/wit), wired into the build via the
`nexus-system-registry` Make target. The envelope decode/visit/encode logic
lives in a hand-written proxy/system_nexus.go helper that the generated
interceptor calls; only binary/protobuf envelopes are accepted.
Simplify system Nexus envelope handling per review:

- Identify a system Nexus envelope solely by the reserved __temporal_system
  endpoint (all operations on that endpoint are system Nexus operations), instead
  of also matching the service name.
- Decode the envelope's proto message using the payload's "messageType"
  metadata via protoregistry.GlobalTypes, removing the need for the generated
  (service, operation) -> input-type registry. The nexussystem package, its
  checked-in WIT, and the nexus-system-registry Make target are deleted.
- Inline the system-vs-ordinary Input dispatch directly into the generated
  visitPayloads (the ScheduleNexusOperationCommandAttributes case) rather than a
  separate helper function; visitSystemNexusEnvelope remains as the envelope
  decode/visit/encode helper.

Envelopes still must be binary/protobuf; missing/unknown message types error.
@dplyukhin dplyukhin force-pushed the go-deferred-encoding branch from 392a187 to 105dddd Compare June 25, 2026 19:47
@dplyukhin dplyukhin marked this pull request as ready for review June 29, 2026 17:56
@dplyukhin dplyukhin requested review from a team as code owners June 29, 2026 17:56

@tconley1428 tconley1428 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mention that it only handles system nexus inputs currently. We can't fix that quite yet but it would be good to remember.

@dplyukhin dplyukhin changed the title Visit payloads inside system Nexus envelopes Recursively visit payloads inside ScheduleNexusOperationCommandAttributes when using system nexus Jun 29, 2026
Comment thread proxy/system_nexus.go Outdated
localConcState = &payloadConcurrencyState{sem: concState.sem}
}

if err := visitPayloads(ctx, options, msg, localConcState, msg); err != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this nested concurrent path should follow the top-level VisitPayloads wait-before-return shape. At the top level we do:

proxy/interceptor.go:110

err := visitPayloads(&visitCtx, &options, nil, c, msg)
c.wg.Wait()
if err != nil {
    return err
}

but here the function returns before waiting if recursive traversal returns an error.

Since localConcState can already have spawned visitor goroutines using the shared semaphore, an early traversal error or cancellation can let those goroutines continue after VisitPayloads returns.

There is a similar nested waitgroup pattern in defaultWellKnownAnyVisitor around proxy/interceptor.go:281, but it has the same early-return shape. I would use the top-level pattern here instead: store the traversal error, always wait on localConcState.wg, then return the traversal error or localConcState.firstErr.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Fixed.

@dplyukhin dplyukhin merged commit 778d41d into main Jun 30, 2026
4 checks passed
@dplyukhin dplyukhin deleted the go-deferred-encoding branch June 30, 2026 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants