Skip to content

Abort the CLI immediately on SIGTERM/SIGINT (CI cancellation)#1132

Merged
trunk-io[bot] merged 1 commit into
mainfrom
dylan/abort-cli-on-termination-signal
Jun 30, 2026
Merged

Abort the CLI immediately on SIGTERM/SIGINT (CI cancellation)#1132
trunk-io[bot] merged 1 commit into
mainfrom
dylan/abort-cli-on-termination-signal

Conversation

@dfrankland

Copy link
Copy Markdown
Member

Problem

GitHub Actions (and most CI) cancel a job by signaling the process tree. The CLI installed no signal handlers and blocked on its work, so a cancellation couldn't fully complete until slow operations (e.g. the API retry loop or telemetry upload) finished on their own — customers reported jobs that wouldn't cancel until the slow upload completed.

Change

  • main now races the command against termination signals via tokio::select!. On SIGTERM/SIGINT (Unix) or Ctrl-C (Windows), the in-flight work future is dropped — cancelling outstanding requests and the retry loop immediately — and the process exits with the conventional 128 + signal code.
  • If handler installation fails, falls back to the OS default disposition, so behavior is no worse than before.
  • Signal handling lives in its own module (cli/src/shutdown.rs) to keep main.rs focused; exit codes are named constants (EXIT_CODE_SIGINT = 130, EXIT_CODE_SIGTERM = 143) with a comment noting these come from the POSIX 128 + signal convention, which the exitcode/sysexits crate doesn't define.
  • Adds the signal feature to the cli crate's tokio dependency (pulls signal-hook-registry into Cargo.lock).

Notes

This is the complementary fix to #1131 (retry deadline): the deadline bounds total retry time but is checked between attempts; this PR provides true immediate mid-request abort when CI actually cancels. The two are independent and can merge in any order.

Testing

🤖 Generated with Claude Code

GitHub Actions (and most CI) cancel a job by signaling the process tree.
The CLI installed no signal handlers and blocked on its work, so a
cancellation couldn't fully complete until slow operations (e.g. the API
retry loop or telemetry upload) finished on their own.

Race the command against termination signals in main via tokio::select!.
On SIGTERM/SIGINT (Unix) or Ctrl-C (Windows) the in-flight work future is
dropped, which cancels outstanding requests and the retry loop
immediately, and the process exits with the conventional 128 + signal
code. If handler installation fails we fall back to the OS default
disposition, so behavior is no worse than before.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@trunk-io

trunk-io Bot commented Jun 29, 2026

Copy link
Copy Markdown

😎 Merged directly without going through the merge queue, as the queue was empty and the PR was up to date with the target branch - details.

@codecov-commenter

codecov-commenter commented Jun 29, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 57.14286% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.68%. Comparing base (83033bf) to head (61acc24).

Files with missing lines Patch % Lines
cli/src/shutdown.rs 47.05% 9 Missing ⚠️
cli/src/main.rs 72.72% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1132      +/-   ##
==========================================
+ Coverage   82.48%   82.68%   +0.20%     
==========================================
  Files          69       70       +1     
  Lines       15482    15505      +23     
==========================================
+ Hits        12771    12821      +50     
+ Misses       2711     2684      -27     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@trunk-staging-io

trunk-staging-io Bot commented Jun 29, 2026

Copy link
Copy Markdown

Static BadgeStatic BadgeStatic BadgeStatic Badge

Failed Test Failure Summary Logs
variant_quarantine_test should be quarantined when run with variant A test expected the sum of 2 + 2 to be 5, but it was 4, indicating a failing assertion. Logs ↗︎

View Full Report ↗︎Docs

@trunk-io

trunk-io Bot commented Jun 30, 2026

Copy link
Copy Markdown

Static BadgeStatic BadgeStatic BadgeStatic Badge

Failed Test Failure Summary Logs
variant_quarantine_test should be quarantined when run with variant A test expected the sum of 2 + 2 to be 5, but it was actually 4, indicating a failing assertion. Logs ↗︎
variant_quarantine_test should be quarantined when run with variant A test expected the sum of 2 + 2 to be 5, but it was actually 4, indicating a failing assertion. Logs ↗︎

View Full Report ↗︎Docs

@trunk-io trunk-io Bot merged commit 0714b9a into main Jun 30, 2026
29 of 30 checks passed

@TylerJang27 TylerJang27 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

trunk-io Bot pushed a commit that referenced this pull request Jul 1, 2026
The `call_api` retry loop spawns two background tasks that emit
"taking longer than expected" progress messages. They were only
aborted by explicit `.abort()` calls placed after the retry
`.await`, so on the happy path they were cleaned up correctly.

But when `call_api`'s future is cancelled mid-flight -- e.g. the
losing branch of the `tokio::select!` in the CLI main loop when a
SIGINT/SIGTERM wins the race (#1132) -- execution never reaches
those `.abort()` calls. Dropping a `JoinHandle` merely *detaches*
the task, so the reporters leaked: they kept ticking every 2s,
kept emitting "taking longer than expected" after the CLI was
interrupted, and kept a clone of the render `Sender` alive, which
could wedge the renderer join during shutdown.

Bind each spawned handle in an `AbortOnDrop` guard so the tasks
are aborted on drop -- covering both normal completion and
cancellation. Add a regression test that polls a `call_api`
future to spawn the reporters, drops it mid-flight, and asserts
the reporters stop emitting.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

5 participants