
Port public server e2e speedups#299

Merged
IlyaasK merged 1 commit into main from hypeship/public-e2e-speedups on Jun 26, 2026


@IlyaasK IlyaasK commented Jun 26, 2026


Summary

Port the public-safe parts of the merged private server e2e speedup/stability work into one public PR.

What changed:

  • split the public server workflow into test-server-unit and test-server-e2e
  • keep Go build caching on the unit job only and install Playwright dependencies explicitly before e2e
  • force Docker-backed e2e container teardown instead of waiting for graceful test-container stop
  • add e2e container lifecycle timing logs for start/stop/readiness profiling
  • replace fixed sleeps in enterprise extension and MV3 tests with bounded condition waits
  • make Playwright daemon recovery wait on DevTools + execution recovery instead of a fixed sleep
  • isolate cdpmonitor subtests so they do not share monitor state
  • parallelize public-safe configure, restart, and recording e2e cases
  • replace external example.com / google.com transfer setup with deterministic browser profile fixture state
  • run recording audio analysis inside the already-running test container instead of extra short-lived Docker runs
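
The "bounded condition wait" idea from the list above can be sketched as follows. This is a minimal illustration, not the PR's code: `waitUntil`, `errNotReady`, and the timings are hypothetical names and values. The helper polls a check function and gives up once the timeout has passed, so the test waits only as long as the condition needs instead of sleeping a fixed amount.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical sentinel returned while the condition is unmet.
var errNotReady = errors.New("not ready")

// waitUntil calls check every interval until it returns nil or the timeout
// has passed. The deadline is tested after each attempt, so the total wait
// is bounded by roughly timeout plus one interval.
func waitUntil(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if !time.Now().Before(deadline) {
			// Wrap the last error so the failure message stays readable.
			return fmt.Errorf("condition not met within %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Stand-in for a real condition such as "extension policy is applied".
	ready := time.Now().Add(50 * time.Millisecond)
	err := waitUntil(2*time.Second, 10*time.Millisecond, func() error {
		if time.Now().Before(ready) {
			return errNotReady
		}
		return nil
	})
	fmt.Println("wait result:", err)
}
```

Returning the last error from the check keeps assertions strict: a timeout reports why the condition was still failing rather than only that time ran out.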

Public/private scope

This intentionally ports only changes whose target files/tests already exist in the public repo.

Skipped private-only work:

  • private persistence e2e consolidation / cleanup speedups touched server/e2e/e2e_persist_login_test.go, which does not exist in public
  • no private-only tests or private image names are added here

Why

The private test-speedup stack is merged and proved useful for separating real image failures from unrelated e2e flakes. Public currently has the same broad server e2e shape and several of the same tests, so this PR brings the public suite up to parity without changing production image/API behavior.

This follows the same testfast/startfast direction as the private work:

  • remove fixed sleeps where the test can wait for a real condition
  • keep assertions strict and failure-readable
  • reduce short-lived container churn in tests
  • separate unit and e2e workflow work without creating special one-test shards
  • keep profiling data visible in normal e2e logs

Timing

Baseline is latest successful public main server workflow before this PR: run 28205858762 on 5af656d.
After is this PR's successful server workflow: run 28245446048 on bda8701.

| metric | before | after | delta |
| --- | --- | --- | --- |
| server workflow wall | 17m33s | 7m29s | -10m04s |
| server test critical job wall | 11m16s single test job | 4m38s test-server-e2e job | -6m38s |
| e2e package runtime | 522.015s | 203.792s | -318.223s |
| unit job availability | blocked behind image builds + e2e in single job | 1m52s standalone test-server-unit | earlier signal |
| enterprise extension e2e | ~90.8s per image | 27.1s headless / 29.5s headful | ~61s faster per image |
| configure powerset e2e | 98.9s | longest case 14.4s, suite 14.4s wall | parallelized cases |
| restart timing e2e | 118.4s | 37.2s headful / 36.2s headless wall | parallelized image cases |
| transfer fixture tests | 16.9s / 17.4s / 19.6s | 10.3s / 10.4s / 10.9s | deterministic fixture |
| recording audio test | 23.1s | 17.5s | no extra analysis containers |
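
The first three deltas in the timing data can be rechecked mechanically. The sketch below is only a convenience for readers, assuming the durations are copied by hand from the figures above; `delta` is a hypothetical helper built on the standard `time.ParseDuration`.

```go
package main

import (
	"fmt"
	"time"
)

// delta parses two duration strings such as "17m33s" and returns after - before.
func delta(before, after string) (time.Duration, error) {
	b, err := time.ParseDuration(before)
	if err != nil {
		return 0, err
	}
	a, err := time.ParseDuration(after)
	if err != nil {
		return 0, err
	}
	return a - b, nil
}

func main() {
	// Values copied from the timing figures above.
	rows := []struct{ name, before, after string }{
		{"server workflow wall", "17m33s", "7m29s"},
		{"server test critical job wall", "11m16s", "4m38s"},
		{"e2e package runtime", "522.015s", "203.792s"},
	}
	for _, r := range rows {
		d, err := delta(r.before, r.after)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", r.name, d)
	}
}
```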

Image build timings are shown in validation but should not be treated as the primary test-speedup signal because Docker cache state varies by run. The most comparable signal is the e2e package runtime and the server test critical path.

Validation

Local:

  • git diff --check origin/main...HEAD
  • yq '.' .github/workflows/server-test.yaml >/dev/null
  • cd server && go test ./e2e -run 'TestExtensionDownloadObservedRequiresUpdateXMLAndCRX|TestExtensionDownloadLogSince' -count=1
  • cd server && go test ./lib/cdpmonitor -run TestBindingAndTimeline -count=1
  • cd server && go test ./e2e -run TestDoesNotExist -count=0
  • cd server && make test-unit
  • cd server/e2e/playwright && corepack pnpm install --frozen-lockfile
  • cd server/e2e/playwright && corepack pnpm exec tsx index.ts (usage/parse check)
  • ~/.agents/skills/autoreview/scripts/autoreview --mode local --prompt-file /tmp/public-e2e-speedups-review-notes.md clean

GitHub Actions on this PR:

  • run 28245446048: passed
    • build-headful / docker: 1m15s
    • build-headless / docker: 2m49s
    • test-server-unit: 1m52s
    • test-server-e2e: 4m38s
    • e2e package runtime: 203.792s
  • launcher test: passed in 21s
  • scan and Socket: passed

Notes:

  • this package does not include typescript, so there is no local tsc --noEmit gate to run

Review notes


Note

Low Risk
Test and CI workflow changes only; assertions stay strict and production runtime code is untouched. Main residual risk is heavier parallel e2e load on CI runners, which this PR already exercised successfully.

Overview
Ports public-safe server e2e speed and stability work: faster teardown, fewer fixed sleeps, a clearer CI split, and more tests that run in parallel where safe, all without changing production images or API handlers.

CI (server-test.yaml) adds workflow concurrency (cancel in-progress) and splits the old combined job into test-server-unit (make test-unit, Go cache on) and test-server-e2e (make test-e2e after image builds, explicit pnpm PATH + Playwright install, Go cache off).

E2e harness forces Docker container stop with StopTimeout(0) and logs [e2e-timing] for start/stop and readiness phases.
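
A lifecycle timing log of this kind can be produced with a small wrapper. The sketch below is an assumption-laden illustration: `timedPhase`, the phase names, and the field layout after the `[e2e-timing]` prefix are hypothetical, and the sketch does not call Docker at all. In a real test the `logf` argument would typically be `t.Logf`.

```go
package main

import (
	"fmt"
	"time"
)

// timedPhase runs fn and reports its duration through logf using the
// "[e2e-timing]" prefix. The key=value layout is an assumption.
func timedPhase(logf func(format string, args ...any), phase string, fn func() error) error {
	start := time.Now()
	err := fn()
	logf("[e2e-timing] phase=%s elapsed=%s err=%v",
		phase, time.Since(start).Round(time.Millisecond), err)
	return err
}

func main() {
	logf := func(format string, args ...any) { fmt.Printf(format+"\n", args...) }

	// Stand-ins for container start and forced stop.
	_ = timedPhase(logf, "container-start", func() error {
		time.Sleep(20 * time.Millisecond)
		return nil
	})
	_ = timedPhase(logf, "container-stop", func() error { return nil })
}
```

Keeping the prefix constant makes the timings easy to filter out of normal e2e logs when profiling a run.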

Flake reduction replaces fixed sleeps with bounded polls: enterprise extension (policy, download logs, chrome://extensions), Playwright daemon recovery after Chromium restart (WaitDevTools + retry), and MV3 service-worker checks in Playwright (waitForFunction).

Other test changes: t.Parallel() on selected configure/restart/recording cases; zip transfer bench uses a deterministic S3 fixture instead of external sites; recording audio analysis runs ffmpeg/ffprobe inside the running container; cdpmonitor binding/timeline subtests each get an isolated monitor via withMonitor.
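
The isolation pattern behind withMonitor can be illustrated without the real cdpmonitor types. Everything below is hypothetical: the `monitor` struct, its fields, and this `withMonitor` signature are stand-ins chosen for the sketch, not the repository's API. The point is only that each caller gets a fresh instance that is cleaned up afterwards.

```go
package main

import "fmt"

// monitor is a hypothetical stand-in for the state a cdpmonitor test touches.
type monitor struct {
	events []string
	closed bool
}

func (m *monitor) record(event string) { m.events = append(m.events, event) }

// withMonitor hands fn a fresh monitor and closes it when fn returns, so
// state recorded by one subtest cannot leak into the next. It returns the
// monitor only so this sketch can inspect it afterwards.
func withMonitor(fn func(m *monitor)) *monitor {
	m := &monitor{}
	defer func() { m.closed = true }()
	fn(m)
	return m
}

func main() {
	binding := withMonitor(func(m *monitor) { m.record("binding") })
	timeline := withMonitor(func(m *monitor) { m.record("timeline") })
	fmt.Println(binding.events, timeline.events)
}
```

In a test file, each `t.Run` body would wrap its assertions in a call like this instead of sharing one monitor created by the parent test.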

Reviewed by Cursor Bugbot for commit bda8701.

@IlyaasK IlyaasK marked this pull request as ready for review June 26, 2026 14:54
@IlyaasK IlyaasK requested a review from sjmiller609 June 26, 2026 14:55

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.


return
case <-time.After(1 * time.Second):
}
}


Install wait skips final poll

Medium Severity

In waitForExtensionInstalled, the loop checks remaining <= 0 before each attempt and then sleeps a full second after failures. When the deadline falls during that sleep, the next iteration exits immediately with the previous lastErr and never re-runs chromeExtensionsCheckOutput, so the test can fail even if the extension became visible during the backoff.
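
One way to avoid the skipped final poll described above is to clamp each sleep to the time remaining, so the loop wakes at the deadline and can attempt once more before giving up. The helper below is a hedged sketch: `nextPollDelay` is a hypothetical name, and the 1s interval mirrors the `time.After(1 * time.Second)` shown in the review context.

```go
package main

import (
	"fmt"
	"time"
)

// nextPollDelay returns how long to sleep before the next attempt.
// It never sleeps past the deadline: with less than one interval left it
// returns the remaining time, and with nothing left it returns 0.
func nextPollDelay(remaining, interval time.Duration) time.Duration {
	if remaining <= 0 {
		return 0
	}
	if remaining < interval {
		return remaining
	}
	return interval
}

func main() {
	interval := time.Second
	for _, remaining := range []time.Duration{
		2500 * time.Millisecond, // plenty of time: sleep a full interval
		400 * time.Millisecond,  // near the deadline: sleep only what is left
		0,                       // at the deadline: do not sleep
	} {
		fmt.Printf("remaining=%s sleep=%s\n", remaining, nextPollDelay(remaining, interval))
	}
}
```

A caller would compute `remaining` with `time.Until(deadline)` after a failed attempt, sleep for the returned delay, and then run the check again, so an extension that appears during the last backoff is still observed.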



Fixed in follow-up PR #300.

#299 was already merged, so the follow-up keeps the Bugbot fix isolated on top of main.

@IlyaasK IlyaasK merged commit 7fc5250 into main Jun 26, 2026
11 checks passed
@IlyaasK IlyaasK deleted the hypeship/public-e2e-speedups branch June 26, 2026 14:58
@IlyaasK IlyaasK restored the hypeship/public-e2e-speedups branch June 26, 2026 14:58
IlyaasK added a commit that referenced this pull request Jun 26, 2026
## Summary
- fix the enterprise extension install wait so it does not sleep across
its deadline before failing
- add focused coverage for the retry-delay calculation
- speed up public e2e startup by restoring the Go build cache for the
e2e job and pre-pulling headful/headless images in parallel before the
Go e2e suite
- keep server e2e files, Go test files, and testdata out of Docker image
build contexts so test-only edits do not invalidate runtime image build
layers

## Timing
Before this PR on the rebased public branch:
- test-server-e2e job: 4m20s
- Run server e2e tests step: 3m50s
- Go e2e package runtime: 186.989s
- first display headless container start: 25.4s
- first display headful container start: 28.0s

After this PR:
- build-headful / docker: 50s
- build-headless / docker: 52s
- test-server-e2e job: 3m28s
- Pull e2e images step: 32s
- Run server e2e tests step: 2m29s
- Go e2e package runtime: 129.429s
- first measured container starts are now about 3-4s instead of paying
the serial image pull during the display tests

The Go e2e package runtime is now under 2.5m. The total e2e job is still
above 2.5m because GitHub Actions setup plus the explicit image pull are
counted around the Go test process.

The Docker context cleanup makes image builds more stable for test-only
PRs by preventing e2e/test-only files from invalidating runtime image
builder layers.

## Validation
- git diff --check
- yq . .github/workflows/server-test.yaml >/dev/null
- cd server && go test ./e2e -run TestNextExtensionInstallPollDelay -count=1
- autoreview clean after Docker context cleanup
- CI green after Docker context cleanup: build-headful, build-headless,
test-server-unit, test-server-e2e, scan, chromium-launcher test, socket
checks

Follow-up for Cursor Bugbot feedback on #299.

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Changes are limited to e2e test timing, GitHub Actions workflow, and
Docker ignore rules with no production runtime or auth logic touched.
> 
> **Overview**
> Fixes a **deadline race** in enterprise extension install polling:
`waitForExtensionInstalled` now uses `nextExtensionInstallPollDelay`
(shorter sleeps when little time remains, skip sleep when past deadline)
instead of a fixed 1s wait, with unit tests for the delay helper.
> 
> **CI / build speed:** the server e2e job re-enables the Go module
cache (`server/go.sum`) and **pre-pulls** headful and headless Chromium
images in parallel before `make test-e2e`.
> 
> **Docker context:** `.dockerignore` excludes `server/e2e`,
`server/**/*_test.go`, and `server/**/testdata` so test-only changes do
not invalidate runtime image build layers.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
46007d2. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->