fix(vanilla-epoll/io_uring): lazy json-comp gz cache — drop ~135 MiB -gc none boot leak#958
Closed
enghitalo wants to merge 1 commit into
Closed
fix(vanilla-epoll/io_uring): lazy json-comp gz cache — drop ~135 MiB -gc none boot leak#958enghitalo wants to merge 1 commit into
enghitalo wants to merge 1 commit into
Conversation
…boot leak The json-comp lock-free precompute (folded into MDA2AV#956/MDA2AV#957) compressed the full (count x m) grid (~800 keys) at boot. On the `-gc none` vanilla-epoll entry, V's gzip.compress leaks ~157 KiB of scratch per call (vlang/v#27606), so the boot precompute leaked ~135 MiB of permanent RSS — CI showed vanilla-epoll baseline memory jump 77 -> 213 MiB across every profile. Restore the original LAZY, process-shared, RwMutex-guarded gz cache: compress only the ~8 (count,m) pairs the profile actually sends (~1.3 MiB) and append the cached bytes on a hit. This is the exact json-comp code CI measured at 77 MiB in the MDA2AV#956/MDA2AV#957 combo commit (pre-cleanup), plus a guard comment so the precompute is not reintroduced. The lock-free change had no measured upside. It was meant to fix the io_uring json-comp@16384 collapse (the lock-contention hypothesis, enghitalo/vanilla#89), but CI showed it did not: epoll is healthy at 16384c WITH the lock, and the io_uring collapse persisted and worsened (1.88M -> 159K rps, CPU 4312% -> 438%). That collapse is worker/thread oversubscription from the co-hosted json-tls listener (io_uring workers + TLS thread pool > cores at 16384c) — to be fixed by a per-server worker cap (enghitalo/vanilla#89 fix #1), not the gz lock. vanilla-io_uring is default-GC so it did not leak, but it is reverted too for parity and to restore its validated pre-cleanup state. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
Author
|
/benchmark -f vanilla-io_uring |
Contributor
|
👋 |
Contributor
Benchmark ResultsFramework:
Full log |
Contributor
Author
|
/benchmark -f vanilla-epoll |
Contributor
|
👋 |
Contributor
Benchmark ResultsFramework:
Full log |
Contributor
Author
|
Superseded — split into one PR per engine as requested:
Same changes, one file each. The benchmark results posted here already validated both engines (epoll 213→77 MiB, no regression; io_uring json-comp@16384 159K→402K). Closing in favor of #959 + #960. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Restore the lazy, shared, lock-guarded json-comp gz cache on
vanilla-epollandvanilla-io_uring, reverting the lock-free boot precompute that shipped in #956/#957.Why
The precompute compressed the full
(count × m)grid (~800 keys) at boot. On the-gc nonevanilla-epollbuild, V'sgzip.compressleaks ~157 KiB of internal scratch per call (upstream bug, vlang/v#27606) — so the boot precompute leaked ~135 MiB of permanent RSS. The last benchmark run showed it directly:(flat ~+135 MiB across every profile — the signature of a one-time boot allocation, reproduced exactly in isolation: 800 ×
gzip.compressunder-gc none= ~135 MiB, reclaimed under default GC.)The lazy cache compresses only the 8
(count,m)pairs the json-comp profile actually sends (/json/{1,5,10,15,25,40,50}withm∈1..8), ≈ 1.3 MiB. This PR restores the exact json-comp code CI measured at 77 MiB in the #956/#957 combo commit, plus a guard comment so the precompute isn't reintroduced.The lock-free change had no upside
It was meant to fix the io_uring
json-comp@16384collapse (a lock-contention hypothesis). The benchmark refuted it:CPU dropping at max conns = rings starving, i.e. worker/thread oversubscription from the co-hosted json-tls listener (io_uring workers + the TLS thread pool exceed cores at 16384c), not lock contention. The real fix is a per-server worker cap (enghitalo/vanilla#89 fix #1) — tracked as a follow-up, not in this PR.
vanilla-io_uringis default-GC so it didn't leak, but it's reverted too for parity and to restore its validated pre-cleanup state.Validation
vfmt-parse clean.vanilla-epollbaseline mem back to ~77 MiB; json-comp behavior unchanged from the combo (healthy epoll@16384). No/benchmarktriggered here — happy to run it on request.🤖 Generated with Claude Code