ci: cache the sccache directory across C++ test builds #765
Thanks for working on this. I was just about to file an issue to track enabling sccache on all platforms after finishing. Building third-party dependencies is quite expensive, so if this works well, it can help reduce GitHub Actions resource consumption. +1 for this direction.
  shell: bash
  run: bash ci/scripts/start_minio.sh
- name: Restore sccache cache
  uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
It looks like the Windows workflow was using mozilla-actions/sccache-action together with SCCACHE_GHA_ENABLED, while this PR changes it to use explicit actions/cache restore/save steps plus mozilla-actions/sccache-action.
I'm not sure which approach is better, but in case you missed it, the mozilla-actions/sccache-action README documents the SCCACHE_GHA_ENABLED setup here:
https://github.com/mozilla-actions/sccache-action#cc-code
Hey @zhjwpku, thank you for the review!
With SCCACHE_GHA_ENABLED, sccache uploads each compiled object as its own cache entry, which adds up to hundreds per build. On the larger Linux/macOS builds this hits GitHub's upload rate-limit throttle and surfaces as cache write errors. The build still passes when we hit the throttle, but the affected objects silently don't get cached. Pointing sccache at a local directory with a single actions/cache save/restore turns it into one archive per leg, so there's nothing to throttle. In my fork that brought write errors down to zero, and it was a bit faster overall since it uploads once instead of hundreds of small objects that each take a network round trip.
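For readers following along, here is a minimal sketch of the directory-based setup described above. It is not the PR's actual diff: the step names, the `.sccache` location, the cache key layout, the `sccache-action` version tag, and the CMake invocation are all assumptions chosen for illustration. Only the `actions/cache/restore` pin is taken from the snippet under review, and the save step reuses the same commit since both sub-actions live in the same repository.

```yaml
# Illustrative job steps (hypothetical names and keys, not the PR's real workflow).
steps:
  - name: Point sccache at a local directory
    shell: bash
    # SCCACHE_DIR selects sccache's local disk cache; SCCACHE_GHA_ENABLED is left unset
    # so objects are not uploaded one by one to the GitHub cache backend.
    run: echo "SCCACHE_DIR=$GITHUB_WORKSPACE/.sccache" >> "$GITHUB_ENV"

  - name: Restore sccache cache
    uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
    with:
      path: ${{ env.SCCACHE_DIR }}
      # Assumed key scheme: exact match on the commit, otherwise the newest archive for this leg.
      key: sccache-${{ runner.os }}-${{ github.job }}-${{ github.sha }}
      restore-keys: |
        sccache-${{ runner.os }}-${{ github.job }}-

  - name: Install sccache
    uses: mozilla-actions/sccache-action@v0.0.9 # version tag is an assumption

  - name: Build
    shell: bash
    run: |
      cmake -S . -B build \
        -DCMAKE_C_COMPILER_LAUNCHER=sccache \
        -DCMAKE_CXX_COMPILER_LAUNCHER=sccache
      cmake --build build
      sccache --show-stats   # prints hit/miss counts for this run

  - name: Save sccache cache
    # Only main writes the cache back; pull requests restore it read-only.
    if: github.ref == 'refs/heads/main'
    uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
    with:
      path: ${{ env.SCCACHE_DIR }}
      key: sccache-${{ runner.os }}-${{ github.job }}-${{ github.sha }}
```

With this shape, each leg uploads one archive at the end of a `main` build instead of hundreds of small entries, which is the property that sidesteps the upload throttle.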
Great, in that case, you may want to make the same change to the Meson Windows build as well. I believe it was missed in this PR, unless I'm overlooking something.
Good call out - I did leave it out of this PR. I was worried there might be too many changes in one PR for review as it already has a sizeable diff on the CI files. If you think it'd be fine here I can add it, or if you prefer I can raise it as a follow-up, wdyt?
What
Turn on compiler caching (sccache) for the Linux and macOS builds in test, aws_test, sanitizer_test, and sql_catalog_test, and switch the Windows test build to the same setup. main builds once and saves the cache; pull requests reuse it without writing back.
Why
Right now only the Windows builds reuse compiled output — every Linux and macOS build recompiles the whole bundled Arrow/Parquet/Avro/Boost stack from scratch, even though it never changes between PRs. Building it once and reusing it removes most of that repeated work. Saving the cache as a single file (instead of one upload per compiled file) also avoids the upload rate limit that causes "cache write error" spam.
Validation
On a warm pull-request run, every build reused the cache: 99.6–99.9% of files came from cache, zero write errors. The heavy builds drop from ~10–27 min to ~1.5–5 min.