Skip to content

Make ELF loading respect program header virtual addresses for non-PIE binaries#1530

Open
cshung wants to merge 8 commits into
hyperlight-dev:mainfrom
cshung:cshung/non-pie-elf-loading
Open

Make ELF loading respect program header virtual addresses for non-PIE binaries#1530
cshung wants to merge 8 commits into
hyperlight-dev:mainfrom
cshung:cshung/non-pie-elf-loading

Conversation

@cshung

@cshung cshung commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Summary

For non-PIE (ET_EXEC) ELF binaries, the guest page table now maps the code region at the ELF's declared virtual address rather than identity-mapping it at the GPA. This allows statically-linked binaries with a fixed load address (e.g., --image-base=0x200000) to execute correctly.

Problem

Previously, Hyperlight assumed code GVA == code GPA (identity mapping). Non-PIE binaries that declare a non-zero base virtual address (via program header p_vaddr) would triple-fault because the guest CPU jumped to the ELF's declared entrypoint VA, which wasn't mapped in the page tables.

Solution

  • Compute code_virt_base from the ELF's lowest LOAD segment p_vaddr
  • For non-PIE (base_va > 0): map code at the declared VA in the guest page tables
  • For PIE (base_va == 0): preserve existing identity mapping behavior (with assertion to guard the invariant)
  • Compute entrypoint as code_virt_base + (entrypoint_va - base_va)

The fix leverages the existing Mapping struct's support for phys_base != virt_base — no changes to the page table code itself.

Testing

  • New non_pie_guest_hello_world integration test exercises full guest lifecycle (init, COW, function call, return value) with a non-PIE simpleguest built at --image-base=0x200000
  • All existing PIE guest tests continue to pass (identity mapping preserved)
  • Tested on Windows/WHP and Linux/KVM

Build infrastructure

  • Added build-rust-guests-non-pie Justfile targets
  • Non-PIE build runs first in guests recipe to avoid clobbering normal guest binaries
  • Added simple_guest_non_pie_as_string() test helper

Contributes to: #1408

Copilot AI review requested due to automatic review settings June 14, 2026 22:16

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adds support for running and testing non-PIE Rust guest binaries, including updating snapshot virtual-address mapping so non-PIE guests execute at their declared ELF virtual addresses.

Changes:

  • Add a helper in hyperlight_testing to locate the non-PIE simpleguest binary.
  • Update snapshot mapping/entrypoint calculation to support non-identity VA mappings for non-PIE code regions.
  • Add a new integration test and build automation (Justfile) to produce and run a non-PIE guest.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
src/hyperlight_testing/src/lib.rs Adds path helper(s) for locating non-PIE Rust guest binaries.
src/hyperlight_host/tests/integration_test.rs Adds an integration test that boots and calls into a non-PIE guest.
src/hyperlight_host/src/sandbox/snapshot/mod.rs Adjusts snapshot mappings and entrypoint VA calculation to support non-PIE guests.
Justfile Adds tasks to build and stage non-PIE Rust guest artifacts.

Comment thread Justfile Outdated
Comment thread src/hyperlight_host/src/sandbox/snapshot/mod.rs
Comment thread src/hyperlight_host/src/sandbox/snapshot/mod.rs Outdated
Comment thread src/hyperlight_testing/src/lib.rs
For non-PIE binaries (ET_EXEC), the page table now maps the code
region at the ELF's declared virtual address rather than identity
mapping at the GPA. The entrypoint is computed correctly as a GVA.

PIE binaries (base_va == 0) continue to use identity mapping,
preserving existing behavior.

Also adds build infrastructure and integration test for non-PIE guests:
- Justfile: build-rust-guests-non-pie target builds simpleguest with
  static relocation model and --image-base=0x200000
- hyperlight_testing: add simple_guest_non_pie_as_string() path helper
- integration_test: non_pie_guest_hello_world exercises full guest
  lifecycle (init, COW, function call) with absolute addresses

Contributes to: hyperlight-dev#1408

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>
@cshung cshung force-pushed the cshung/non-pie-elf-loading branch from 7468e06 to 40c43be Compare June 14, 2026 23:10
@cshung

cshung commented Jun 14, 2026

Copy link
Copy Markdown
Contributor Author

Addressed Copilot review feedback:

  1. Justfile (RUSTFLAGS on Linux) — Fixed. \RUSTFLAGS\ now prefixes \cargo\ directly after \cd &&\ so the env var applies to the build command.

  2. snapshot/mod.rs (virt_base mapping) — No change needed. The code region is a single contiguous block where
    gn.guest_region.start == load_addr\ — there is no offset to lose. The \�ssert_eq!\ is deliberate per codebase convention.

  3. snapshot/mod.rs (entrypoint underflow) — Fixed. Now uses \checked_sub\ with a proper error for malformed ELFs.

  4. lib.rs (doc says 'elf binary') — No change needed. Hyperlight guests are always ELF regardless of host OS.

@ludfjig ludfjig added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label Jun 15, 2026
@cshung cshung force-pushed the cshung/non-pie-elf-loading branch from 40c43be to 3fa110a Compare June 15, 2026 18:49
Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>
@cshung cshung force-pushed the cshung/non-pie-elf-loading branch from 3fa110a to 35fef7b Compare June 15, 2026 19:43

@ludfjig ludfjig left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @syntactically could you have a look too

cshung added 3 commits June 16, 2026 11:02
Use plain cargo build with the existing sysroot instead of
cargo hyperlight for non-PIE guest variants. This avoids triggering
a sysroot rebuild with different RUSTFLAGS, which fails on CI runners
with stale cargo-hyperlight versions.

The non-PIE build now:
- Reuses the sysroot already created by the regular guest build
- Builds into a separate target-dir to avoid clobbering PIE artifacts
- Explicitly passes --cfg=hyperlight, entrypoint, and sysroot flags
- Runs after the regular guest build (ordering dependency)

Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>
The dep_build_guests workflow was only building the regular (PIE) Rust
guests. The non-PIE variant was never built or uploaded as an artifact,
causing the non_pie_guest_hello_world test to fail in the build-test
jobs because the binary was missing.

Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>
Cargo on newer Rust versions does not pass RUSTFLAGS during its
target-spec probe phase. This caused CI (Rust 1.94) to fail with
'could not find specification for target x86_64-hyperlight-none'.

Fix by copying target.json as x86_64-hyperlight-none.json and
setting RUST_TARGET_PATH so cargo can discover the spec without
relying on --sysroot being available during the probe.

Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>

@syntactically syntactically left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! It looks like it is moving in a good direction. I have a couple of minor suggestions/comments, as well as a bit of an alternative design that you could take (but totally up to you on that one!)

Comment thread Justfile Outdated
# to avoid clobbering the normal PIE guest artifacts.
# RUST_TARGET_PATH tells cargo where to find the x86_64-hyperlight-none.json spec.
build-rust-guests-non-pie target=default-target:
{{ if os() == "windows" { "New-Item -ItemType Directory -Path " + root + "/src/tests/rust_guests/target-non-pie -Force | Out-Null; Copy-Item " + root + "/src/tests/rust_guests/target/sysroot/lib/rustlib/x86_64-hyperlight-none/target.json " + root + "/src/tests/rust_guests/target-non-pie/x86_64-hyperlight-none.json -Force; $env:RUST_TARGET_PATH='" + root + "/src/tests/rust_guests/target-non-pie'; $env:RUSTFLAGS='--sysroot " + root + "/src/tests/rust_guests/target/sysroot --cfg=hyperlight --check-cfg=cfg(hyperlight) -C link-args=-eentrypoint -C relocation-model=static -C link-args=--no-pie -C link-args=--image-base=0x200000'; $env:RUSTC_BOOTSTRAP='1';" } else { "mkdir -p " + root + "/src/tests/rust_guests/target-non-pie && cp " + root + "/src/tests/rust_guests/target/sysroot/lib/rustlib/x86_64-hyperlight-none/target.json " + root + "/src/tests/rust_guests/target-non-pie/x86_64-hyperlight-none.json && RUST_TARGET_PATH='" + root + "/src/tests/rust_guests/target-non-pie' RUSTFLAGS='--sysroot " + root + "/src/tests/rust_guests/target/sysroot --cfg=hyperlight --check-cfg=cfg(hyperlight) -C link-args=-eentrypoint -C relocation-model=static -C link-args=--no-pie -C link-args=--image-base=0x200000' RUSTC_BOOTSTRAP=1" } }} cd src/tests/rust_guests/simpleguest && cargo build --target x86_64-hyperlight-none --target-dir ../target-non-pie --profile={{ if target == "debug" { "dev" } else { target } }}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On first glance, linking the standard library built with different codegen opts seemed a little sketchy, but I guess it should probably be OK here, at least as long as we don't have something like a GOT-generating relocation show up for some reason?

Do you need the -C link-args=--no-pie? My read of the docs / quick test on aarch64 made me think that that should be more-or-less implied by the -C relocation-model=static on the rust command that invokes the linker.

More seriously, I would prefer to avoid copying so much of the RUSTFLAGS logic from cargo-hyperlight. Is the sysroot rebuild a serious problem for CI performance? If so, maybe there is a way to address it in cargo-hyperlight by e.g. adding a flag to prevent the sysroot being rebuilt (or stripping some of these things like link-args when building the sysroot).

Also, if you do pursue the approach here, why copy the target json to a new path when the path has to be passed in via an environment variable anyway?

// For the code region, use code_virt_base as the GVA.
// For non-PIE this is the ELF's declared base VA (non-identity mapping).
// For PIE this should equal the GPA (identity mapping).
let virt_base = if rgn.region_type == MemoryRegionType::Code {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we should have some check that the executable mapping here does not conflict with any other mappings.

// For non-PIE binaries (base_va > 0), the code should appear at the
// ELF's declared virtual address. For PIE binaries (base_va == 0),
// we use the physical load address (identity mapping).
let code_virt_base = if base_va > 0 { base_va } else { load_addr };

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this choice be made based on whether the executable has .et_type = ET_DYN vs ET_EXEC, rather than the first phdr VA?

let mapping = Mapping {
phys_base: rgn.guest_region.start as u64,
virt_base: rgn.guest_region.start as u64,
virt_base,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach probably works for now, especially with executables where all the segments are contiguous. One other thing that we could do while we are already touching this code is switch to mapping the ELF a bit more "properly", by copying every section copied by a segment into the snapshot contiguously (similarly to how it is being done now, but without using phdr.p_vaddr, so that there is no risk of large gaps appearing for no reason) and then making one of these vmem::map calls for each segment (~= PT_LOAD program header). In addition to being more efficient if we encounter a file with gaps between segments (at least a few kilobytes of gaps often exist due to page-aligning the phdr VAs), this would allow us to have better default permissions in the guest page tables by using the permission information in phdr.p_flags. Probably the way to do that would be to remove this mapping of this region entirely and change the exe interface in exe.rs/implementation in elf.rs to replace load/load_at with functions that take a mapping operations and do the mapping as they walk over the phdrs. That would be my slight preference/how I would do it, but I will leave it up to you whether or not to pursue it, since it is a nontrivially-larger change.

(Unrelated, not-really-review note: the need to fix up the physical vs virtual discontinuity is continuing a trend of several things that makes me slightly question the actual utility of SandboxMemoryLayout::get_memory_regions. It might be worth thinking about whether it makes sense to get rid of that / some more bits of SandboxMemoryLayout at some point, but that's definitely not a discussion for this PR).

cshung added 3 commits June 18, 2026 08:06
In bash, 'VAR=val cd dir && cargo build' only sets VAR for cd, not
for cargo build after &&. Move the env var prefix to directly precede
the cargo build command so it is properly in scope.

Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>
…in cargo

Revert to using cargo hyperlight build with only the non-PIE link
flags (relocation-model=static, --no-pie, --image-base) appended
via RUSTFLAGS. This avoids duplicating cargo-hyperlight internals
(sysroot, --cfg=hyperlight, target spec, entrypoint).

Uses --target-dir to keep PIE and non-PIE artifacts separate.

This is a CI experiment to verify cargo hyperlight v0.1.11 handles
the RUSTFLAGS sysroot rebuild correctly. Can be reverted if CI fails.

Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>
- Use ELF e_type (ET_DYN vs ET_EXEC) to determine PIE vs non-PIE
  instead of checking if base_va is zero. This is more semantically
  correct as suggested by reviewer.
- Add a mapping conflict check that verifies the non-PIE code virtual
  address range does not overlap with any other memory region (e.g.
  PEB, stack, heap). Returns an error if a conflict is detected.
- Bump image base from 0x200000 to 0x1000000 (16MB) to avoid
  conflicts with the default memory layout regions.

Signed-off-by: cshung <3410332+cshung@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/enhancement For PRs adding features, improving functionality, docs, tests, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants