Numbers as Machines

A header-mostly C++20 library that models numbers as machines that emit digits on demand rather than as fixed-precision floats. A number is a tiny forkable virtual machine: you ask it for the next digit, fork it in O(1) (or O(log n) for the series tier), reproject it into a new base, compare it against another machine — all while staying interval-honest (equality is undecidable, so the library never returns a false definite answer).

The design is organized into two tiers and four implementation phases:

Automaton tier — constant- or bounded-state machines whose fork is a literal struct copy: rationals, quadratic irrationals, p-adic numbers.
Series tier — growing arbitrary-precision accumulators for classical transcendentals (e, ln 2, 1/e) backed by compiled tail-bound oracles.

Concepts

Every number is a state machine exposing a single step:

NumVMStep step(AutomatonVM)   // -> { uint32_t digit, AutomatonVM next }

Two consequences fall out of this shape:

Fork is value-copy. For the automaton tier the entire state is a 40-byte POD (AutomatonVM), so forking a number is a memcpy — O(1), no hidden state, no copy-on-write surprises. For the series tier the accumulators are arbitrary-precision integers, so fork is an explicit deep copy, advertised as O(log n) in the computation depth.
Memory is the complexity metric. A rational stays at constant state. A degree-2 algebraic number's accumulated-root register grows O(log n) in the digit index. A transcendental's accumulators grow with depth. The library instruments bit_width() so tests can assert these shapes.

Comparison and metrics are interval-honest: predicates return tri-state (Less | Greater | Indistinguishable) or std::optional (pending), never a fabricated definite answer for values that cannot be distinguished within the requested precision.

Quick start

#include "nam/number.hpp"
using namespace nam;

int main() {
 // 1/7 = 0.(142857) — emit twelve digits.
 Number r = Number::rational(1, 7, 10);
 auto ds = r.digits(12);                 // {1,4,2,8,5,7,1,4,2,8,5,7}

 // Base is a codec, not part of the value: 1/4 in base 2 is 0.01.
 Number quarter = Number::rational(1, 4, 10);
 quarter.in_base(2).digits(4);           // {0,1,0,0}

 // Fork is O(1) and exact for the automaton tier.
 auto [a, b] = r.fork();
 bool same = (a.digits(20) == b.digits(20));   // true

 // Transcendentals stream from the series tier (1/e = 0.36787944...).
 Number inv_e = Number::one_over_e(10);
 inv_e.digits(6);                        // {3,6,7,8,7,9}

 // Interval-honest comparison: 1/7 < 1/3.
 Number third = Number::rational(1, 3, 10);
 r.compare(third, 5);                    // Trit::Less
}

Build & test

# Dependencies (Debian/Ubuntu). Only build tooling is required for the core;
# the zstd/zlib/edit/curl packages are for the optional LLVM JIT backend.
sudo apt update && sudo apt install -y build-essential ninja-build cmake libzstd-dev zlib1g-dev libedit-dev libcurl4-openssl-dev

rm -rf build && cmake -S . -B build -G Ninja && cmake --build build
ctest --test-dir build --output-on-failure

The core library has zero external dependencies by default: arbitrary precision is provided by a vendored nam::BigInt, bounded integers by nam::BoundedInt, and the JIT by a function-pointer interpreter. Optional upgrades are gated behind CMake flags:

Flag	Effect	Default
`-DNAM_USE_GMP=ON`	Use GMP `mpz_t` for the series tier accumulators	off
`-DNAM_BUILD_PYTHON=ON`	Build the pybind11 user-layer bindings	off
`-DNAM_BUILD_WASM=ON`	Build the Emscripten/embind WebAssembly bindings	off
`-DNAM_SANITIZE=ON`	ASan + UBSan on the test target	on

The four phases

Phase	Theme	What it adds
1	Automaton tier	Frozen ABI, rationals, quadratic irrationals, base codecs, p-adics, comparison/metric, skip-ahead
2	Series tier	Vendored `BigInt`, transcendentals with tail-bound oracles, interval refinement, bounded LRU memo
3	User-facing API	`Number` type: operator surface, precision contexts, explicit memoization, honest comparisons
4	JIT / expression tree	`compile(expr_tree) -> NumVMFn` for runtime-built trees, sharing the static C ABI exactly

Phase 1 — automaton tier

A rational p/q in base b is a constant-state machine: the only mutable state is the running remainder, advanced by long-division-by-multiplication. Quadratic irrationals (sqrt(D)) use the classic digit-by-digit long-hand square-root recurrence — genuinely two registers (remainder + root prefix), with the root prefix's bit-width growing O(log n). p-adic rationals commit digits locally from the least-significant end, giving ultimately periodic expansions and therefore free skip-ahead.

Phase 2 — series tier

Transcendentals are convergent series carrying an attached tail-bound oracle (a compiled convergence proof). The value is tracked as an exact rational num/den plus an error numerator; a digit is committed only when the interval [num/den, (num+err)/den] is narrow enough that its leading digit is unambiguous. Refinement pulls more terms until the digit commits or a budget is exhausted (honest pending).

Phase 3 — user-facing API

The Number type is a thin tagged union over the two tiers. It hides the raw ABI behind an mpmath/Decimal-flavoured surface: digits(n), in_base(b), fork(), skip(n), compare(...), to_string(...), scoped precision_context(digits=N), and explicit .streaming() / .cached(N) memoization modes.

Phase 4 — JIT / expression tree

See JIT compilation path.

Python bindings (Phase 3 user layer)

The ergonomic user layer is exposed to Python via pybind11. It is off by default so the core still builds with no external dependencies; enable it with -DNAM_BUILD_PYTHON=ON (pybind11 is located via find_package, falling back to FetchContent):

cmake -S . -B build -G Ninja -DNAM_BUILD_PYTHON=ON
cmake --build build
PYTHONPATH=build/bindings python3 bindings/example.py
PYTHONPATH=build/bindings python3 bindings/test_nam_py.py

The honesty commitments cross the language boundary unchanged: comparison maps to Python tri-state (True | False | None), fork cost is annotated by tier (fork_cost()), memoization is an explicit mode (.streaming() / .cached(N)), and precision_context(digits=N) is a scoped context manager.

WebAssembly bindings (Phase 3 user layer, JS/WASM)

The same ergonomic user layer compiles to WebAssembly via Emscripten + embind. Because the core is header-only with zero dependencies, the WASM target is a thin binding shim — no GMP/LLVM required. It is off by default; enable it with -DNAM_BUILD_WASM=ON under the Emscripten toolchain:

# Requires the Emscripten SDK (emsdk) activated in the current shell.
emcmake cmake -S . -B build-wasm -G Ninja -DNAM_BUILD_WASM=ON
cmake --build build-wasm
node build-wasm/bindings/wasm/example.js
node build-wasm/bindings/wasm/test_nam_wasm.js

The bindings/wasm/nam.js wrapper turns the raw embind surface into an idiomatic JS API:

const createNam = require('./nam_wasm.js');
const nam = await require('./nam.js')(createNam);
nam.rational(1, 7, 10).digits(12);          // [1,4,2,8,5,7,1,4,2,8,5,7]
nam.rational(1, 4, 10).in_base(2).digits(4); // [0,1,0,0]
nam.rational(1, 7, 10).definitely_less_than(nam.rational(1, 3, 10), 5); // true
nam.precision_context(50, () => nam.e(10).digits());  // scoped, RAII

The honesty commitments cross the WASM boundary unchanged: comparison maps to JS tri-state (`-1 | 1 | null` for `compare`, `true | false | null` for `definitely_less_than`), fork cost is annotated by tier (`fork_cost()`), memoization is an explicit mode (`.streaming()` / `.cached(N)`), and `precision_context(digits, fn)` runs `fn` inside a scoped RAII precision context (restored even if `fn` throws).

Debugging a hang

ctest buffers a test's stdout and only prints it after the test finishes. If a test hangs (e.g. an infinite loop in digit generation), you see only Start 1: nam_tests and nothing else. To diagnose, bypass ctest and run the binary directly so the per-test [ RUN ] / [ OK ] lines stream live — the last [ RUN ] printed is the test that is hanging:

./build/tests/nam_tests

Useful variants:

# Stream ctest output live instead of buffering (ctest >= 3.17), and add a
# hard timeout so a hang fails fast instead of blocking CI forever.
ctest --test-dir build --output-on-failure \
   --output-junit results.xml \
   --test-output-size-passed 0 \
   --timeout 30 -V

# Force-flush per line when piping through a file or another process:
stdbuf -oL -eL ./build/tests/nam_tests

Pinpoint the hanging line with a debugger

Run under gdb, let it hang, then interrupt with Ctrl-C and inspect the stack — the top frames reveal exactly which generator/extractor is looping:

gdb ./build/tests/nam_tests
(gdb) run
# ... wait for the hang, then press Ctrl-C ...
(gdb) bt          # backtrace of the stuck thread
(gdb) frame N     # hop to the NAM frame of interest
(gdb) info locals # inspect loop counters / interval bounds

Or attach to the already-running process:

./build/tests/nam_tests &
gdb -p $(pgrep -n nam_tests)
(gdb) bt

Sanitizers and timeouts

Tests are built with ASan + UBSan by default (-DNAM_SANITIZE=ON), which catches the non-hanging failure modes (UB, OOB, leaks). A hang is not a sanitizer event, so combine sanitizers with an external watchdog:

timeout 30 ./build/tests/nam_tests || echo "hung or failed (exit $?)"

timeout exits 124 on a hang, giving CI a clean signal instead of an indefinite block.

Building a debug binary for clearer backtraces

cmake -S . -B build-debug -G Ninja -DCMAKE_BUILD_TYPE=Debug
cmake --build build-debug
gdb ./build-debug/tests/nam_tests

Header / milestone map

Milestone	Header	Status
M1 ABI	`nam/abi.h`, `nam/generator.hpp`	frozen, `static_assert`ed
M2 Rationals	`nam/rational.hpp`	constant state, period detection
M3 Quadratics	`nam/algebraic.hpp`	degree-2 sqrt recurrence, bit-width instrumented
M4 Codec	`nam/codec.hpp`	base as projection, round-trips
M5 p-adics	`nam/padic.hpp`	local commitment, valuation extractor
M6 Compare/Metric	`nam/compare.hpp`, `nam/metric.hpp`	interval-honest, product automaton
M7 Skip	`nam/skip.hpp`	periodic skip + modexp kernel
P2 BigInt	`nam/big_int.hpp`	vendored arbitrary-precision signed integer
P2 Series	`nam/series.hpp`, `nam/constants.hpp`	tail-bound oracles, explicit deep-copy fork
P2 Refine	`nam/refine.hpp`	interval-honest online digit extraction
P2 Memo	`nam/memo.hpp`	explicit bounded LRU, no hidden global state
P3 User API	`nam/number.hpp`	`Number` type, precision contexts, fork cost
P3 WASM	`bindings/wasm/*`	Emscripten/embind JS bindings, same user surface

The frozen ABI contract

The automaton tier speaks a versioned, C-compatible ABI (nam/abi.h):

typedef struct {
    uint32_t base;       /* codec selector — NOT baked into the number */
    uint32_t phase;      /* periodic-orbit phase / digit index          */
    uint64_t state[4];   /* over-provisioned for degree-4 algebraic     */
} AutomatonVM;           /* sizeof == 40                                */

typedef struct { uint32_t digit; AutomatonVM next; } NumVMStep;
typedef NumVMStep (*NumVMFn)(AutomatonVM);

sizeof(AutomatonVM) == 40 (4 + 4 + 4*8) — checked by static_assert.
Trivially copyable + standard layout: fork is a literal struct copy, O(1), no hidden state.
NAM_ABI_VERSION bumps on any layout change. Fields are never reordered.

Honesty commitments

These hold across every tier and language binding.

Core (Phase 1)

No assert(x == y) on values anywhere — equality is undecidable. Tests compare bounded digit prefixes; fork determinism is checked exact.
Comparison predicates return tri-state (Less | Greater | Indistinguishable) or std::optional<bool> (pending), never a false definite answer.
No global mutable state below the user layer.
Memory is the complexity metric: BoundedInt::bit_width() growth is asserted O(log n) for the n-th digit of a quadratic irrational.

Series tier (Phase 2)

Fork is an explicit deep copy of the BigInt accumulators — SeriesVM::fork() is O(size) in computation depth, never copy-on-write. series_fork_is_deep_copy proves the fork is unaffected by later mutation.
Every constant ships a compiled convergence proof: the tail-bound oracle is part of the SeriesSpec, so digits are only committed when the interval honestly distinguishes them.
Digit extraction is interval-honest: next_digit returns std::optional — a boundary value that cannot be distinguished yields a pending (nullopt), never a false definite digit.
Memoization is explicit and bounded: LruDigitCache / CachedDigitSource are opt-in wrappers; there is no hidden global cache.

User tier (Phase 3)

Precision contexts are scoped, not global: PrecisionContext is a thread-local RAII guard. Nesting restores the prior value on scope exit; there is no global mutable precision leaking across threads or scopes.
Fork cost is annotated by tier: Number::fork_cost() reports O(1) for the automaton tier and O(log n) for the series tier, and the fork itself dispatches to the matching copy (struct copy vs deep copy).
Comparison stays interval-honest at the user layer: definitely_less_than returns std::optional<bool> (pending) and compare returns a Trit; agrees_with is exact for a finite prefix. No false definite equality.
Memoization is an explicit mode: .streaming() (O(1) space) and .cached(max_digits=N) (bounded LRU) are user choices; forks receive independent caches so value semantics are never violated.
Base is a codec: in_base(b) reprojects without mutating the original Number (base() of the source is unchanged).

JIT tier (Phase 4)

The compiled function is a real, C-callable NumVMFn with the exact static ABI; the interpreter's dispatch table lives honestly behind the function pointer rather than corrupting any ABI field.
Both the interpreter and LLVM backends reproduce the static digit stream exactly — verified digit-for-digit in the Phase 4 tests.

Dependency: Emscripten

# Clone the emsdk
cd ~
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk

# Download and install the latest SDK
./emsdk install latest
./emsdk activate latest

# Set up environment variables in your current shell
source ./emsdk_env.sh

Then configure your CMake build pointing to the toolchain file:

clear
source ~/emsdk/emsdk_env.sh
rm -rf build build-wasm
emcmake cmake -S . -B build-wasm -G Ninja -DNAM_BUILD_WASM=ON
cmake --build build-wasm

Even easier — use the emcmake wrapper which auto-sets the toolchain:

emcmake cmake .. -DNAM_BUILD_WASM=ON
emmake make

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
.github/workflows		.github/workflows
bindings		bindings
build-wasm		build-wasm
docs		docs
include/nam		include/nam
tests		tests
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
LICENSE		LICENSE
PITCH.md		PITCH.md
README.md		README.md
THEORY.md		THEORY.md
build.sh		build.sh
dir.txt		dir.txt
pyproject.toml		pyproject.toml
technology_stack.md		technology_stack.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Numbers as Machines

Table of contents

Concepts

Quick start

Build & test

The four phases

Phase 1 — automaton tier

Phase 2 — series tier

Phase 3 — user-facing API

Phase 4 — JIT / expression tree

Python bindings (Phase 3 user layer)

WebAssembly bindings (Phase 3 user layer, JS/WASM)

Debugging a hang

Pinpoint the hanging line with a debugger

Sanitizers and timeouts

Building a debug binary for clearer backtraces

Header / milestone map

The frozen ABI contract

Honesty commitments

Dependency: Emscripten

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Numbers as Machines

Table of contents

Concepts

Quick start

Build & test

The four phases

Phase 1 — automaton tier

Phase 2 — series tier

Phase 3 — user-facing API

Phase 4 — JIT / expression tree

Python bindings (Phase 3 user layer)

WebAssembly bindings (Phase 3 user layer, JS/WASM)

Debugging a hang

Pinpoint the hanging line with a debugger

Sanitizers and timeouts

Building a debug binary for clearer backtraces

Header / milestone map

The frozen ABI contract

Honesty commitments

Dependency: Emscripten

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages