LightMem2

A modular framework for long-running agent memory and context management

LightMem2 is a lightweight runtime framework for long-running LLM agents. It reduces context growth and serving cost in real shared-session workloads.

95.7% fewer input tokens | 87.0% lower cost
vs. Vanilla OpenClaw on Claw-Eval continuous mode

67.4% fewer input tokens | 61.5% lower cost
vs. Vanilla OpenClaw on PinchBench continuous mode

📢 News

[2026-06-16]: 🚀 TokenPilot: Cache-Efficient Context Management for LLM Agents is released.

🔧 Installation

Installation Steps

Clone the repository and build the shared packages first:

git clone https://github.com/zjunlp/LightMem2.git
cd LightMem2
corepack enable
pnpm install
pnpm build
pnpm lightmem2:build
pnpm lightmem2:install

OpenClaw

If your OpenClaw home or config path is not under the default ~/.openclaw, you can override it with:

export LIGHTMEM2_OPENCLAW_HOME="/path/to/openclaw-home"
export OPENCLAW_CONFIG_PATH="/path/to/openclaw.json"

Then run the OpenClaw adapter install command:

pnpm component:install:tokenpilot:openclaw

The OpenClaw installer will:

package the current TokenPilot OpenClaw adapter
install it into ~/.openclaw/extensions/tokenpilot
update ~/.openclaw/openclaw.json
enable the TokenPilot plugin entry
switch plugins.slots.contextEngine to layered-context
apply the default normal runtime mode
try to restart the OpenClaw gateway automatically
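After the installer finishes, a quick sanity check on the resulting config can confirm the path override and the context engine slot. The sketch below is illustrative only: it assumes the dotted name plugins.slots.contextEngine corresponds to nested JSON keys, and the helper names resolve_openclaw_config and context_engine_slot are hypothetical, not part of LightMem2 or OpenClaw.

```python
import os
from pathlib import Path


def resolve_openclaw_config(env=None):
    """Return the config path, honoring OPENCLAW_CONFIG_PATH when it is set."""
    env = os.environ if env is None else env
    override = env.get("OPENCLAW_CONFIG_PATH")
    if override:
        return Path(override)
    # Default location named in this README.
    return Path.home() / ".openclaw" / "openclaw.json"


def context_engine_slot(config):
    """Read plugins.slots.contextEngine, or None if any level is missing.

    Assumption: the dotted name maps to nested JSON objects.
    """
    node = config
    for key in ("plugins", "slots", "contextEngine"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# In-memory config shaped like the installer's described result.
sample = {"plugins": {"slots": {"contextEngine": "layered-context"}}}
print(context_engine_slot(sample))  # layered-context
```

To check a real install, load the file at resolve_openclaw_config() with a JSON parser and pass the result to context_engine_slot; if the file is not strict JSON on your host, a different parser would be needed.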

Codex CLI

If your Codex config files are not under the default ~/.codex, you can override them with:

export CODEX_CONFIG_PATH="/path/to/config.toml"
export CODEX_HOOKS_CONFIG_PATH="/path/to/hooks.json"
export TOKENPILOT_CODEX_CONFIG="/path/to/tokenpilot.json"

Then build and install the Codex adapter:

npm --prefix components/tokenpilot/adapters/codex run build
npm --prefix components/tokenpilot/adapters/codex run install:codex

The Codex installer will:

add a TokenPilot model provider entry to Codex config
switch the default model_provider to that local TokenPilot provider
write TokenPilot runtime config to ~/.codex/tokenpilot.json
register Codex hooks in ~/.codex/hooks.json
configure the local proxy base URL served by tokenpilot-codex

The standalone CLI entrypoint is built from:

components/tokenpilot/products/cli/dist/cli.js

The install step creates:

~/.local/bin/lightmem2

⚡ Quick Start

1. Use the Component Namespace

When the current OpenClaw adapter is active, OpenClaw will expose models under:

lightmem2/<model>

For example:

lightmem2/gpt-5.4-mini

For the current LightMem2 runtime path, use a lightmem2/... model instead of your original provider model.
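If you script session setup, the namespace rule can be applied programmatically. This is a minimal sketch: to_lightmem2_model is a hypothetical helper, and it only covers bare model names such as gpt-5.4-mini. How provider-prefixed names are exposed is not described here, so the sketch does not try to handle them.

```python
PREFIX = "lightmem2/"


def to_lightmem2_model(model: str) -> str:
    """Prefix a bare model name with the lightmem2/ namespace (idempotent)."""
    name = model.strip()
    if not name:
        raise ValueError("model name must not be empty")
    # Leave names that are already in the namespace untouched.
    return name if name.startswith(PREFIX) else PREFIX + name


print(to_lightmem2_model("gpt-5.4-mini"))  # lightmem2/gpt-5.4-mini
```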

2. Verify It in a Real Session

OpenClaw

Start or restart OpenClaw.
Open a session with a lightmem2/<model> model.
Run:

/lightmem2 status

You should see a status block similar to:

plugin entry enabled
config enabled
mode normal
context engine slot layered-context
stabilizer enabled
reduction enabled

For a fuller runtime summary, run:

/lightmem2 report
/lightmem2 doctor
/lightmem2 visual
/lightmem2 mode normal

/lightmem2 doctor is the quickest integration self-check for the current OpenClaw adapter surface. /lightmem2 visual opens the local visual inspector for stability, reduction, and eviction snapshots. /lightmem2 mode <conservative|normal|aggressive> switches preset runtime behavior.

If your current host does not expose internal slash commands, use the standalone CLI:

lightmem2 openclaw status
lightmem2 openclaw report
lightmem2 openclaw doctor
lightmem2 openclaw visual
lightmem2 openclaw mode normal
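The slash commands and their standalone CLI equivalents above follow one pattern: drop the leading slash and insert the host name after lightmem2. The sketch below encodes that pattern for wrapper scripts. slash_to_cli is a hypothetical helper, and the pattern is only confirmed by the five command pairs listed in this section; other subcommands may not exist on every host.

```python
import shlex


def slash_to_cli(slash_command, host="openclaw"):
    """Translate '/lightmem2 <args>' into an argv list for the standalone CLI."""
    parts = shlex.split(slash_command)
    if not parts or parts[0] != "/lightmem2":
        raise ValueError("expected a /lightmem2 command")
    # '/lightmem2 mode normal' becomes 'lightmem2 openclaw mode normal'.
    return ["lightmem2", host, *parts[1:]]


print(slash_to_cli("/lightmem2 mode normal"))
# ['lightmem2', 'openclaw', 'mode', 'normal']
```

The returned list can be handed to subprocess.run as-is, which avoids shell quoting issues.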

Codex CLI

The current Codex path uses the standalone CLI plus Codex hooks.

Run the Codex install flow shown above.
Start Codex normally.
In another terminal, verify the adapter:

lightmem2 codex status
lightmem2 codex doctor
lightmem2 codex mode normal
lightmem2 codex reduction status
lightmem2 codex stabilizer target user

For daemon-level inspection, you can also run:

tokenpilot-codex status
tokenpilot-codex start

Codex currently supports mode conservative and mode normal. mode aggressive is not available on the current Codex adapter.

For the full Codex adapter notes, see:

components/tokenpilot/adapters/codex/README.md

3. Run the Built-In Smoke Test

bash docs/scripts/smoke_isolated_gateway.sh

Before running it, set your upstream provider info:

export LIGHTMEM2_API_KEY="your_api_key"
export LIGHTMEM2_BASE_URL="https://your-openai-compatible-endpoint/v1"

If your machine does not need an upstream HTTP proxy, also clear:

export LIGHTMEM2_UPSTREAM_HTTP_PROXY=
export LIGHTMEM2_UPSTREAM_HTTPS_PROXY=

The smoke script will:

create a temporary OpenClaw runtime home
wire LightMem2 as a local proxy provider
start a local gateway
send a minimal "Reply with exactly: pong" request
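When the smoke test runs from automation, it helps to build the environment explicitly and fail early if a required variable is missing. The sketch below uses only the variable names documented above; smoke_env and missing_vars are hypothetical helpers, not part of the repository.

```python
REQUIRED = ("LIGHTMEM2_API_KEY", "LIGHTMEM2_BASE_URL")
PROXY_VARS = ("LIGHTMEM2_UPSTREAM_HTTP_PROXY", "LIGHTMEM2_UPSTREAM_HTTPS_PROXY")


def smoke_env(base_env, api_key, base_url, use_proxy=False):
    """Return a copy of base_env prepared for the smoke script."""
    env = dict(base_env)
    env["LIGHTMEM2_API_KEY"] = api_key
    env["LIGHTMEM2_BASE_URL"] = base_url
    if not use_proxy:
        # Same effect as `export VAR=`: the variable is set but empty.
        for name in PROXY_VARS:
            env[name] = ""
    return env


def missing_vars(env):
    """List required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


env = smoke_env({}, "your_api_key", "https://your-openai-compatible-endpoint/v1")
print(missing_vars(env))  # []
```

A caller could then pass the result to subprocess.run(["bash", "docs/scripts/smoke_isolated_gateway.sh"], env=env, check=True), merging in os.environ first so that PATH and similar variables are preserved.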

4. Go Deeper

Once the basic runtime path is working, use these component-level docs:

components/README.md for the framework-level component index
components/tokenpilot/README.md for TokenPilot commands, configuration, runtime state, and debugging
components/tokenpilot/adapters/codex/README.md for Codex-specific install, command scope, and proxy runtime notes
components/tokenpilot/products/cli/package.json for the standalone lightmem2 CLI package
experiments/README.md for top-level benchmark reproduction entrypoints
experiments/tokenpilot/README.md for the current TokenPilot benchmark hub

🧩 Components

LightMem2 is intended to host multiple long-running-agent components over time.

Component	Role	Main Docs	Experiments
`TokenPilot`	Runtime component for context stabilization, reduction, and lifecycle-aware eviction	components/tokenpilot/README.md	experiments/tokenpilot/README.md

🖼️ Visual Results

The screenshots below come from the built-in visual inspector opened with:

/lightmem2 visual

TokenPilot runtime effects

Stable-prefix view:

Reduction view:

Eviction view:

🏗️ Architecture

The current public repository is organized around the released component and its current production host adapter.

At a high level:

components/<name>/packages
- shared logic that should remain reusable across hosts
components/<name>/adapters
- host-specific integration code, install surfaces, runtime hooks, and command wiring

LightMem2/
├── components/
│   └── tokenpilot/
│       ├── adapters/
│       │   ├── openclaw/         # production host adapter for OpenClaw
│       │   └── codex/            # Codex CLI adapter with hooks + local proxy
│       └── packages/
│           ├── host-adapter/     # Shared host-adapter contracts and path-resolution interfaces
│           ├── runtime-core/     # Host-agnostic runtime engine and shared execution logic
│           ├── kernel/           # Shared types, interfaces, events, and runtime contracts
│           └── layers/           # Stateful and policy-oriented logic
│               ├── history/      # Canonical state, raw semantic turns, task registry
│               ├── decision/     # Policy analysis, reduction/eviction decisions, estimator
│               └── memory/       # Experimental memory layer; distillation and retrieval are still in progress
├── docs/                         # Public-facing notes and smoke helpers for the current runtime path
├── experiments/                  # Benchmark adapters and evaluation scripts for the current runtime path
└── README.md

🧪 Experiments

LightMem2 keeps benchmark adapters, task definitions, and runner scripts under:

experiments/

The root entry for experiment reproduction is:

experiments/README.md

The current component-level experiment hub is:

experiments/tokenpilot/README.md

The currently documented benchmarks are PinchBench and Claw-Eval; their subtrees are indexed in experiments/tokenpilot/README.md.

Recommended reproduction flow:

Finish the installation steps in this root README and verify the plugin in a real OpenClaw session.
Open experiments/README.md and choose the benchmark you want to reproduce.
Open experiments/tokenpilot/README.md for the current component-level benchmark index.
Download the required benchmark data bundle from the shared Google Drive described in experiments/README.md.
Follow the benchmark-specific README for local asset placement, environment setup, and official runner commands.
Run the benchmark from its scripts/run_baseline.sh or scripts/run_method.sh entrypoint.

💡 Examples

The first in-session commands to care about are:

/lightmem2 status
/lightmem2 report
/lightmem2 doctor
/lightmem2 visual
/lightmem2 mode normal
/lightmem2 help

Use them in that order:

/lightmem2 status confirms the component is active
/lightmem2 report shows savings after a few turns
/lightmem2 doctor checks the current OpenClaw adapter installation and config surface
/lightmem2 visual opens the local visualization page for runtime effects
/lightmem2 mode <conservative|normal|aggressive> switches preset runtime behavior
/lightmem2 help shows the full command surface

For full command details, runtime state, and debugging notes, see:

components/tokenpilot/README.md

📁 Experimental Results

The tables below summarize the current LightMem2 runtime path, implemented today through the TokenPilot component, on PinchBench and Claw-Eval.

Isolated mode evaluates each task in a fresh session, focusing on single-task behavior without cross-task history carryover. Continuous mode evaluates longer-running shared-session workflows, where context accumulation and cache reuse matter much more.

For exact reproduction commands, start from experiments/README.md.

PinchBench

Isolated Mode

Method	Overall	Prod	Res	Write	Code	Anal	CSV	Log	Meet	Mem	Skill	Integ	Cache Read (M)	Cache Miss (M)	Output (M)	Cost ($)
Vanilla	80.5	87.2	68.7	84.1	86.0	75.1	83.0	94.7	81.4	86.5	70.3	55.3	6.184	8.753	0.285	8.31
LLMLingua-2	76.9	89.3	64.0	82.1	86.9	80.8	79.6	84.4	66.3	85.0	79.6	72.1	14.241	3.975	0.384	5.78
SelectiveContext	76.5	88.5	64.5	73.0	83.7	82.6	81.1	92.8	63.3	86.9	82.8	77.2	11.273	4.642	0.324	5.79
LCM	77.8	90.1	64.9	79.6	85.4	81.3	81.0	87.1	67.5	85.0	81.7	80.6	16.018	3.064	0.356	5.10
Pichay	78.9	85.4	58.9	71.8	79.0	88.3	79.8	83.6	84.0	91.3	69.8	63.3	6.717	3.333	0.238	4.07
Summary	79.5	80.7	66.3	83.5	77.9	82.1	87.5	77.2	81.3	92.5	67.2	54.4	12.303	3.009	0.296	4.51
MemoBrain	78.1	86.8	62.1	88.9	85.7	82.6	88.3	85.4	63.6	92.5	76.1	69.7	10.200	2.107	0.233	3.36
AgentSwing	78.4	89.8	71.9	80.2	79.5	83.5	80.8	83.7	77.9	92.5	65.7	35.0	4.534	7.129	0.241	6.77
Keep-Last-N	80.4	86.0	70.0	82.4	80.1	77.6	78.3	91.5	84.3	92.5	70.1	87.8	12.813	2.657	0.291	4.26
MemOS	79.4	84.2	54.4	83.1	82.3	78.2	81.1	97.2	77.6	92.5	85.9	80.2	29.018	4.573	0.492	7.81
LightMem2	81.0	89.0	71.2	80.0	72.6	88.9	85.3	95.2	79.4	95.0	95.2	58.0	8.893	1.933	0.244	3.22

Continuous Mode

Method	Overall	Prod	Res	Write	Code	Anal	CSV	Log	Meet	Mem	Skill	Integ	Cache Read (M)	Cache Miss (M)	Output (M)	Cost ($)
Vanilla	79.2	83.5	58.4	86.8	80.0	78.5	87.8	94.6	77.6	95.0	55.8	83.6	25.015	5.943	0.202	7.24
LLMLingua-2	73.8	85.8	58.4	80.3	74.3	79.6	82.8	84.2	63.4	90.0	79.1	83.6	20.574	2.183	0.194	4.06
SelectiveContext	74.0	85.4	64.2	83.1	75.4	78.8	77.3	91.2	62.2	89.5	71.0	80.3	25.475	2.608	0.196	4.75
LCM	77.0	88.1	63.2	90.1	75.7	78.5	85.4	88.9	65.1	82.8	80.8	78.2	18.708	2.417	0.222	4.21
Pichay	76.5	88.0	66.7	76.2	81.0	77.6	83.5	84.2	67.6	100.0	63.8	75.3	11.698	6.874	0.260	7.20
Summary	78.4	89.1	64.4	73.8	82.9	69.6	81.6	93.6	80.3	95.0	61.7	75.3	20.687	6.249	0.196	7.12
MemoBrain	78.0	87.7	65.0	85.5	84.9	75.9	81.0	89.0	72.3	90.3	86.6	84.7	12.917	2.283	0.232	3.73
AgentSwing	78.5	86.3	67.3	89.0	79.1	82.4	87.4	68.1	72.4	93.8	61.7	83.8	12.680	5.476	0.314	6.47
Keep-Last-N	79.1	86.3	67.0	87.8	87.0	77.0	85.4	77.3	75.9	95.0	56.8	75.1	18.117	4.481	0.209	5.66
MemOS	80.9	87.5	59.0	85.4	87.1	82.0	81.0	95.0	78.1	92.5	87.4	84.1	30.859	8.939	0.308	10.41
LightMem2	81.3	76.7	76.9	90.6	84.1	86.0	85.6	89.1	73.6	95.0	77.2	80.1	8.551	1.549	0.219	2.79

PinchBench abbreviations: Prod=Productivity, Res=Research, Write=Writing, Code=Coding, Anal=Analysis, CSV=CSV Analysis, Log=Log Analysis, Meet=Meeting Analysis, Mem=Memory, Skill=Skills, Integ=Integrations.

Claw-Eval

Isolated Mode

Method	Overall	Wkfl	Ops	Fin	Off	Comm	Prod	Oprn	Safe	Term	MM	Oth	Cache Read (M)	Cache Miss (M)	Output (M)	Cost ($)
Vanilla	64.5	65.4	70.8	45.7	44.4	73.2	70.9	77.7	74.0	56.8	41.0	69.2	9.429	4.637	0.216	5.16
LLMLingua-2	61.9	58.7	67.5	57.6	43.3	62.9	70.1	62.4	61.0	49.6	44.0	75.2	8.169	4.043	0.182	4.44
SelectiveContext	60.7	59.1	68.2	46.3	36.9	61.5	75.5	59.2	67.2	53.1	44.0	74.7	8.271	3.862	0.181	4.31
LCM	61.2	59.0	67.3	51.1	47.7	65.9	76.6	58.4	58.6	51.4	41.5	72.2	9.776	3.543	0.172	4.17
Pichay	59.3	57.3	62.1	38.2	39.4	68.5	65.0	91.6	64.1	25.6	55.0	76.5	4.648	3.944	0.186	4.14
Summary	62.0	70.0	71.0	32.2	20.6	80.0	68.5	82.8	49.2	20.0	41.0	71.4	2.935	2.871	0.174	3.16
MemoBrain	58.0	64.5	60.5	26.1	37.6	56.1	59.9	71.0	63.4	20.0	41.0	75.3	18.182	5.118	0.332	6.69
AgentSwing	60.9	64.2	66.5	44.1	45.7	67.8	52.8	85.8	57.2	25.6	53.6	68.8	4.580	3.585	0.194	3.91
Keep-Last-N	61.8	67.1	73.8	44.7	21.6	54.5	63.6	86.2	38.4	39.4	55.0	69.1	4.229	1.845	0.186	2.54
MemOS	61.6	64.7	74.2	40.9	25.2	71.2	32.0	73.6	80.2	20.0	56.2	74.6	12.582	2.709	0.363	4.61
LightMem2	63.1	68.1	75.4	47.0	22.3	71.8	65.0	72.0	47.8	37.0	45.6	69.9	4.436	1.154	0.239	2.27

Continuous Mode

Method	Overall	Wkfl	Ops	Fin	Off	Comm	Prod	Oprn	Safe	Term	MM	Oth	Cache Read (M)	Cache Miss (M)	Output (M)	Cost ($)
Vanilla	63.4	70.8	80.3	26.7	27.8	62.2	73.4	78.4	63.6	20.0	41.0	69.6	709.845	21.981	2.622	81.52
LLMLingua-2	59.0	58.7	71.3	34.8	30.6	61.9	65.3	77.6	64.6	20.0	41.0	72.4	575.654	37.197	2.630	82.91
SelectiveContext	56.5	58.1	71.6	21.8	21.2	54.7	74.0	57.7	66.4	20.0	41.0	72.3	437.114	48.678	2.754	81.69
LCM	61.4	66.8	69.0	38.3	29.5	63.3	74.9	66.6	67.3	20.0	41.0	72.7	383.007	28.714	2.691	62.37
Pichay	61.0	69.5	63.8	40.3	24.0	63.1	67.0	94.1	52.5	21.6	41.0	71.0	97.431	63.510	1.046	59.65
Summary	61.6	63.6	74.5	35.3	20.6	55.5	70.1	87.1	66.1	69.0	42.6	66.9	59.772	10.143	1.001	16.59
MemoBrain	57.9	65.9	55.0	24.9	36.7	47.8	73.5	64.2	60.6	20.0	38.4	81.6	47.497	13.990	1.134	19.16
AgentSwing	62.2	67.6	66.5	48.6	36.8	70.0	63.8	90.7	31.7	22.4	41.0	72.8	53.776	10.027	0.907	15.63
Keep-Last-N	60.7	65.3	74.0	35.5	20.8	54.1	73.6	91.9	35.7	59.5	42.4	64.7	44.812	9.106	0.780	13.70
MemOS	57.7	55.9	65.0	56.3	22.2	44.8	64.6	68.8	89.0	20.0	39.6	71.5	49.742	25.432	0.293	24.12
LightMem2	60.8	58.8	61.8	52.5	32.1	64.2	57.3	89.2	65.8	76.8	45.2	70.9	21.430	9.928	0.338	10.58

Claw-Eval abbreviations: Wkfl=Workflow, Ops=Ops, Fin=Finance, Off=Office QA, Comm=Communication, Prod=Productivity, Oprn=Operations, Safe=Safety, Term=Terminal, MM=Multimodal, Oth=Others.
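The headline savings at the top of this README can be recomputed from the continuous-mode rows for Vanilla and LightMem2. The sketch below assumes that "input tokens" means Cache Read plus Cache Miss; that definition is an assumption of this example, although it does reproduce the published figures. The helper names are hypothetical.

```python
# (cache read M, cache miss M, cost $), copied from the continuous-mode tables.
ROWS = {
    "claw_eval": {
        "vanilla": (709.845, 21.981, 81.52),
        "lightmem2": (21.430, 9.928, 10.58),
    },
    "pinchbench": {
        "vanilla": (25.015, 5.943, 7.24),
        "lightmem2": (8.551, 1.549, 2.79),
    },
}


def reduction_pct(baseline, ours):
    """Relative reduction versus the baseline, in percent, to one decimal."""
    return round((1 - ours / baseline) * 100, 1)


def headline(bench):
    """Return (input-token reduction %, cost reduction %) for one benchmark."""
    v_read, v_miss, v_cost = ROWS[bench]["vanilla"]
    l_read, l_miss, l_cost = ROWS[bench]["lightmem2"]
    tokens = reduction_pct(v_read + v_miss, l_read + l_miss)
    cost = reduction_pct(v_cost, l_cost)
    return tokens, cost


print(headline("claw_eval"))   # (95.7, 87.0)
print(headline("pinchbench"))  # (67.4, 61.5)
```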

⚙️ Configuration

The exact config file and install surface depend on the host adapter:

OpenClaw: ~/.openclaw/openclaw.json
Codex CLI: ~/.codex/tokenpilot.json

Default Runtime Mode

The current install path applies normal mode by default.

conservative: stabilizer on, lighter reduction preset, eviction off
normal: stabilizer on, balanced reduction preset, eviction off
aggressive: stabilizer on, aggressive reduction preset, eviction on with task-state estimator on
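The three presets can be summarized as a small lookup table, together with the host restriction that Codex does not offer aggressive mode. This is a sketch of the documented behavior, not the actual TokenPilot config schema: the field names are hypothetical, and the estimator value for conservative and normal is a placeholder because this README only states it for aggressive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModePreset:
    stabilizer: bool
    reduction: str              # preset label as worded in this README
    eviction: bool
    task_state_estimator: bool  # only documented for aggressive mode


PRESETS = {
    "conservative": ModePreset(True, "lighter", False, False),
    "normal": ModePreset(True, "balanced", False, False),
    "aggressive": ModePreset(True, "aggressive", True, True),
}

# Modes each host adapter accepts, per this README.
HOST_MODES = {
    "openclaw": ("conservative", "normal", "aggressive"),
    "codex": ("conservative", "normal"),
}


def resolve_mode(host, mode):
    """Return the preset for a mode, rejecting modes the host does not offer."""
    if host not in HOST_MODES:
        raise ValueError(f"unknown host: {host}")
    if mode not in HOST_MODES[host]:
        raise ValueError(f"mode {mode!r} is not available on {host}")
    return PRESETS[mode]


print(resolve_mode("openclaw", "aggressive").eviction)  # True
```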

You can switch modes with the host command surface:

/lightmem2 mode conservative
/lightmem2 mode normal
/lightmem2 mode aggressive

For Codex, use:

lightmem2 codex mode conservative
lightmem2 codex mode normal

Recommended starting behavior:

keep stabilizer enabled
keep reduction enabled
leave eviction off until the basic runtime path is already working

Advanced estimator options, reduction-pass tuning, memory settings, runtime state layout, and debugging details are intentionally kept out of the root README.

For full host-specific configuration, see components/tokenpilot/README.md and, for Codex, components/tokenpilot/adapters/codex/README.md.

📄 Citation

Please cite our paper if you use LightMem2 in your work.

@article{xu2026tokenpilot,
  title={TokenPilot: Cache-Efficient Context Management for LLM Agents},
  author={Xu, Buqiang and Xue, Zirui and Chen, Dianmou and Fu, Chenyang and Wu, Chiyu and Huang, Caiying and Jiang, Chen and Fang, Jizhan and Deng, Xinle and Chen, Yijun and others},
  journal={arXiv preprint arXiv:2606.17016},
  year={2026}
}

@inproceedings{fang2025lightmem,
  title={LightMem: Lightweight and Efficient Memory-Augmented Generation},
  author={Jizhan Fang and Xinle Deng and Haoming Xu and Ziyan Jiang and Yuqi Tang and Ziwen Xu and Shumin Deng and Yunzhi Yao and Mengru Wang and Shuofei Qiao and Huajun Chen and Ningyu Zhang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=dyJ0GWpjJB}
}

🎉 Contributors

We thank all the contributors to this project; more contributors are welcome!

Other Related Projects

LLMLingua-2 — Token-level prompt compression
SelectiveContext — Self-information-based context reduction
Pichay — Demand paging for LLM context windows
MemoBrain — Executive memory for long-horizon reasoning agents
AgentSwing — Adaptive parallel context management routing for web agents
MemOS — Memory operating system for LLM agents
LightMem — Lightweight memory-augmented generation

🙌 We also thank the authors of the baseline methods evaluated in our experiments, including LLMLingua-2, SelectiveContext, Pichay, MemoBrain, AgentSwing, and MemOS, for making their work publicly available.
