OpenCodeRAG is a local-first RAG plugin for semantic code search. It converts your codebase into vector indices and retrieves relevant code chunks for natural-language queries. The primary aim is to save tokens by replacing full-file reads with targeted chunk retrieval, and to speed up tool calls in large codebases. It integrates with OpenCode and also works as a standalone MCP server or CLI tool.
You don't need a dedicated GPU to run embedding models: smaller models still perform well on modern CPUs.
⚠️ Note: Don't confuse this with the npm package opencode-rag (a discontinued project by a different author).

    # 1. Clone and install
    git clone https://github.com/your-org/OpenCodeRAG.git
    cd OpenCodeRAG
    npm install --legacy-peer-deps
    npm run build
    ./install.sh

    # 2. Initialize in your project
    cd /path/to/your/project
    opencode-rag init

    # 3. Index your workspace
    opencode-rag index

    # 4. Search
    opencode-rag query "authentication middleware"

Prerequisites: Node.js v22+, and Ollama (default) or another LLM host with an embedding model installed (e.g. embeddinggemma).

| Feature | Description |
|---|---|
| MCP server | opencode-rag mcp - stdio-based MCP server exposing search_semantic, get_file_skeleton, find_usages tools for any MCP-compatible client |
| AST chunking | 26 languages via tree-sitter (TS, JS, Python, Java, Go, Rust, C/C++, C#, Ruby, Kotlin, Swift, Bash, PHP, PowerShell, SQL, JSON, HTML, CSS, XML, YAML, TOML, INI, Dockerfile, Markdown, LaTeX, Razor) |
| Document support | Markdown, LaTeX, PDF, DOCX, DOC, Excel |
| Hybrid search | Vector similarity + TF×IDF keyword fusion |
| OpenCode plugin | Auto-inject context, read-tool override, TUI settings, Ctrl+Enter to add RAG context, MCP registration on init |
| Incremental indexing | File-hash manifest, background watcher, auto-rebuild on corruption |
| Privacy-first | All processing stays local with Ollama |
| CLI | index, query, status, list, show, dump, clear, init, ui, mcp |
| Programmatic API | TypeScript search(), indexWorkspace(), getContext(), validateConfig(), scanWorkspace(), createBackgroundIndexer(), getIndexStatusSummary() |
| Proxy-aware | Corporate proxy support with raw-socket localhost bypass |
| OpenAI / Cohere | Alternate embedding providers with API key auto-resolution |
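
The "Hybrid search" row above combines vector similarity with TF×IDF keyword scores. This README does not state the fusion formula, so the following is only a minimal sketch of one common approach: a weighted sum of cosine similarity and a max-normalized TF×IDF score. The `Chunk` shape, the `hybridSearch` name, and the `alpha = 0.7` weight are all assumptions made for illustration, not OpenCodeRAG's actual implementation.

```typescript
// Hypothetical shapes: OpenCodeRAG's real types and fusion formula may differ.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

// TF×IDF score of each chunk for the query terms, using a smoothed IDF.
function tfidfScores(queryTerms: string[], chunks: Chunk[]): number[] {
  const docs = chunks.map((c) => tokenize(c.text));
  return docs.map((tokens) => {
    let score = 0;
    for (const term of new Set(queryTerms)) {
      const tf = tokens.filter((t) => t === term).length / Math.max(tokens.length, 1);
      const df = docs.filter((d) => d.includes(term)).length;
      const idf = Math.log((1 + docs.length) / (1 + df)) + 1;
      score += tf * idf;
    }
    return score;
  });
}

// Weighted fusion: alpha weights the vector score, (1 - alpha) the keyword score.
function hybridSearch(
  query: string,
  queryEmbedding: number[],
  chunks: Chunk[],
  alpha = 0.7, // assumed weight, chosen only for this example
): { id: string; score: number }[] {
  const kw = tfidfScores(tokenize(query), chunks);
  const maxKw = Math.max(...kw, 0);
  return chunks
    .map((c, i) => ({
      id: c.id,
      score:
        alpha * cosine(queryEmbedding, c.embedding) +
        (1 - alpha) * (maxKw > 0 ? kw[i] / maxKw : 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```

In this sketch the query embedding is passed in by the caller, since producing it depends on the configured embedding provider. Normalizing the keyword score by its maximum keeps both terms roughly on a 0 to 1 scale before they are mixed.
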
A browser-based dashboard for exploring the indexed vector database: browse and inspect chunks, and evaluate OpenCode sessions by the chunks they retrieved and their relevance scores.
Launch with opencode-rag ui. See Web UI documentation for details.
| Document | Contents |
|---|---|
| Architecture | Module design, data flow, tech stack |
| Installation | Full install guide, global setup, uninstall |
| Configuration | All options: embedding, indexing, retrieval, description, plugin |
| Chunking | Language matrix, adding new chunkers, custom chunkers |
| Embedding | Providers, model recommendations, proxy, dimension probing |
| Retrieval | Pipeline, hybrid search, score fusion, caching |
| Plugin | OpenCode integration, tools, hooks, TUI, troubleshooting |
| CLI Reference | All commands, options, examples |
| Web UI | Dashboard, chunk browser, file explorer, compare view |
| Development | Setup, testing, conventions, adding providers |
| Troubleshooting | Common issues, logging, debugging |
| Roadmap | Completed items, short/mid/long-term plans |
OpenCodeRAG ships a stdio-based MCP (Model Context Protocol) server that exposes semantic code tools to any MCP-compatible client (Claude Desktop, OpenCode, Cursor, etc.).

    opencode-rag mcp

| Tool | Description |
|---|---|
| search_semantic | Vector + keyword hybrid search across the indexed codebase |
| get_file_skeleton | AST-based file outline (functions, classes, methods) |
| find_usages | Find all references to a symbol by name |

Clients can configure the MCP server manually, or opencode-rag init auto-registers it.
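
For manual configuration, several MCP clients (Claude Desktop among them) read a JSON file containing an `mcpServers` map in which each entry names a command and its arguments. The sketch below builds such an entry for `opencode-rag mcp`. The entry key `"opencode-rag"` and the assumption that the binary is on your `PATH` are illustrative; consult your client's documentation for the exact file location and schema.

```typescript
// Assumed shape: the `mcpServers` map used by Claude Desktop-style client configs.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Adds (or overwrites) an entry that launches `opencode-rag mcp` over stdio.
// The default key "opencode-rag" is an illustrative name, not one mandated by the tool.
function withOpenCodeRag(
  config: McpClientConfig,
  serverName = "opencode-rag",
): McpClientConfig {
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers,
      [serverName]: { command: "opencode-rag", args: ["mcp"] },
    },
  };
}

// Print the merged config so it can be pasted into the client's config file.
const merged = withOpenCodeRag({ mcpServers: {} });
console.log(JSON.stringify(merged, null, 2));
```

Because the function returns a new object rather than mutating its input, any servers already present in the client's config are preserved.
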
OpenCodeRAG registers tools that agents can invoke directly. Agents discover these tools via the OpenCode skill system - when opencode-rag init runs, it creates .opencode/skills/opencode-rag/SKILL.md which teaches agents the recommended workflow:
- Skeleton first - get_file_skeleton(filePath) to orient in a file
- Find usages - find_usages(symbolName) before editing any symbol
- Search - search_semantic(query) to find relevant code
- Read - use read on specific line ranges
- Edit - make changes with full context
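
The workflow above can be read as an ordered plan of tool calls that precedes an edit. The sketch below only models that ordering. The tool names and the `filePath`, `symbolName`, and `query` arguments come from the list above, while the `ToolStep` type and `planEdit` function are hypothetical and not part of OpenCodeRAG.

```typescript
// Tool and argument names come from the workflow above; the plan structure is hypothetical.
interface ToolStep {
  tool: "get_file_skeleton" | "find_usages" | "search_semantic" | "read";
  args: Record<string, string>;
}

// Returns the recommended call order for preparing an edit to `symbolName` in `filePath`.
function planEdit(filePath: string, symbolName: string, query: string): ToolStep[] {
  return [
    { tool: "get_file_skeleton", args: { filePath } }, // orient in the file
    { tool: "find_usages", args: { symbolName } }, // see what depends on the symbol
    { tool: "search_semantic", args: { query } }, // find related code elsewhere
    // The line-range parameters of `read` are not documented here, so only the path is passed.
    { tool: "read", args: { filePath } },
  ];
}
```

The edit itself comes only after these four steps, once the agent has the full context.
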
| Tool | Purpose | When to Use |
|---|---|---|
| search_semantic | General-purpose code retrieval | Before any code task when you haven't read the relevant code |
| get_file_skeleton | Quick file overview via AST | Before reading a large file to decide which sections matter |
| find_usages | Symbol reference search | Before editing any function, variable, or class |
| read (optional) | RAG-enhanced file read | Full file contents with supplementary context chunks |
When using OpenCode, the plugin enhances your agent with three discovery mechanisms:
opencode-rag init creates .opencode/skills/opencode-rag/SKILL.md - an OpenCode skill that teaches agents the tool workflow. Agents load it on demand via the skill tool, keeping token overhead minimal.
After every message you send, the plugin searches your vector-indexed codebase:
- contentType: "file_paths" (default): A lightweight list of relevant files is appended (e.g., src/plugin.ts (typescript, lines 10-42, relevance 0.92)). Agents must call search_semantic or find_usages to retrieve actual code, which nudges them toward proactive tool usage.
- contentType: "chunks": High-confidence code chunks (score ≥ 0.85) are injected directly into your prompt, giving the agent instant context without a tool-call round-trip.
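
To make the difference between the two `contentType` modes concrete, here is a minimal sketch of how search hits might be turned into injected text. The `SearchHit` fields and the `buildInjection` function are hypothetical names chosen for this example. Only the entry format and the 0.85 threshold are taken from the description above.

```typescript
// Hypothetical result shape; field names are illustrative, not the plugin's actual types.
interface SearchHit {
  filePath: string;
  language: string;
  startLine: number;
  endLine: number;
  score: number;
  text: string;
}

type ContentType = "file_paths" | "chunks";

// From the description above: only chunks scoring >= 0.85 are injected in "chunks" mode.
const CHUNK_THRESHOLD = 0.85;

// Formats one hit like: src/plugin.ts (typescript, lines 10-42, relevance 0.92)
function describeHit(hit: SearchHit): string {
  return `${hit.filePath} (${hit.language}, lines ${hit.startLine}-${hit.endLine}, relevance ${hit.score.toFixed(2)})`;
}

function buildInjection(hits: SearchHit[], contentType: ContentType = "file_paths"): string {
  if (contentType === "file_paths") {
    // Lightweight mode: paths only, so the agent must call a tool to get the code.
    return hits.map(describeHit).join("\n");
  }
  // Chunk mode: include the code itself, but only for high-confidence hits.
  return hits
    .filter((h) => h.score >= CHUNK_THRESHOLD)
    .map((h) => `${describeHit(h)}\n${h.text}`)
    .join("\n\n");
}
```

The trade-off is token cost against round-trips: "file_paths" keeps each message small, while "chunks" spends more tokens up front to save a tool call.
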
When chunks are indexed, a brief tool list is prepended to the system prompt so agents know the tools exist. This is skipped when no chunks are indexed to save tokens.
Press Ctrl+Enter in the terminal prompt to retrieve and append a relevant file list to your current prompt. Press Ctrl+Alt+Enter to append full code chunks instead. The query is taken from your typed text - if the prompt is empty, a toast reminds you to type first. Results are appended directly to the prompt as formatted code blocks with file paths, line ranges, and relevance scores. No dialogs are opened. Keybindings are configurable in the settings menu (Ctrl+Shift+R).
100% local by default. Embeddings are generated locally via Ollama. The vector database stays in your project directory. No source code or embeddings leave your machine unless you explicitly configure a third-party API.
MIT
