Knowledge Graph

AI che Ragiona

From developer to Solution Architect: Knowledge Graph, Graph-RAG and Multi-Model Agents in Production with Python, Neo4j, MCP and Microsoft Agent Framework

Giuseppe Zileni, 2026 Edition (500+ full-colour pages · Paperback · Hardcover · Kindle)

Knowledge Graph · Graph-RAG · MCP · AI Agents

⏳ The book is under editorial review. Purchase links will be activated right after publication.
Amazon's trademark is used under license from Amazon.com, Inc. or its affiliates.

Companion repository for the book "AI che Ragiona": a production-ready implementation of a complete Knowledge Graph system with a document ingestion pipeline, vector store, graph database, hybrid RAG (vector + graph), web interface, MCP server and multi-agent orchestration layer.

Modules
Architecture
Tech stack
System requirements
Quick start
Local development (without Docker)
Running with Docker
Running with pre-built images (GHCR)
Repository structure
API Reference
Agent API Reference
Multi-Agent system
UI (Frontend)
Environment variables
Debugging with VS Code
Testing and linting
Ingestion pipeline
RAG pipeline (query)
Data models
Scientific references
Troubleshooting

Modules

Module	Description	README
knowledge-graph-api	FastAPI backend — ingestion, RAG, graph	README
knowledge-graph-ui	Next.js 15 web frontend	README
knowledge-graph-mcp	MCP server — exposes API as LLM tools	README
knowledge-graph-agents	Multi-agent orchestration (LangGraph)	README

Architecture

  Client / LLM Host (Claude Desktop, VS Code, custom app)
         |                            |
         | MCP Protocol               | HTTP REST
         v                            v
  +------------------+      +--------------------+
  | knowledge-graph  |      |  knowledge-graph   |
  |      -mcp        |      |      -agents       |
  | (MCP tool layer) |      | (Multi-Agent API)  |
  | localhost:8080   |      | localhost:8002     |
  +--------+---------+      +--------+-----------+
           |                         |
           +----------+--------------+
                      | HTTP REST
             +--------v---------+
             |   FastAPI API    |
             | knowledge-graph  |
             |      -api        |
             |  localhost:8000  |
             +--+-----+------+--+
                |     |      |
   +------------+  +--+--+  ++-----------+
   |               |     |               |
+--v------------+ +v----v--+ +----------v-----+
| Neo4j 5.18   | | Redis   | | Ollama         |
| Graph DB     | | Stack   | | llama3 +       |
| :7474 / :7687| | :6379   | | nomic-embed    |
+--------------+ | :8001   | | :11434         |
                 +---------+ +----------------+
                      ^
             +--------+---------+
             |   Next.js UI     |
             | knowledge-graph  |
             |      -ui         |
             |  localhost:3000  |
             +------------------+

Data flows through three main paths:

Ingestion: document → chunking → embedding (Ollama) → dedup (SHA-256) → entity/relation extraction (LLM) → storage in Redis (vectors) + Neo4j (graph)
RAG Query: question → intent classification → vector search (Redis) → graph traversal (Neo4j) → context assembly → LLM generation (Ollama) → answer
Multi-Agent: request → Orchestrator (LangGraph) → specialised agent → HTTP tools to API → structured response

Tech stack

Component	Technology	Version
Graph Database	Neo4j (Cypher + APOC)	5.18
Vector Store	Redis for AI (RedisSearch + RedisJSON)	latest
LLM Inference	Ollama (local, no API key)	latest
LLM Model	Llama 3	latest
Embedding Model	nomic-embed-text (768 dim)	latest
REST API	FastAPI + uvicorn	0.115+
Data Models	Pydantic v2 + pydantic-settings	2.7+
Multi-Agent	Microsoft Agent Framework (MAF)	latest
Frontend	Next.js + React + Tailwind CSS	15 / 19 / 4
Graph Visualisation	react-force-graph-2d	1.26+
Logging	structlog (JSON in prod, console in dev)	24.1+
Testing	pytest + pytest-asyncio + pytest-mock	8.2+
Linting	ruff (API/Agents), ESLint (UI)	0.4+
Containerisation	Docker + Docker Compose	24+ / v2

System requirements

Docker 24+ and Docker Compose v2
8 GB RAM recommended (Ollama + Neo4j + Redis)
Python 3.11+ (only for local API development without Docker)
Node.js 22+ (only for local UI development without Docker)
NVIDIA GPU optional (to accelerate Ollama)

Quick start

The fastest way to get everything running:

# 1. Clone and move into the root
git clone <repo-url>
cd knowledge-graph

# 2. Configure environment variables
cp .env.example .env
# Edit NEO4J_PASSWORD and other values in .env

# 3. Start all services (production stack)
make up-prod

# 4. Download Ollama models (first time only)
make pull-models

# 5. (Optional) Seed with sample data
cd knowledge-graph-api && make seed

# 6. Open in the browser
#    UI:            http://localhost:3000
#    API Swagger:   http://localhost:8000/docs
#    Agent API:     http://localhost:8002/docs
#    Neo4j:         http://localhost:7474
#    RedisInsight:  http://localhost:8001

Local development (without Docker)

Run only the infrastructure in Docker and the application servers natively for a hot-reload development experience.

1. Infrastructure (Neo4j + Redis + Ollama + RedisInsight)

cd knowledge-graph
make up-dev
# or: docker compose --profile dev up -d

Wait for services to become healthy:

docker compose ps

2. Ollama models (first time only)

make pull-models

3. API (FastAPI)

cd knowledge-graph-api

python -m venv venv
source venv/bin/activate        # Linux / macOS
# venv\Scripts\activate         # Windows

pip install -r requirements.txt
uvicorn api.main:app --reload --port 8000

API available at http://localhost:8000. Interactive Swagger docs at http://localhost:8000/docs.

4. Agents (Multi-Agent API)

cd knowledge-graph-agents

python -m venv venv
source venv/bin/activate

pip install -r requirements.txt
uvicorn api.agent_api:app --reload --port 8001

Agent API available at http://localhost:8001. Swagger at http://localhost:8001/docs.

# Example call
curl -X POST http://localhost:8001/agents/run \
  -H "Content-Type: application/json" \
  -d '{"request": "What do you know about Neo4j?", "thread_id": "default"}'
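The same call can be made from Python. This is a minimal sketch using only the standard library; `build_run_request` and `run_agent` are hypothetical helper names, and the base URL assumes the local dev port used in this step.

```python
import json
import urllib.request

AGENT_API_URL = "http://localhost:8001"  # assumption: local dev port from this step


def build_run_request(request_text: str, thread_id: str = "default") -> urllib.request.Request:
    """Build the POST /agents/run request without sending it."""
    body = json.dumps({"request": request_text, "thread_id": thread_id}).encode("utf-8")
    return urllib.request.Request(
        f"{AGENT_API_URL}/agents/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_agent(request_text: str, thread_id: str = "default", timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response (needs a running Agent API)."""
    request = build_run_request(request_text, thread_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Agent API running, `run_agent("What do you know about Neo4j?")` should return the decoded JSON body. The 120-second timeout is an arbitrary choice, picked because local LLM generation can be slow.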

5. UI (Next.js)

cd knowledge-graph-ui

cp .env.local.example .env.local
# Verify NEXT_PUBLIC_API_URL=http://localhost:8000

npm install
npm run dev

UI available at http://localhost:3000.

Running with Docker

Docker Compose profiles

The stack uses Docker Compose profiles to separate environments:

Profile	Services started	Use case
`dev`	neo4j, redis, ollama, redisinsight	Local development — run apps outside Docker
`prod`	neo4j, redis, ollama, api, ui, mcp, agents	Full production stack

Makefile targets

make up-dev     # infrastructure + RedisInsight (profile dev)
make up-prod    # full production stack (profile prod)
make down       # stop all services
make pull-models  # download llama3 + nomic-embed-text

Production

cp .env.example .env
# Configure .env with real passwords

make up-prod

Services exposed:

Service	URL	Description
UI	http://localhost:3000	Next.js frontend
API	http://localhost:8000	FastAPI REST API
API Docs	http://localhost:8000/docs	Swagger UI (API)
Agent API	http://localhost:8002	Multi-Agent Orchestration API
Agent Docs	http://localhost:8002/docs	Swagger UI (Agent API)
MCP Server	http://localhost:8080	MCP tool layer (SSE transport)
Neo4j Browser	http://localhost:7474	Neo4j web interface
RedisInsight	http://localhost:8001	Redis web interface (built-in)
Ollama	http://localhost:11434	Ollama API

Development (infra only)

make up-dev

This starts only Neo4j, Redis, Ollama and a standalone RedisInsight on port 5540. Run API, UI, Agents and MCP locally with hot-reload (see Local development).

Additional services when using --profile dev:

Service	URL	Description
RedisInsight	http://localhost:5540	Standalone Redis UI (profile dev)
Neo4j Browser	http://localhost:7474	Built into neo4j container

Useful Docker commands

# Service status
docker compose ps

# Follow logs of a specific service
docker compose logs -f api
docker compose logs -f ui

# Stop all services
make down

# Stop and remove volumes (WARNING: deletes all Neo4j/Redis data)
docker compose down -v

# Rebuild a single service
docker compose up --build api -d

NVIDIA GPU (optional)

GPU acceleration for Ollama is configured in docker-compose.yml under the deploy section of the ollama service — it is enabled by default and requires the NVIDIA Container Toolkit.

Running with pre-built images (GHCR)

This is the easiest way to run the full stack locally, with no need to clone the source code or build any image. Every CI-green merge to main publishes four images to the GitHub Container Registry:

Image	Description
`ghcr.io/agent-engineering-studio/kg-api:latest`	FastAPI backend
`ghcr.io/agent-engineering-studio/kg-ui:latest`	Next.js frontend
`ghcr.io/agent-engineering-studio/kg-mcp:latest`	MCP server
`ghcr.io/agent-engineering-studio/kg-agents:latest`	Multi-agent API

1. Download the compose file

curl -O https://raw.githubusercontent.com/agent-engineering-studio/knowledge-graph/main/docker-compose.ghcr.yml
curl -O https://raw.githubusercontent.com/agent-engineering-studio/knowledge-graph/main/.env.example
cp .env.example .env
# Edit .env: set NEO4J_PASSWORD and, if needed, OLLAMA_BASE_URL

2. Choose your Ollama setup

Option A — Ollama already running on the host (recommended if you already have models):

# Set in .env:
# OLLAMA_BASE_URL=http://host.docker.internal:11434
docker compose -f docker-compose.ghcr.yml up -d

Option B — Ollama CPU container (no GPU):

docker compose -f docker-compose.ghcr.yml --profile cpu up -d

# Pull models (first time only)
docker compose -f docker-compose.ghcr.yml exec ollama-cpu \
  sh -c "ollama pull llama3 && ollama pull nomic-embed-text"

Option C — Ollama GPU container (NVIDIA, requires NVIDIA Container Toolkit):

docker compose -f docker-compose.ghcr.yml --profile gpu up -d

docker compose -f docker-compose.ghcr.yml exec ollama-gpu \
  sh -c "ollama pull llama3 && ollama pull nomic-embed-text"

3. Access the services

Service	URL
UI	http://localhost:3000
API + Swagger	http://localhost:8000/docs
Agent API + Swagger	http://localhost:8002/docs
MCP Server (SSE)	http://localhost:8080
Neo4j Browser	http://localhost:7474
RedisInsight	http://localhost:5540

4. Useful commands

# Check all containers are up
docker compose -f docker-compose.ghcr.yml ps

# Follow logs of a specific service
docker compose -f docker-compose.ghcr.yml logs -f api

# Pull latest images and restart
docker compose -f docker-compose.ghcr.yml pull
docker compose -f docker-compose.ghcr.yml up -d

# Stop everything
docker compose -f docker-compose.ghcr.yml down

# Stop and remove all data (WARNING: deletes Neo4j + Redis volumes)
docker compose -f docker-compose.ghcr.yml down -v

Note: the GHCR packages for this repository are public. No docker login is required to pull them.

Repository structure

knowledge-graph/
├── .vscode/                        # VS Code configuration (debug, tasks, settings)
│   ├── launch.json                 # Debug configurations
│   ├── tasks.json                  # Build/run tasks
│   └── settings.json               # Editor settings
├── docker-compose.yml              # Full stack (profiles: dev, prod)
├── Makefile                        # Shorthand commands
├── .env.example                    # Environment variable template
│
├── knowledge-graph-api/            # Backend API (Python / FastAPI)
│   ├── api/                        # FastAPI app, routes, schemas
│   │   ├── main.py                 # Application entry point
│   │   ├── schemas.py              # Pydantic request/response models
│   │   └── routes/
│   │       ├── ingest.py           # POST /ingest
│   │       └── query.py            # POST /query, POST /query/stream
│   ├── config/
│   │   └── settings.py             # Centralised configuration (pydantic-settings)
│   ├── models/                     # Domain models
│   │   ├── base.py                 # VectorDocument
│   │   ├── graph_node.py           # GraphNode (KGNode)
│   │   └── relation.py             # Relation
│   ├── pipeline/                   # Ingestion pipeline
│   ├── query/                      # Query pipeline
│   ├── storage/                    # Persistence backends
│   ├── infra/docker/Dockerfile     # API Dockerfile
│   └── requirements.txt
│
├── knowledge-graph-agents/         # Multi-Agent Orchestration (Python / LangGraph)
│   ├── agents/                     # Specialised agents
│   ├── orchestration/              # LangGraph workflow
│   ├── tools/kg_tools.py           # Async HTTP wrappers for the API
│   ├── memory/kg_memory.py         # AgentRunRecord + in-process store
│   ├── api/agent_api.py            # FastAPI app port 8001 (host 8002 in Docker)
│   ├── Dockerfile
│   └── requirements.txt
│
├── knowledge-graph-mcp/            # MCP Server (Python / FastMCP)
│   ├── src/kg_mcp/
│   │   ├── server.py               # MCP server + tool definitions
│   │   ├── api_client.py           # HTTP client to the API
│   │   └── tools.py                # 8 tool implementations
│   ├── Dockerfile
│   └── pyproject.toml
│
└── knowledge-graph-ui/             # Frontend (Next.js / React)
    ├── src/
    │   ├── app/                    # Next.js App Router pages
    │   ├── components/             # Reusable React components
    │   └── lib/api-client.ts       # Typed fetch wrapper
    ├── Dockerfile
    └── package.json

API Reference

Interactive OpenAPI documentation generated automatically by FastAPI:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc
OpenAPI JSON: http://localhost:8000/openapi.json

Endpoints

`GET /health` — Health check

Checks connectivity with Neo4j, Redis and Ollama.

Response (HealthResponse):

{
  "status": "healthy",
  "neo4j": true,
  "redis": true,
  "ollama": true
}

status is "healthy" when all services are reachable, "degraded" otherwise.
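A client can use the per-service flags to report which backend is down. The sketch below works on an already decoded HealthResponse; the helper names are hypothetical and only the three fields shown above are assumed.

```python
BACKEND_SERVICES = ("neo4j", "redis", "ollama")


def unreachable_services(health: dict) -> list[str]:
    """Return the names of backend services reported as unreachable."""
    return [name for name in BACKEND_SERVICES if not health.get(name, False)]


def expected_status(health: dict) -> str:
    """Recompute the status string from the per-service flags."""
    return "healthy" if not unreachable_services(health) else "degraded"
```

A missing flag is treated as unreachable, which is a conservative assumption rather than documented API behaviour.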

`POST /ingest` — Document ingestion

Uploads a document, processes it through the full pipeline (chunking, embedding, entity extraction) and persists it in Redis and Neo4j.

Request body (IngestRequest):

{
  "file_path": "/path/to/document.pdf",
  "thread_id": "my-project",
  "skip_existing": true
}

Field	Type	Default	Description
`file_path`	string	required	Path to the file to process (PDF, DOCX, TXT)
`thread_id`	string	required	Namespace for multi-tenant isolation
`skip_existing`	boolean	`true`	Skip already-indexed chunks (dedup via SHA-256)

Response (IngestResult):

{
  "document_id": "a1b2c3d4-...",
  "chunks_processed": 15,
  "chunks_skipped": 0,
  "entities_extracted": 23,
  "relations_extracted": 18,
  "nodes_created": 23,
  "edges_created": 18,
  "processing_time_ms": 12450.5,
  "errors": []
}

Supported formats: .pdf (pypdf), .docx (python-docx), .txt (plain text).
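Since only three formats are supported, a caller may want to reject other files before sending the request. The following is a client-side sketch; `build_ingest_payload` is a hypothetical helper, and the extension check mirrors the list above rather than the server's actual MIME detection.

```python
from pathlib import PurePosixPath

SUPPORTED_SUFFIXES = {".pdf", ".docx", ".txt"}


def build_ingest_payload(file_path: str, thread_id: str, skip_existing: bool = True) -> dict:
    """Build an IngestRequest body, rejecting unsupported file extensions early."""
    suffix = PurePosixPath(file_path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported format {suffix!r}; expected one of {sorted(SUPPORTED_SUFFIXES)}")
    return {"file_path": file_path, "thread_id": thread_id, "skip_existing": skip_existing}
```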

`POST /query` — RAG query (synchronous)

Executes a hybrid RAG query: vector search + graph traversal + LLM generation.

Request body (QueryRequest):

{
  "query": "Which technologies are connected to Neo4j?",
  "thread_id": "my-project",
  "top_k": 10,
  "max_hops": 2
}

Field	Type	Default	Description
`query`	string	required	Natural-language question
`thread_id`	string	required	Namespace to query
`top_k`	integer	`10`	Number of vector search results
`max_hops`	integer	`2`	Maximum graph traversal depth

Response (RAGResponse):

{
  "answer": "Neo4j is connected to...",
  "sources": [
    { "doc_id": "chunk-uuid", "text_preview": "First 200 chars...", "score": 0.876 }
  ],
  "nodes_used": ["node-id-1"],
  "edges_used": ["NodeA --USES--> NodeB"],
  "query_intent": "entity_query",
  "processing_time_ms": 3200.0
}

query_intent can be: document_query, entity_query, relation_query, general.
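When displaying a RAGResponse, it can help to show only the strongest sources. This is a small sketch over the decoded JSON; `top_sources` and the `min_score` threshold of 0.5 are illustrative assumptions, not part of the API.

```python
def top_sources(response: dict, min_score: float = 0.5) -> list[tuple[str, float]]:
    """Return (doc_id, score) pairs at or above min_score, best first."""
    kept = [s for s in response.get("sources", []) if s["score"] >= min_score]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [(s["doc_id"], s["score"]) for s in kept]
```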

`POST /query/stream` — RAG query (SSE streaming)

Same request as /query, but the response is a stream of Server-Sent Events. Each LLM token is sent as an event:

data: Neo4j
data:  is
data:  connected
data:  to...
data: [DONE]

On error: data: [ERROR] message.
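The event format above can be consumed with a small line parser. This sketch assumes the lines have already been read from the HTTP response; `iter_tokens` and `StreamError` are hypothetical names. It drops the single space that follows `data:`, so tokens keep their own leading whitespace.

```python
from typing import Iterable, Iterator


class StreamError(RuntimeError):
    """Raised when the stream reports an [ERROR] event."""


def iter_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield LLM tokens from SSE lines until [DONE] is seen."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):]
        if payload.startswith(" "):
            payload = payload[1:]  # drop the one separator space, keep the token's own
        if payload == "[DONE]":
            return
        if payload.startswith("[ERROR]"):
            raise StreamError(payload[len("[ERROR]"):].strip())
        yield payload
```

Joining the yielded tokens with `"".join(...)` reconstructs the answer text.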

`DELETE /documents/{document_id}` — Delete document

Removes a document and all its chunks from Redis.

Response:

{ "deleted": "a1b2c3d4-..." }

Agent API Reference

The Agent API exposes the multi-agent system at http://localhost:8002 (internal port 8001).

Swagger UI: http://localhost:8002/docs

Agent Endpoints

`POST /agents/run` — Execute multi-agent workflow

Receives a natural-language request, classifies the intent, executes the agent plan and returns structured output.

Request body:

{
  "request": "What do you know about Neo4j?",
  "thread_id": "default",
  "context": {}
}

Field	Type	Default	Description
`request`	string	required	Natural-language request
`thread_id`	string	`default`	KG namespace to operate on
`context`	dict	`{}`	Extra params (e.g. `file_path`, `topic`)

Response (AgentRunResponse):

{
  "run_id": "uuid",
  "intent": "query",
  "output": "Neo4j is a graph database...",
  "plan": [{"agent": "analyst", "action": "hybrid_search", "status": "done"}],
  "quality": {"overall_health": 0.85, "total_nodes": 120},
  "duration_ms": 2340,
  "error": null
}

Automatically classified intents:

Keywords in request	Intent	Delegated agent
ingest, upload, load	`ingest`	Ingestion Agent
what do you know, describe, tell me	`query`	Analyst Agent
analyse, count, statistics	`analyze`	Analyst Agent
report, generate, summarise	`synthesize`	Synthesis Agent
validate quality, check	`validate`	Validator Agent
missing relations, gap	`kgc`	KGC Agent
health, status, monitor	`monitor`	Monitor Agent
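The keyword table can be read as a simple substring matcher. The sketch below is one possible reading, not the repository's classifier: first-match-wins ordering, case-insensitive substring matching and the `query` fallback are all assumptions.

```python
# Order matters: the first intent with a matching keyword wins (assumption).
INTENT_KEYWORDS: list[tuple[str, tuple[str, ...]]] = [
    ("ingest", ("ingest", "upload", "load")),
    ("query", ("what do you know", "describe", "tell me")),
    ("analyze", ("analyse", "count", "statistics")),
    ("synthesize", ("report", "generate", "summarise")),
    ("validate", ("validate quality", "check")),
    ("kgc", ("missing relations", "gap")),
    ("monitor", ("health", "status", "monitor")),
]


def classify_intent(request: str, default: str = "query") -> str:
    """Return the first intent whose keyword appears in the request."""
    text = request.lower()
    for intent, keywords in INTENT_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return intent
    return default
```

Substring matching is deliberately naive. A request such as "download the report" would match `load` and be routed to ingestion, which is the kind of ambiguity an LLM-based router avoids.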

`GET /agents/run/{run_id}` — Retrieve a run

Returns the persisted record of a previous execution.

`GET /agents/runs` — List recent runs

Returns the last N runs (default 20), ordered by date descending.

`GET /agents/health` — Agent API health check

{ "status": "ok", "kg_api": true, "kg_api_url": "http://localhost:8000" }

Multi-Agent system

The knowledge-graph-agents/ module implements the Supervisor + Specialists pattern.

Internal architecture

                    ┌──────────────────┐
                    │   ORCHESTRATOR   │
                    │  (Router+Planner)│
                    └────────┬─────────┘
                             │  delegates by intent (LangGraph)
          ┌──────────────────┼──────────────────┐
          │                  │                  │
   ┌──────▼──────┐   ┌───────▼──────┐   ┌──────▼──────┐
   │  INGESTION  │   │   ANALYST    │   │  SYNTHESIS  │
   │   AGENT     │   │    AGENT     │   │    AGENT    │
   └──────┬──────┘   └───────┬──────┘   └──────┬──────┘
          │                  │                  │
   ┌──────▼──────┐   ┌───────▼──────┐   ┌──────▼──────┐
   │  VALIDATOR  │   │     KGC      │   │   MONITOR   │
   │   AGENT     │   │    AGENT     │   │    AGENT    │
   └─────────────┘   └──────────────┘   └─────────────┘
                             │ HTTP REST
                    ┌────────▼─────────┐
                    │ knowledge-graph  │
                    │      -api        │
                    │  Neo4j + Redis   │
                    └──────────────────┘

Agents

Agent	Responsibility
Orchestrator	Classifies intent, builds plan — never executes tools directly
Ingestion	Health check, dedup check, `kg_ingest`, report
Analyst	Vector search / graph traversal / hybrid (3 strategies)
Validator	4 Cypher queries, `KGQualityReport` with `overall_health`
KGC	Transitive closure + similarity, finds missing relations
Synthesis	RAG context + Ollama, Markdown report (optional auto-ingest)
Monitor	Health check + quick quality check, alert summary

Agent memory

Each execution is recorded as an AgentRunRecord (Pydantic) in the in-process store and, best-effort, as an AgentRun node in Neo4j via POST /graph/cypher/write.

MATCH (r:AgentRun {run_id: $run_id})
RETURN r.agent_name, r.intent, r.status, r.duration_ms

Agent tests

cd knowledge-graph-agents
pytest tests/ -v

Tests use httpx mocks — no live services needed.

UI (Frontend)

The UI is a Next.js 15 (App Router) SPA with three main pages.

Dashboard (`/`)

Real-time service health indicators (Neo4j, Redis, Ollama)
Quick links to functional pages
Direct access to Swagger, Neo4j Browser, RedisInsight

Search / Query (`/query`)

Search form with configurable parameters (thread_id, top_k, max_hops)
SSE streaming support: tokens appear in real time during generation
Structured result view: answer, sources with score, metadata (intent, nodes, edges, time)

Graph View (`/graph`)

Query input to explore sections of the knowledge graph
Interactive force-directed visualisation (react-force-graph-2d)
Colour-coded nodes by type, edges with arrows and relation labels
Zoom, pan and drag

Typed API client

src/lib/api-client.ts is the single contract between UI and API:

TypeScript interfaces mirroring the Pydantic models
Typed functions for every endpoint (getHealth, postQuery, postIngest, deleteDocument)
Async generator streamQuery() for SSE streaming
AbortSignal support for request cancellation

UI configuration

Copy .env.local.example to .env.local:

cd knowledge-graph-ui
cp .env.local.example .env.local

Variable	Default	Description
`NEXT_PUBLIC_API_URL`	`http://localhost:8000`	API base URL
`NEXT_PUBLIC_ENABLE_STREAMING`	`true`	Enable/disable SSE streaming
`NEXT_PUBLIC_ENABLE_GRAPH_VIEW`	`true`	Feature flag for graph view

Environment variables

All variables are defined in .env.example at the root. Copy to .env and customise:

cp .env.example .env

Neo4j

Variable	Default	Description
`NEO4J_URI`	`bolt://neo4j:7687`	Connection URI (use `localhost` for local dev)
`NEO4J_USER`	`neo4j`	Username
`NEO4J_PASSWORD`	`yourpassword`	Change this in production
`NEO4J_DATABASE`	`neo4j`	Database name

Redis

Variable	Default	Description
`REDIS_URL`	`redis://redis:6379`	Connection URL (use `localhost` for local dev)
`REDIS_INDEX_NAME`	`kg_vectors`	Vector index name
`REDIS_VECTOR_DIM`	`768`	Vector dimension (depends on embedding model)

Ollama

Variable	Default	Description
`OLLAMA_BASE_URL`	`http://ollama:11434`	Ollama URL (use `localhost` for local dev)
`OLLAMA_LLM_MODEL`	`llama3`	Text generation model
`OLLAMA_EMBEDDING_MODEL`	`nomic-embed-text`	Embedding model (768 dim)

Chunking

Variable	Default	Description
`CHUNK_SIZE`	`1024`	Maximum chunk size (characters)
`CHUNK_OVERLAP`	`128`	Overlap between consecutive chunks (characters)
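To see how the two values interact, here is a plain sliding-window chunker. It is a simplified sketch: `chunk_text` is a hypothetical name, and unlike the real pipeline it does not respect sentence boundaries.

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 128) -> list[str]:
    """Split text into chunks of at most chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # each new chunk starts this far after the previous one
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With the defaults, each chunk starts 896 characters after the previous one, so a 2000-character document yields three chunks.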

Application

Variable	Default	Description
`LOG_LEVEL`	`INFO`	Log level (DEBUG, INFO, WARNING, ERROR)
`DEBUG`	`false`	Debug mode

Note: for local development without Docker, NEO4J_URI, REDIS_URL and OLLAMA_BASE_URL must use localhost instead of Docker container names.
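Switching between the two modes amounts to swapping the host part of each URL. The helper below is a hypothetical convenience, not part of the repository. It keeps scheme, port and path, and drops any credentials embedded in the URL.

```python
from urllib.parse import urlsplit


def to_localhost(url: str) -> str:
    """Replace the host of a service URL with localhost, keeping scheme, port and path."""
    parts = urlsplit(url)
    netloc = "localhost" if parts.port is None else f"localhost:{parts.port}"
    return parts._replace(netloc=netloc).geturl()
```

For example, `to_localhost("bolt://neo4j:7687")` gives the value to use for NEO4J_URI in local development.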

Debugging with VS Code

Open the knowledge-graph/ folder in VS Code. The .vscode/ directory contains ready-to-use configurations.

Recommended extensions

Python (ms-python.python)
Ruff (charliermarsh.ruff)
Prettier (esbenp.prettier-vscode)
JavaScript Debugger (built-in)

Debug configurations (launch.json)

Name	Type	Description
API: FastAPI (debugpy)	Python	Starts uvicorn with Python debugger, hot-reload
UI: Next.js (Server)	Node	Starts `npm run dev` and attaches Chrome debugger
UI: Next.js (Chrome)	Chrome	Attaches to a running Next.js server on :3000
API: Tests (pytest)	Python	Runs pytest with step-through debugger
Agents: API (debugpy)	Python	Starts Agent API with debugger on port 8001
Agents: Orchestrator (debugpy)	Python	Runs the LangGraph orchestrator directly
MCP: Server (debugpy)	Python	Starts the MCP server with debugger
Full Stack: API + UI	Compound	Starts API + UI in parallel with one click
Full Stack: All Services	Compound	Starts API + UI + MCP + Agents

Recommended workflow

Start infrastructure: make up-dev
In VS Code, select "Full Stack: API + UI" in the Run and Debug panel
Press F5 — API (port 8000) and UI (port 3000) start with active debuggers
Set breakpoints in Python (API) or TypeScript (UI) code
Open http://localhost:3000 in the browser

Tasks (tasks.json)

Accessible from Terminal > Run Task...:

Task	Command
Docker: Up Prod	`docker compose --profile prod up --build -d`
Docker: Up Dev (infra + tools)	`docker compose --profile dev up -d`
Docker: Down	`docker compose --profile prod --profile dev down`
API: Dev Server	`uvicorn api.main:app --reload`
UI: Dev Server	`npm run dev`
API: Run Tests	`pytest tests/ -v`
API: Lint	`ruff check .`
Pull Ollama Models	`ollama pull llama3 && ollama pull nomic-embed-text`

Testing and linting

API (Python)

cd knowledge-graph-api

pytest tests/ -v                        # run all tests
pytest tests/test_ingest.py -v -k "test_name"  # specific test
ruff check .                            # lint
ruff check . --fix                      # auto-fix

Tests use mocks for Neo4j, Redis and Ollama — no live services needed.

Agents (Python)

cd knowledge-graph-agents

pytest tests/ -v
ruff check .

UI (TypeScript)

cd knowledge-graph-ui
npm run lint

Makefile targets (from repo root)

make test           # pytest (API)
make lint           # ruff (API)
make agents-test    # pytest (Agents)
make agents-lint    # ruff (Agents)
make mcp-test       # pytest (MCP)

Ingestion pipeline

The ingestion pipeline (POST /ingest) processes a document in 8 stages:

Document
    |
    v
[1] File Routing -------> MIME type detection (PDF / DOCX / TXT)
    |
    v
[2] Content Extraction -> Raw text + page count
    |
    v
[3] Text Chunking ------> 1024-char chunks, 128-char overlap
    |                     (respects sentence boundaries)
    v
[4] Embedding ----------> 768-D vectors via Ollama (nomic-embed-text)
    |
    v
[5] Deduplication ------> SHA-256 hash to skip existing chunks
    |
    v
[6] Entity Extraction --> LLM extracts entities (Person, Technology, ...)
    |                     and relations (USES, PART_OF, ...)
    v
[7] Vector Storage -----> Upsert into Redis (RedisSearch + RedisJSON)
    |
    v
[8] Graph Storage ------> Nodes and edges in Neo4j (MERGE/upsert)
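Stage 5 can be illustrated with the standard library. This sketch hashes the raw chunk text and skips chunks whose hash is already known; whether the real pipeline normalises text before hashing is not stated here, so treat that detail and the helper names as assumptions.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return the SHA-256 hex digest of a chunk's UTF-8 text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def split_new_and_skipped(chunks: list[str], known_hashes: set[str]) -> tuple[list[str], int]:
    """Return the chunks not seen before plus the number of skipped duplicates."""
    seen = set(known_hashes)  # copy so the caller's set is not mutated
    new_chunks: list[str] = []
    skipped = 0
    for chunk in chunks:
        digest = content_hash(chunk)
        if digest in seen:
            skipped += 1
            continue
        seen.add(digest)
        new_chunks.append(chunk)
    return new_chunks, skipped
```

The skipped count corresponds to `chunks_skipped` in the IngestResult, and `known_hashes` would come from the `content_hash` values already stored in Redis.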

Supported entity types

Person, Organization, Product, Technology, Process, Event, Location, Concept, Document, Category, Tag

Supported relation types

BELONGS_TO, RELATES_TO, CREATED_BY, MENTIONS, PART_OF, USES, LOCATED_IN, OCCURRED_AT, HAS_TAG, SIMILAR_TO, DEPENDS_ON, REPLACED_BY

RAG pipeline (query)

The RAG pipeline (POST /query) answers questions in 5 stages:

User question
    |
    v
[1] Intent Classification -> document_query | entity_query
    |                                        | relation_query | general
    v
[2] Vector Search ---------> Top-K documents by cosine similarity
    |                         (Redis KNN)
    v
[3] Graph Enrichment ------> Traversal of neighbours up to max_hops
    |                         (Neo4j Cypher)
    v
[4] Context Assembly ------> System prompt with chunks + nodes + edges
    |
    v
[5] LLM Generation -------> Response (sync JSON or SSE stream)

The search is hybrid: it combines semantic similarity (vector) with structural relations (graph) to produce more complete and context-aware answers.
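Stages 2 and 4 can be sketched in plain Python. In the real system the KNN search runs inside Redis and the prompt layout is defined by the API; here `top_k`, `assemble_context` and the context format are illustrative assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], docs: dict[str, list[float]], k: int = 10) -> list[tuple[str, float]]:
    """Brute-force nearest neighbours: score every document, keep the k best."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def assemble_context(chunks: list[str], edges: list[str]) -> str:
    """Join retrieved chunks and graph edges into one prompt context (layout is illustrative)."""
    lines = ["Chunks:"] + [f"- {c}" for c in chunks] + ["Edges:"] + [f"- {e}" for e in edges]
    return "\n".join(lines)
```

The brute-force scan is only meant to show what the vector index computes. It scales linearly with the number of chunks.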

Data models

VectorDocument (Redis)

Each document chunk is stored in Redis as JSON with a vector index:

Field	Type	Description
`id`	UUID	Unique chunk identifier
`thread_id`	string	Namespace / partition
`text`	string	Chunk text content
`name`	string	Source filename
`vector`	float[768]	Chunk embedding
`content_hash`	string	SHA-256 for deduplication
`base_document_id`	string	Parent document ID
`mime_type`	string	Original file MIME type
`page_number`	integer	Page number (PDFs)

GraphNode (Neo4j)

Each extracted entity is stored as a node in the graph:

Field	Type	Description
`id`	UUID	Unique identifier
`name`	string	Entity name
`label`	string	Display label
`node_type`	string	Type (Person, Technology, ...)
`namespace`	string	Namespace / partition
`importance`	float	Score 0-1
`confidence`	float	Score 0-1
`source_chunk_ids`	string[]	References to source chunks

Relation (Neo4j)

Each extracted relation becomes an edge in the graph:

Field	Type	Description
`id`	UUID	Unique identifier
`source_id`	string	Source node
`target_id`	string	Target node
`relation_type`	string	Type (USES, PART_OF, ...)
`weight`	float	Relation strength 0-1
`confidence`	float	Extraction confidence 0-1
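As a rough picture of the Relation model, the sketch below mirrors the fields in the table with a dataclass and enforces the 0-1 ranges. The repository uses Pydantic; `RelationSketch`, its default values and `as_edge_label` are assumptions for illustration.

```python
from dataclasses import dataclass, field
from uuid import uuid4


def _require_unit_interval(name: str, value: float) -> None:
    """Reject scores outside the documented 0-1 range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")


@dataclass
class RelationSketch:
    source_id: str
    target_id: str
    relation_type: str
    weight: float = 1.0       # assumption: default not stated in this README
    confidence: float = 1.0   # assumption: default not stated in this README
    id: str = field(default_factory=lambda: str(uuid4()))

    def __post_init__(self) -> None:
        _require_unit_interval("weight", self.weight)
        _require_unit_interval("confidence", self.confidence)

    def as_edge_label(self, node_names: dict[str, str]) -> str:
        """Render the edge in the edges_used style, e.g. 'NodeA --USES--> NodeB'."""
        source = node_names[self.source_id]
        target = node_names[self.target_id]
        return f"{source} --{self.relation_type}--> {target}"
```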

Scientific references

This project draws inspiration from the following pipelines and papers:

Paper / Tool	Usage in this project
OpenIE6 (Kolluru et al., 2020)	Patterns for open-domain triple extraction
CoDe-KG (Anuyah et al., 2025)	Modular pipeline: coreference + decomposition + RE
KGGen (Mo et al., 2025)	Entity clustering/dedup to reduce graph sparsity
BLINK (Wu et al., 2019)	Bi-encoder + cross-encoder architecture for entity linking
DocRED (Yao et al., 2019)	Benchmark for document-level relation extraction

Hybrid architecture recommended by the papers: Graph DB (Neo4j/Cypher) for structure + Vector DB (Redis/FAISS) for semantic similarity, with the option of vector indexes directly in Neo4j (CREATE VECTOR INDEX).

Troubleshooting

Ollama does not start or does not respond

docker compose logs ollama

# If the container is up but models are not downloaded
docker compose exec ollama ollama list
make pull-models

The first query after pulling models may be slow (~30s) due to model loading.

Neo4j healthcheck fails

docker compose logs neo4j

# Verify the password matches
echo $NEO4J_PASSWORD   # must match the value in .env

API returns "degraded" on health check

One or more backend services are unreachable. Check which ones return false:

curl http://localhost:8000/health

Verify all containers are running: docker compose ps

UI cannot connect to the API

Verify NEXT_PUBLIC_API_URL is set correctly in .env.local:

Local dev: http://localhost:8000
Docker: http://localhost:8000 (the browser calls the API directly)
Cross-host: configure CORS on the API (currently allow_origins=["*"])

Memory errors

The full stack requires ~6-8 GB RAM. If Docker has lower limits:

docker stats

# Use a smaller Ollama model or disable APOC if not needed

Full data reset

# WARNING: deletes all data!
docker compose down -v
make up-prod
