feat(api): pluggable LLM provider (Ollama/Anthropic) + /ingest/json #2
Extraction-time LLM is now decoupled from the legacy Ollama-only path
through a small ``LLMProvider`` ABC under ``pipeline/llm/``. The active
provider is selected by ``KG_LLM_PROVIDER`` (default ``ollama``); set it
to ``anthropic`` to route extraction to Claude Haiku via the official
``anthropic`` SDK. Embeddings stay on Ollama unconditionally — the
vector index is sized for ``nomic-embed-text``.
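A minimal sketch of how such a provider abstraction and env-based factory could look. The `complete` method, the `get_provider` helper, and reading the key straight from the environment are assumptions for illustration; only the `LLMProvider` name, the `KG_LLM_PROVIDER` values, and the `RuntimeError` on a missing key come from this PR.

```python
import os
from abc import ABC, abstractmethod
from typing import Optional


class LLMProvider(ABC):
    """Extraction-time LLM interface (method name is hypothetical)."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the model's raw text response for an extraction prompt."""


class OllamaProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        # Placeholder: a real provider would call the local Ollama HTTP API.
        raise NotImplementedError


class AnthropicProvider(LLMProvider):
    def __init__(self, api_key: Optional[str] = None) -> None:
        # Fail at construction time when no key is configured.
        self.api_key = api_key or os.environ.get("ANTHROPIC_API_KEY")
        if not self.api_key:
            raise RuntimeError("ANTHROPIC_API_KEY must be set for the anthropic provider")

    def complete(self, prompt: str) -> str:
        # Placeholder: a real provider would call the anthropic SDK.
        raise NotImplementedError


_PROVIDERS = {"ollama": OllamaProvider, "anthropic": AnthropicProvider}


def get_provider(name: Optional[str] = None) -> LLMProvider:
    """Pick a provider by name, falling back to KG_LLM_PROVIDER, then 'ollama'."""
    chosen = (name or os.environ.get("KG_LLM_PROVIDER", "ollama")).strip().lower()
    provider_cls = _PROVIDERS.get(chosen)
    if provider_cls is None:
        raise ValueError(f"unknown LLM provider: {chosen!r}")
    return provider_cls()
```

Embeddings are deliberately outside this abstraction: under the design described above they keep going to Ollama whatever the factory returns.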
Also adds ``POST /ingest/json`` for ingesting structured news articles
without writing files on the client side. The endpoint serialises
``{title, body, url, source, published_at, namespace}`` to a temp .txt
and feeds it through the existing pipeline (no router changes needed —
.txt is already supported).
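As a rough illustration of the serialisation step, the sketch below turns the payload fields into plain text. The header layout and the `serialise_article` name are assumptions; the PR only states that the fields are written to a temp .txt and that the tests pin the file contents.

```python
from typing import Optional


def serialise_article(
    title: str,
    body: str,
    url: Optional[str] = None,
    source: Optional[str] = None,
    published_at: Optional[str] = None,
    namespace: Optional[str] = None,
) -> str:
    """Render an article as plain text: metadata header, blank line, body."""
    header = [f"Title: {title}"]
    optional_fields = (
        ("URL", url),
        ("Source", source),
        ("Published", published_at),
        ("Namespace", namespace),
    )
    for label, value in optional_fields:
        # Skip fields the client did not supply.
        if value:
            header.append(f"{label}: {value}")
    return "\n".join(header) + "\n\n" + body.strip() + "\n"
```

Keeping the metadata inside the text means the existing .txt path needs no knowledge of where the document came from.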
Tests: 14 new tests (extractor refactor, both providers, factory, JSON
ingest serialisation contract). Full suite 23/23 green.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 798f5c34a7
    if provider == "anthropic" and not settings.ANTHROPIC_API_KEY:
        logger.warning("health_anthropic_no_key")
Include the Anthropic key check in health status
When `KG_LLM_PROVIDER=anthropic` but `ANTHROPIC_API_KEY` is unset, this branch only logs a warning. `all_ok` still depends solely on Neo4j, Redis, and Ollama, so `/health` can report healthy even though the first ingestion will fail during `EntityExtractor` construction, because `AnthropicProvider.__init__` raises `RuntimeError`. The health check therefore misses a configuration that renders the configured extraction provider unusable.
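One possible shape for the fix, sketched under assumptions: the `DependencyChecks` dataclass and `overall_health` function are hypothetical stand-ins for however the health handler actually aggregates its checks. The idea is simply that the provider's key check feeds into the overall status instead of only being logged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DependencyChecks:
    """Reachability results for the existing backing services."""
    neo4j: bool
    redis: bool
    ollama: bool


def overall_health(
    checks: DependencyChecks,
    provider: str,
    anthropic_api_key: Optional[str],
) -> bool:
    """Healthy only if the services are up and the selected provider is usable."""
    provider_ok = True
    if provider == "anthropic":
        # A missing or empty key makes the configured provider unusable.
        provider_ok = bool(anthropic_api_key)
    return checks.neo4j and checks.redis and checks.ollama and provider_ok
```

With this shape, an `anthropic` deployment without a key reports unhealthy before the first ingestion fails.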
CodeQL flagged the (whitelist-sanitised) safe_stem as "uncontrolled data in path expression". The readable-filename feature was debug ergonomics, not a contract — the URL and source already live inside the file body. Use NamedTemporaryFile's random default name instead. Tests untouched (the contract is the file contents, not the path).
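A small sketch of the random-name approach, assuming a hypothetical `write_temp_article` helper: no request-derived data reaches the path, only the file contents.

```python
import os
import tempfile


def write_temp_article(text: str) -> str:
    """Write text to a randomly named temp .txt file and return its path."""
    # delete=False keeps the file on disk after the handle closes so the
    # pipeline can read it; the caller is responsible for removing it.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", encoding="utf-8", delete=False
    ) as handle:
        handle.write(text)
        return handle.name


if __name__ == "__main__":
    path = write_temp_article("Title: Example\n\nBody text\n")
    try:
        with open(path, encoding="utf-8") as handle:
            print(handle.read())
    finally:
        os.remove(path)
```

Only the `.txt` suffix is fixed, so the existing extension-based routing still applies.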
Summary
- `LLMProvider` ABC under `pipeline/llm/` (base + Ollama + Anthropic + factory) so the extraction-stage LLM is decoupled from Ollama. Selection via env: `KG_LLM_PROVIDER=ollama` (default) or `anthropic`.
- The Anthropic provider uses the official `anthropic` SDK with Claude Haiku (`claude-haiku-4-5`) by default: extraction is high-volume narrow JSON, the right tool for Haiku.
- `POST /ingest/json` for ingesting structured news articles directly (no client-side filesystem). Payload `{title, body, url, source, published_at, namespace}` is serialised to a temp `.txt` and fed through the existing pipeline (no router changes; `.txt` is already supported).
- The vector index is sized for `nomic-embed-text` (768 dims), so embeddings always go to Ollama regardless of `KG_LLM_PROVIDER`.
- The `/health` response reports an `llm_provider` field; the existing `ollama` boolean stays for backward compat.
Motivation
We are integrating this knowledge graph as the news-aware backend for a separate trading-intelligence product. That product runs entirely on Anthropic Claude (Sonnet 4.6 for predictions, Haiku 4.5 for classification) and has no local Ollama infrastructure. Routing extraction to Claude Haiku lets the KG slot in without standing up Ollama, while keeping the Ollama path fully intact for users who prefer local inference.
`/ingest/json` is the matching ingress shape: news pipelines emit structured items, not files.
Test plan
- `pytest tests/` → 23/23 green (14 new + 9 pre-existing untouched).
- `ruff check` clean on every touched file.
- `KG_LLM_PROVIDER=ollama` behaviour unchanged (existing `test_ingest_txt_file` still uses the Ollama HTTP mock and passes).
- `AnthropicProvider` rejects construction with missing API key (`RuntimeError`).
Backward compatibility
- `KG_LLM_PROVIDER=ollama` preserves the current behaviour byte-for-byte.
- `HealthResponse.ollama` stays present and truthful (`/api/tags` reachability).
🤖 Generated with Claude Code