3 changes: 3 additions & 0 deletions README.md
@@ -333,6 +333,7 @@ OpenKB settings are initialized by `openkb init` and stored in `.openkb/config.y
model: gpt-5.4 # LLM model (any LiteLLM-supported provider)
language: en # Wiki output language
pageindex_threshold: 20 # PDF pages threshold for PageIndex
file_processing_jobs: 2 # Files to prepare concurrently during `openkb add <dir>`
```

Model names use `provider/model` LiteLLM [format](https://docs.litellm.ai/docs/providers) (OpenAI models can omit the prefix):
@@ -347,6 +348,8 @@ Model names use `provider/model` LiteLLM [format](https://docs.litellm.ai/docs/p
<summary><i>Advanced options (entity_types, extra_headers, OAuth):</i></summary>
<br>

`file_processing_jobs` (default `2`): number of files prepared concurrently during `openkb add <dir>`. Only the preparation stage is parallelized (hashing, duplicate prefiltering, raw/source staging, conversion); live-KB mutation stays serialized under the mutation lock, so raising this value helps mainly when conversion is the bottleneck.
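
The split between parallel preparation and serialized mutation can be illustrated with a minimal Python sketch. It is not OpenKB's implementation: `prepare`, `add_files`, and `mutation_lock` are hypothetical names, and hashing stands in for the whole preparation stage.

```python
from __future__ import annotations

import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the KB mutation lock (assumption, not OpenKB's API).
mutation_lock = threading.Lock()


def prepare(name: str, data: bytes) -> tuple[str, str]:
    """Preparation stage: independent per file, so safe to run concurrently."""
    return name, hashlib.sha256(data).hexdigest()


def add_files(files: dict[str, bytes], file_processing_jobs: int = 2) -> list[tuple[str, str]]:
    """Prepare files in a worker pool, then publish them one at a time."""
    registry: list[tuple[str, str]] = []
    with ThreadPoolExecutor(max_workers=file_processing_jobs) as pool:
        # map() yields results in input order, so publishing stays deterministic.
        for name, digest in pool.map(lambda item: prepare(*item), files.items()):
            # Mutation stage: serialized. In this sketch only the main thread
            # publishes, so the lock is shown purely to mark the boundary.
            with mutation_lock:
                registry.append((name, digest))
    return registry
```

In a layout like this, raising `file_processing_jobs` only shortens the `prepare` phase; the time spent inside the lock is unchanged.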

`entity_types` (optional): a YAML list overriding the entity-type vocabulary used for entity pages; omit it to use the default `person`, `organization`, `place`, `product`, `work`, `event`, `other`.

`extra_headers` (optional): a YAML mapping of extra HTTP headers sent with every LLM request (forwarded to LiteLLM's `extra_headers`). Useful for providers that expect custom headers, e.g. GitHub Copilot IDE-auth headers:
4 changes: 4 additions & 0 deletions config.yaml.example
@@ -1,6 +1,10 @@
model: gpt-5.4 # LLM model (any LiteLLM-supported provider)
language: en # Wiki output language
pageindex_threshold: 20 # PDF pages threshold for PageIndex
file_processing_jobs: 2 # Number of files to prepare concurrently during `openkb add <dir>`
# Note: this parallelizes hashing/conversion/staging only. Live KB publish,
# PageIndex indexing, LLM compilation, registry updates, and log writes remain
# serialized under the KB mutation lock.

# Optional: extra HTTP headers sent with every LLM request (forwarded to
# LiteLLM's extra_headers). Some providers need these — e.g. GitHub Copilot