🎯
Focusing
Pinned
-
local-engine-router (Public)
Single-port OpenAI/Ollama-compatible proxy that auto-swaps a memory-constrained GPU between local inference engines by requested model.
Python 2
-
llmtop (Public)
An nvtop for local LLM inference: zero-config autodiscovery of vLLM, llama.cpp, Ollama, TGI, SGLang + live GPU and serving metrics in a Textual TUI. Unified-memory (GB10/Jetson) aware. MIT.
Python


