NLP2CMD

AI Cost Tracking

🤖 LLM usage: $39.3567 (297 commits)
👤 Human dev: ~$10784 (107.8h @ $100/h, 30min dedup)

Generated on 2026-06-08 using openrouter/deep/deep-v4-pro

Natural Language → Domain-Specific Commands — production-ready framework for transforming natural language (Polish + English) into shell, SQL, Docker, Kubernetes, browser automation, and desktop GUI commands.

Quick Start

pip install nlp2cmd[all]
nlp2cmd cache auto-setup          # Setup Playwright browsers

# Single commands
nlp2cmd "uruchom usługę nginx"    # → systemctl start nginx
nlp2cmd "znajdź pliki większe niż 100MB"  # → find . -type f -size +100MB

# Multi-step browser automation
nlp2cmd -r "otwórz openrouter.ai, wyciągnij klucz API i zapisz do .env"

# Canvas drawing with video recording
nlp2cmd -r "wejdź na jspaint.app i narysuj biedronkę" --video webm
nlp2cmd -r "wejdź na jspaint.app i narysuj pająka" --video webm
nlp2cmd -r "wejdź na jspaint.app i narysuj psa" --video webm
nlp2cmd -r "wejdź na jspaint.app i narysuj kota" --video webm
nlp2cmd -r "wejdź na jspaint.app i narysuj dziecko" --video webm


# Web form filling
nlp2cmd -r "otwórz https://example.com/kontakt i wypełnij formularz i wyślij"

Integracja z nlp2dsl / Propact

nlp2cmd to runner — planowanie i wykonanie. Struktura zapytania (IntentIR) pochodzi z pakietów nlp2dsl; na wejściu każdego zapytania uruchamiana jest analiza nlp2dsl.

Narzędzie	Rola
`nlp2dsl show`	Tylko struktura (IntentIR) — bez wykonania
`nlp2cmd plan`	Plan → Propact markdown; `--execute` → hybrid executor
`nlp2cmd -q … -r`	Legacy runtime (shell / browser / canvas)

Routing wykonania (`--execute`)

`target_kind`	Executor
`shell`, `rest`, `mcp`, `ws`	Propact (shell: fallback subprocess gdy brak CLI)
`browser`, `sql`, `desktop`	nlp2cmd `PipelineRunner`

--explain / --json pokazują execution_routes per krok.

Instalacja (dev, monorepo)

# Zalecane — jedna komenda (nlp2dsl packages + nlp2cmd[integration])
cd nlp2cmd && make update

# lub ręcznie:
cd ../nlp2dsl && ./scripts/setup-dev.sh

export NLP2CMD_INTEGRATION=1
export LANG=C.UTF-8

Uwaga: nlp2cmd wymaga pakietów z nlp2dsl/packages/ (editable). Uruchamiaj CLI z tego samego venv co make update (which nlp2cmd → .venv/bin/nlp2cmd). Stary ~/.local/bin/nlp2cmd nie widzi integracji.

make update          # przeinstaluj zależności integracji (--upgrade)
make publish         # update + bump patch + PyPI (nie uruchamiaj bez potrzeby)

Przykłady

# Struktura zapytania (nlp2dsl — bez wykonania)
nlp2dsl show "znajdz pliki *.py w src"
nlp2dsl show "znajdz pliki *.py w src" --plan

# Plan + Propact (nlp2cmd)
nlp2cmd plan "znajdz pliki *.py w src"
nlp2cmd plan "znajdz pliki *.py w src" --explain
nlp2cmd plan "znajdz pliki *.py w src" --json
nlp2cmd plan "znajdz pliki *.py w src" --execute

# Workflow REST (backend nlp2dsl)
export NLP2CMD_NLP2DSL_WORKFLOW=1 NLP2DSL_BACKEND_URL=http://127.0.0.1:8010
nlp2cmd plan "Wyslij fakture na 1500 PLN do a@b.pl" --json

# Analiza wejścia przy zwykłym generate/run
NLP2CMD_SHOW_STRUCTURE=1 nlp2cmd -q "znajdz pliki *.py w src"
nlp2cmd -q "znajdz pliki *.py w src" --explain

Zmienne środowiskowe

Zmienna	Domyślnie	Opis
`NLP2CMD_INTEGRATION`	`0`	Włącza `nlp2cmd plan` i bridge do pakietów nlp2dsl
`NLP2CMD_QUERY_INPUT`	`1`	Analiza IntentIR na wejściu (`-q`, `-r`, `plan`)
`NLP2CMD_SHOW_STRUCTURE`	`0`	Zawsze pokaż analizę tekstu
`NLP2CMD_NLP2DSL_WORKFLOW`	`0`	Planner REST via `/workflow/from-text`
`NLP2DSL_BACKEND_URL`	`http://localhost:8010`	Backend nlp2dsl
`NLP2CMD_PROPACT_BIN`	`propact`	Ścieżka do Propact CLI
`NLP2CMD_PROPACT_FALLBACK`	`shell`	Subprocess gdy brak propact (tylko shell-only)
`NLP2CMD_INTRACT_GATE`	`0`	Intract: `plan`, `show --plan`, transform (`-q`), `PipelineRunner`, kroki DOM
`NLP2CMD_ENFORCE_CLARIFICATION`	`0`	`nlp2dsl show` blokuje niską pewność (`plan` — zawsze)
`NLP2CMD_POST_CHECK`	`0`	Po `plan --execute`: walidacja stdout vs oczekiwanie
`NLP2CMD_POST_CHECK_STRICT`	`0`	Post-check: ostrzeżenia → błędy
`NLP2CMD_EMIT_TESTQL`	`0`	Eksport scenariusza TestQL z execution record (browser)
`LANG`	—	`C.UTF-8` dla polskich znaków

Pakiet KeywordIntentDetector jest kanonicznie w nlp2cmd-intent; nlp2cmd re-eksportuje go z nlp2cmd.generation.keywords.

nlp2dsl-run jest deprecated — użyj nlp2cmd plan.

Intract — kontrakty przed wykonaniem

Opcjonalna warstwa policy (Intract): zakazane efekty, wymagane inputy, zgodność z intract.yaml. Nie weryfikuje stdout.

flowchart TB
    subgraph integration [Ścieżka plan / show]
        I[IntentIR] --> P[ExecutionPlanIR]
        P --> PG[PlanStepGate]
    end

    subgraph legacy [Ścieżka -q -r]
        T[NLP2CMD.transform] --> TV[TransformValidator]
        TV --> PR[PipelineRunner]
        PR --> PG2[PipelineRunnerGate]
        PR --> SG[IntractStepGate — DOM]
    end

    PG --> RB[RuntimeBridge]
    TV --> RB
    PG2 --> RB
    SG --> RB
    RB --> M[intract.yaml + bindings.json]

export NLP2CMD_INTEGRATION=1
export NLP2CMD_INTRACT_GATE=1

nlp2cmd plan "znajdź pliki *.py w src" --json    # pole contract_check
nlp2dsl show "znajdź pliki *.py w src" --plan    # contract_check w output
nlp2cmd -q "znajdź pliki *.py" -r                # transform + pre-execute gate

Szczegóły: docs/architecture/intract-integration.md

Post-execution — stdout vs oczekiwanie

Osobna warstwa od Intract: sprawdza wynik wykonania (linie stdout, regex, returncode).

flowchart LR
    EX[plan --execute] --> OUT[stdout/stderr]
    OUT --> PC[PostExecutionChecker]
    PC -->|NLP2CMD_POST_CHECK=1| OK[passed / violations]

export NLP2CMD_POST_CHECK=1
nlp2cmd plan "znajdź pliki *.py w src" --execute --explain

Szczegóły: docs/architecture/post-execution-validation.md

Key Features

Multi-Step Browser Automation (v1.1.0)

Complex queries are automatically detected and decomposed into executable action plans:

nlp2cmd -r "otwórz tab w firefox wyciągnij klucz API z OpenRouter i zapisz do .env"
# → domain=multi_step, 7 steps: navigate → check_session → click → wait → extract_key → prompt_secret → save_env

The pipeline routes queries through:

Cache — instant lookup of previously seen multi-step patterns
ComplexQueryDetector — regex-based multi-step detection (~0.1ms)
ActionPlanner — rule-based decomposition for known services (10 API providers)
KeywordIntentDetector — single-command keyword matching (1615 templates)

16 Domains (1615+ Templates)

Domain	Examples
Shell	find, ls, grep, ps, du, df, tar, chmod, systemctl
Docker	container, image, compose, volume, network
SQL	SELECT, INSERT, CREATE, JOIN, window functions
Kubernetes	kubectl, pods, deployments, services, helm
Browser	Playwright automation, form filling, site exploration
Git	commit, branch, merge, rebase, stash, tag
DevOps	systemctl, Ansible, Terraform, CI/CD, nginx, cron
API	curl, httpie, wget, REST, GraphQL
FFmpeg	video/audio conversion, streaming, recording
Package Mgmt	apt, pip, npm, yarn, snap, flatpak, brew
Desktop	Firefox tabs, Thunderbird, xdotool/ydotool, window mgmt
Canvas	jspaint.app drawing (shapes, ladybug, text)
Remote	SSH, SCP, rsync, tmux, VPN
Data	jq, csvkit, awk, sed, sqlite, pandas
IoT	Raspberry Pi GPIO, MQTT, sensors
RAG	ChromaDB, Qdrant, embeddings, LangChain

Evolutionary Cache — Self-Learning

100% accuracy with Qwen-Coder-3B, 24,355× speedup. Template pipeline + pre-warm cache eliminate 31% of queries without LLM.

from nlp2cmd.generation.evolutionary_cache import EvolutionaryCache
cache = EvolutionaryCache()
r = cache.lookup("znajdź pliki PDF większe niż 10MB")  # template: ~15ms, cached: ~0.015ms
r = cache.lookup("znajdz pliki PDF wieksze niz 10MB")  # typo → similarity hit!

LLM Validation & Self-Repair

Every command result is automatically validated by a local LLM (qwen2.5:3b via Ollama). If the validator marks a result as fail, a cloud LLM (OpenRouter) suggests an improved command and optionally patches patterns.json/templates.json so the pipeline learns from the mistake.

Execute command → stdout/stderr
       ↓
LLM_VALIDATOR (local, ~0.5s)
  query + command + output → pass/fail + score + reason
       ↓ fail?
LLM_REPAIR (OpenRouter cloud)
  full context → improved_command + JSON patches
       ↓
Retry + update data files

# Example output after command execution:
llm_validator: pass
score: 0.90
reason: Found 1 camera with RTSP on the local network.
model: qwen2.5:3b

Configuration in .env:

LLM_VALIDATOR_ENABLED=true          # enabled by default
LLM_VALIDATOR_MODEL=qwen2.5:3b     # local Ollama model
LLM_REPAIR_ENABLED=true            # cloud repair on fail
LLM_REPAIR_MODEL=qwen/qwen-2.5-coder-32b-instruct

Test suite: python3 examples/08_llm_validation/test_validator.py — 15 test cases, 100% accuracy with qwen2.5:3b.

LLM Router — Multi-Model Routing

Smart routing across multiple LLM providers with automatic fallbacks: paid remote → free remote → local Ollama.

User prompt → classify_task() → LiteLLM Router
                                  ├── Remote (paid)   — Gemini 2.5 Pro, Qwen2.5-Coder-32B
                                  ├── Remote (free)   — Qwen2.5-VL-7B, Arcee Trinity
                                  └── Local (Ollama)  — qwen2.5:7b, qwen2.5-coder:7b, bielik

8 task specializations: vision, coding, text, polish, repair, validation, fast, planning — each with dedicated model chains. Even with zero API credits, all tasks work via local models.

from nlp2cmd.llm.router import get_router

router = get_router()
resp = await router.auto_completion("napisz zapytanie SQL dla tabeli users")
# → task=coding, model=qwen2.5-coder, content="SELECT * FROM users;"

Full configuration guide: Config README

Declarative Feedback Loop (Browser Automation)

Complex multi-step browser automation uses a schema-driven feedback loop — each step is validated, failures are classified, and repairs are escalated until a solution is found.

Step Schema (declarative)
    ↓
Execute step → Validate result (pre/post)
    ↓ failed?
Classify failure:
  schema_error      → wrong selector/URL     → Page analysis (DOM scan)
  page_state_error  → redirect/CAPTCHA/modal  → Navigate to correct page
  data_error        → not logged in / no key   → Login flow / create key
  handling_error    → browser crash / timeout   → Retry / escalate
    ↓
Repair escalation:
  1. Rule-based         ~0ms    (selector alternatives)
  2. Page analysis      ~50ms   (DOM scan for elements)
  3. Local LLM          ~500ms  (qwen2.5:3b diagnosis)
  4. Cloud LLM          ~2s     (OpenRouter 32B repair)
    ↓
Retry with repaired params (max 5 attempts)

Provider-agnostic: works across HuggingFace, OpenRouter, Anthropic, GitHub, Groq, and any SaaS site without hardcoded URLs — PageAnalyzer dynamically finds API key sections by scanning navigation links.

Test suite: python3 examples/08_llm_validation/test_feedback_loop.py — 19 test cases (failure classification + page analysis + multi-provider), 100% accuracy.

Polish Language Support

Native Polish NLP with 87%+ accuracy: lemmatization, fuzzy matching, diacritic normalization.

nlp2cmd "zainstaluj vlc"                    # → sudo apt-get install vlc
nlp2cmd "pokaż procesy zużywające pamięć"   # → ps aux --sort=-%mem | head -10
nlp2cmd "znajdź pliki .log starsze niż 2 dni"  # → find . -type f -name "*.log" -mtime +2

Desktop GUI Automation

Control desktop applications through natural language — works on both X11 and Wayland:

nlp2cmd -r "otwórz tab w firefox, wyciągnij klucz API z OpenRouter i zapisz do .env"
nlp2cmd -r "wejdź na jspaint.app i narysuj biedronkę" --video webm
nlp2cmd -r "otwórz Thunderbird i sprawdź pocztę"
nlp2cmd -r "zminimalizuj wszystko"

Supported API services for key extraction: OpenRouter, Anthropic, OpenAI, Groq, Mistral, DeepSeek, Together, GitHub, HuggingFace, Replicate.

Video Recording

Record entire browser automation sessions with video output:

# Full session recording (WebM format)
nlp2cmd -r "otwórz tab w firefox wyciągnij klucz API z OpenRouter i zapisz do .env" --video webm --execute-web

# Short video clips (3-10 seconds)
nlp2cmd -r "wejdź na jspaint.app i narysuj koło" --video webm --duration 5

Demo Recording: nlp2cmd-operouter-token.mp4 - Firefox automation with API key extraction

Web Form Filling & Site Exploration

nlp2cmd -r "otwórz https://example.com/kontakt i wypełnij formularz i wyślij"
nlp2cmd -r "wejdź na strona-docelowa.pl i wypełnij formularz kontaktu"
# → Automatically explores site, finds contact form, fills with .env data

Deep Company Extraction

nlp2cmd -r "wejdź na https://katalog-stron.pl/firmy, znajdź strony www firm i zapisz do firmy.csv"
# → Navigates profiles, extracts external websites, saves to CSV

Architecture

Query → [Cache] → [ComplexDetector] → [ActionPlanner] → [KeywordDetector] → [TemplateGenerator]
                                           ↓                    ↓
                                    ActionPlan (multi-step)   SingleCommand
                                           ↓                    ↓
                                    PipelineRunner ←────────────┘
                                    ├── ShellExecutionMixin
                                    ├── BrowserExecutionMixin
                                    ├── DesktopExecutionMixin
                                    └── PlanExecutionMixin

PipelineRunner is split into 4 mixins (~4400 LOC total) for maintainability:

Module	Lines	Responsibility
`pipeline_runner.py`	170	Core class, mixin composition
`pipeline_runner_shell.py`	217	Shell execution, safety policies
`pipeline_runner_browser.py`	1561	DOM/DQL, multi-action browser automation
`pipeline_runner_desktop.py`	389	xdotool/ydotool/wmctrl desktop control
`pipeline_runner_plans.py`	2277	Multi-step ActionPlan execution

Installation

# Full installation
pip install nlp2cmd[all]

# Specific components
pip install nlp2cmd[browser,nlp]    # Web automation + Polish NLP
pip install nlp2cmd[sql,shell]      # Database + system commands

# Desktop automation tools
nlp2cmd-install-desktop
# or: make install-desktop

# Setup Playwright browsers
nlp2cmd cache auto-setup

CLI Usage

# Generate command
nlp2cmd "pokaż pliki użytkownika"           # → find $HOME -type f
nlp2cmd --dsl sql "pokaż użytkowników"      # → SELECT * FROM users

# Execute immediately (run mode)
nlp2cmd -r "list files" --auto-confirm

# Multi-step with video
nlp2cmd -r "wejdź na jspaint.app i narysuj biedronkę" --video webm

# Full session recording
nlp2cmd -r "otwórz tab w firefox wyciągnij klucz API z OpenRouter i zapisz do .env" --video webm --execute-web

# Interactive mode
nlp2cmd --interactive

# Web automation
nlp2cmd web-schema extract https://example.com
nlp2cmd web-schema history --stats

# Cache management
nlp2cmd cache info
nlp2cmd cache auto-setup
nlp2cmd cache full-clear --yes

# Service mode (HTTP API)
nlp2cmd service --host 0.0.0.0 --port 8000

# Diagnostics
nlp2cmd doctor --fix
nlp2cmd --show-decision-tree "zapytanie"
nlp2cmd --show-schema

# nlp2dsl / Propact integration (requires nlp2cmd[integration])
nlp2cmd plan "znajdź pliki *.py w src"
nlp2cmd plan "znajdź pliki *.py w src" --execute
nlp2dsl show "znajdź pliki *.py w src"   # structure only

Python API

from nlp2cmd.generation.pipeline import RuleBasedPipeline

pipeline = RuleBasedPipeline()

# Single command
result = pipeline.process("pokaż pliki w katalogu")
print(result.command)  # ls -la .

# Multi-step (auto-detected)
result = pipeline.process("otwórz openrouter.ai, wyciągnij klucz API i zapisz do .env")
print(result.domain)       # multi_step
print(result.action_plan)  # ActionPlan with 7 steps

# Execute action plan
from nlp2cmd.pipeline_runner import PipelineRunner

runner = PipelineRunner()
result = runner.execute_action_plan(plan, dry_run=False)

# Integration bridge (nlp2dsl packages)
from nlp2cmd.bridge import plan_query_via_integration, attach_query_input

attach_query_input("znajdź pliki *.py w src", verbose=True)
payload = plan_query_via_integration("znajdź pliki *.py w src")
print(payload["intent_ir"])
print(payload["propact_markdown"])

Testing

pytest tests/ -v                    # All 1543+ tests
pytest tests/unit/ -v               # Unit tests
pytest tests/e2e/ -v                # End-to-end tests
pytest --cov=nlp2cmd --cov-report=html  # With coverage

Documentation

Document	Description
Project Structure	Architecture and module organization
Installation Guide	Setup instructions
User Guide	Complete usage tutorial
CLI Reference	CLI documentation
Python API	Python API guide
Examples Guide	Examples overview
Keyword Detection	Intent detection pipeline
Web Schema Guide	Browser automation
Cache Management	Caching system
Desktop Automation	Desktop GUI control
LLM Router Config	Multi-model routing, fallbacks, specialization
Canvas Drawing	jspaint.app drawing
Firefox Sessions	Session injection
Evolutionary Cache	Self-learning cache
Stream Protocols	SSH, RTSP, VNC, etc.
Service Mode	HTTP API service
nlp2dsl packages	IntentIR, planner, Propact bridge
Contributing	Development guidelines

License

Licensed under Apache-2.0.

Author

Tom Sapletta

Name		Name	Last commit message	Last commit date
Latest commit History 405 Commits
.devin/workflows		.devin/workflows
.github/workflows		.github/workflows
.pyqual		.pyqual
TODO		TODO
artifacts		artifacts
benchmark_output		benchmark_output
benchmarks		benchmarks
code2llm_output		code2llm_output
command_schemas		command_schemas
config		config
data		data
docker		docker
docs		docs
examples		examples
project		project
screenshots		screenshots
scripts		scripts
src		src
test_screenshots		test_screenshots
tests		tests
tools		tools
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
.markdownlint.json		.markdownlint.json
.markdownlintignore		.markdownlintignore
.nojekyll		.nojekyll
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
Dockerfile.test		Dockerfile.test
INSTALLATION.md		INSTALLATION.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
REPO_SPLIT.md		REPO_SPLIT.md
SUMD.md		SUMD.md
SUMR.md		SUMR.md
THERMODYNAMIC_ARCHITECTURE.md		THERMODYNAMIC_ARCHITECTURE.md
THERMODYNAMIC_INTEGRATION.md		THERMODYNAMIC_INTEGRATION.md
TICKET		TICKET
TODO.md		TODO.md
VERSION		VERSION
app.doql.less		app.doql.less
code2llm_workaround.py		code2llm_workaround.py
config.yaml		config.yaml
docker-compose.yml		docker-compose.yml
examples.sh		examples.sh
generate_chunks.py		generate_chunks.py
generate_quick.py		generate_quick.py
generate_working.py		generate_working.py
generated_appspec.json		generated_appspec.json
goal.yaml		goal.yaml
img.png		img.png
img_1.png		img_1.png
img_10.png		img_10.png
img_2.png		img_2.png
img_3.png		img_3.png
img_4.png		img_4.png
img_5.png		img_5.png
img_6.png		img_6.png
img_7.png		img_7.png
img_8.png		img_8.png
img_9.png		img_9.png
index.html		index.html
install_vnc.sh		install_vnc.sh
intract.yaml		intract.yaml
jspaint_app_test4.py		jspaint_app_test4.py
ladybug2.webm		ladybug2.webm
manual_appspec.json		manual_appspec.json
nlp2cmd-operouter-token.mp4		nlp2cmd-operouter-token.mp4
oferteo_firmy.txt		oferteo_firmy.txt
oferteo_pl_data.txt		oferteo_pl_data.txt
out_call_graph.json		out_call_graph.json
out_function_entries.json		out_function_entries.json
out_interprocedural_decision_paths.json		out_interprocedural_decision_paths.json
planfile.yaml		planfile.yaml
prefact.yaml		prefact.yaml
project.sh		project.sh
projektor.yaml		projektor.yaml
pyproject.toml		pyproject.toml
pyqual.yaml		pyqual.yaml
pytest.ini		pytest.ini
rabbit.webm		rabbit.webm
requirements-enhanced.txt		requirements-enhanced.txt
requirements-minimal.txt		requirements-minimal.txt
requirements-thermodynamic.txt		requirements-thermodynamic.txt
requirements.llm.txt		requirements.llm.txt
requirements.txt		requirements.txt
run_all_tests.sh		run_all_tests.sh
run_test.sh		run_test.sh
test_action_planner_logic.py		test_action_planner_logic.py
test_colors5.py		test_colors5.py
test_firmy.csv		test_firmy.csv
test_nlp2cmd_commands.sh		test_nlp2cmd_commands.sh
test_nlp2cmd_enhanced.sh		test_nlp2cmd_enhanced.sh
tree.sh		tree.sh
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

NLP2CMD

AI Cost Tracking

Quick Start

Integracja z nlp2dsl / Propact

Routing wykonania (--execute)

Instalacja (dev, monorepo)

Przykłady

Zmienne środowiskowe

Intract — kontrakty przed wykonaniem

Post-execution — stdout vs oczekiwanie

Key Features

Multi-Step Browser Automation (v1.1.0)

16 Domains (1615+ Templates)

Evolutionary Cache — Self-Learning

LLM Validation & Self-Repair

LLM Router — Multi-Model Routing

Declarative Feedback Loop (Browser Automation)

Polish Language Support

Desktop GUI Automation

Video Recording

Web Form Filling & Site Exploration

Deep Company Extraction

Architecture

Installation

CLI Usage

Python API

Testing

Documentation

License

Author

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Routing wykonania (`--execute`)

Packages