
claude-code-chat-browser: FTS search index, search UX, and distinct /api/search error codes #110

Description

@clean6378-max-it

Calendar Day

Monday, July 7, 2026 (PR 2 of 2)

Planned Effort

8 story points (Medium–High)

Companion PR: Fri PR 1 (session list summary cache) — independent files; merge PR 1 first
(shared cache dir under ~/.claude-code-chat-browser/).

Problem

Three search gaps, the same bundle cppa shipped in cppa-cursor-browser PR #113 (FTS half, out of scope for cppa-cursor-browser #95) plus cppa-cursor-browser #117:


Part A — Search throughput (cppa-cursor-browser PR #113 FTS analog).

GET /api/search is a brute-force substring scan: for every project, every .jsonl session
file, it calls get_cached_session() (full JSONL parse) and walks every message in memory.
Complexity is O(projects × sessions × messages) per query.
claude-code-chat-browser #82 only caches parsed sessions in memory; it does not reduce work
on the first search after a restart. On a power-user install (20+ projects, 300+ sessions),
a single search can take tens of seconds.
tests/benchmarks/test_search_bench.py exercises live-scan only; there is no indexed path.

cppa solved this with a derived search_index.sqlite (SQLite FTS5), mtime-keyed fingerprint
invalidation, background rebuild, and live-scan fallback. Claude has flat .jsonl files under
~/.claude/projects/ — the index must be designed for line-by-line JSONL extraction, not
ported verbatim from cppa's services/search_index.py.
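
As a rough illustration of the line-by-line extraction described above, the following sketch reads one .jsonl file without building a full session object. The function name `iter_message_text` and the record layout (a `message` object with `role` and `content`, where `content` is either a string or a list of `{"type": "text", "text": ...}` blocks) are assumptions for illustration, not the project's confirmed helpers or schema.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_message_text(path: Path) -> Iterator[tuple[str, str]]:
    """Yield (role, text) pairs from one .jsonl file, one line at a time."""
    with path.open("r", encoding="utf-8", errors="replace") as fh:
        for raw in fh:
            raw = raw.strip()
            if not raw:
                continue
            try:
                record = json.loads(raw)
            except json.JSONDecodeError:
                continue  # skip a truncated or corrupt line, keep the rest
            if not isinstance(record, dict):
                continue
            message = record.get("message")  # assumed field name
            if not isinstance(message, dict):
                continue
            role = str(message.get("role", ""))
            content = message.get("content")
            if isinstance(content, str):
                text = content
            elif isinstance(content, list):
                # Keep only text blocks; tool calls and images are ignored.
                text = "\n".join(
                    block["text"]
                    for block in content
                    if isinstance(block, dict)
                    and block.get("type") == "text"
                    and isinstance(block.get("text"), str)
                )
            else:
                text = ""
            if text:
                yield role, text
```

Because the generator holds only one decoded line at a time, memory use stays flat even for very long sessions, and a single bad line does not discard the rest of the file.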


Part B — Search behavior invisible to users (cppa-cursor-browser #117 item 12).

Results are capped at 50 by default (up to 500 via ?limit=N). There is no default time window
(cppa defaults to 30 days + all_history opt-in). Neither constraint is explained in the UI.
Users cannot tell whether results are truncated or how to search full history.


Part C — Weak error handling (cppa-cursor-browser #117 item 13).

If CLAUDE_PROJECTS_DIR is inaccessible, list_projects() returns [] silently — search returns
200 with empty results. Empty q returns [] with 200. No max query-length check. No
503 when infrastructure fails. Frontend has no data-error-code for operator debugging.

Goal

One merged PR that:

  1. Adds a local FTS5 search index with live-scan fallback and 30-day default window +
    all_history opt-in (cppa-cursor-browser PR #113 parity).
  2. Makes search scope, window, and result cap discoverable in the UI
    (cppa-cursor-browser #117 Part A).
  3. Returns distinct HTTP status codes from /api/search
    (cppa-cursor-browser #117 Part B).

Scope

Part A — FTS search index (utils/search_index.py, new)

  • Cache directory: ~/.claude-code-chat-browser/ (same root as PR 1 session_summary_cache.sqlite).
  • Index files: search_index.<uuid>.sqlite + search_index.active pointer (atomic swap — cppa
    pattern; safe on Windows).
  • Fingerprint: projects_dir + sorted manifest of every .jsonl as
    (relative_path, mtime_ns, size_bytes) + hash of serialised exclusion rules.
  • Schema:
    • index_meta
    • sessions(session_id, project_name, title, first_ms, last_ms, file_path)
    • messages_fts (FTS5): session_id, project_name, role, timestamp_ms, text; UNINDEXED where noted; tokenize='unicode61'
  • Build: walk projects via list_sessions(); line-by-line JSON decode (no full
    parse_session()); extract text via utils/jsonl_helpers.py / session_peek.py helpers.
  • Query: FTS prefix match → SearchHitDict with ±80-char snippets; apply since_ms window;
    include undated sessions in windowed search (cppa _INCLUDE_UNKNOWN_TIMESTAMPS_IN_WINDOW
    parity); apply exclusion rules on hit candidates (prefer PR 1 summary cache when warm).
  • Lifecycle: ensure_search_index, start_search_index_background (daemon on app.py
    startup), index_is_usable; bypass via CLAUDE_CODE_CHAT_BROWSER_NO_SEARCH_INDEX=1.
  • threading.Lock for build; readers use ?mode=ro.
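
A minimal sketch of the messages_fts table and the windowed prefix query follows, assuming the metadata columns are the UNINDEXED ones and that only `text` is searchable. The helper names (`build_index`, `to_prefix_query`, `search`) are hypothetical. Note that FTS5's built-in `snippet()` is bounded by tokens rather than characters, so the ±80-char snippet in the plan would likely need its own post-processing.

```python
import sqlite3
from typing import Iterable, Optional


def build_index(conn: sqlite3.Connection, rows: Iterable[tuple]) -> None:
    """Create messages_fts and load (session_id, project, role, ts_ms, text) rows."""
    conn.execute(
        "CREATE VIRTUAL TABLE messages_fts USING fts5("
        "session_id UNINDEXED, project_name UNINDEXED, role UNINDEXED, "
        "timestamp_ms UNINDEXED, text, tokenize='unicode61')"
    )
    conn.executemany("INSERT INTO messages_fts VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def to_prefix_query(q: str) -> str:
    """Quote every term so FTS5 operators in user input are inert, then add *."""
    terms = [t.replace('"', '""') for t in q.split()]
    return " ".join(f'"{t}"*' for t in terms)


def search(
    conn: sqlite3.Connection, q: str, since_ms: Optional[int], limit: int
) -> list[tuple]:
    """Prefix-match q; keep rows inside the window plus undated rows."""
    sql = (
        "SELECT session_id, role, "
        "snippet(messages_fts, 4, '', '', '...', 16) "
        "FROM messages_fts WHERE messages_fts MATCH ? "
        "AND (? IS NULL OR timestamp_ms IS NULL OR timestamp_ms >= ?) "
        "ORDER BY rank LIMIT ?"
    )
    params = (to_prefix_query(q), since_ms, since_ms, limit)
    return conn.execute(sql, params).fetchall()
```

Passing `since_ms=None` corresponds to all_history=1. Rows with a NULL timestamp pass the window filter, which mirrors the "include undated sessions" rule above.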

Part B — Search API (api/search.py)

  • DEFAULT_SEARCH_WINDOW_DAYS = 30, resolve_search_since_ms(all_history, since_days).
  • Query params: all_history=1 / true; since_days=N (validated → 400).
  • Flow: validate q / limit / since_days → try FTS index → fallback to live-scan loop.
  • 400 — empty/whitespace q (SEARCH_EMPTY_QUERY); query > 500 chars
    (SEARCH_QUERY_TOO_LONG); invalid since_days / limit (existing + new).
  • 503 — projects dir inaccessible (SEARCH_PROJECTS_UNAVAILABLE); index locked during
    rebuild (SEARCH_INDEX_UNAVAILABLE on sqlite3.OperationalError).
  • 500 — unexpected errors only; log traceback server-side; no raw exception in JSON.
  • Per-session except Exception: continue stays; add logger.warning for broken files.
  • Happy-path JSON shape unchanged: list[SearchHitDict].
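
The validation and window rules above could be sketched in framework-free Python as follows. `SearchRequestError`, `validate_query`, `parse_all_history`, and the extra `now_ms` parameter on `resolve_search_since_ms` are assumptions added for illustration and testability; only the constant name, the function name, the 500-character limit, and the two error codes come from this issue.

```python
import time
from typing import Optional

DEFAULT_SEARCH_WINDOW_DAYS = 30
MAX_QUERY_CHARS = 500  # queries longer than this map to SEARCH_QUERY_TOO_LONG
MS_PER_DAY = 24 * 60 * 60 * 1000


class SearchRequestError(ValueError):
    """Hypothetical carrier for a 400 response: machine code plus message."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def validate_query(raw: Optional[str]) -> str:
    """Return the trimmed query or raise with the matching error code."""
    q = (raw or "").strip()
    if not q:
        raise SearchRequestError("SEARCH_EMPTY_QUERY", "Search query must not be empty.")
    if len(q) > MAX_QUERY_CHARS:
        raise SearchRequestError(
            "SEARCH_QUERY_TOO_LONG",
            f"Search query exceeds {MAX_QUERY_CHARS} characters.",
        )
    return q


def parse_all_history(raw: Optional[str]) -> bool:
    """Accept all_history=1 or all_history=true (case-insensitive)."""
    return (raw or "").strip().lower() in {"1", "true"}


def resolve_search_since_ms(
    all_history: bool, since_days: Optional[int], now_ms: Optional[int] = None
) -> Optional[int]:
    """Return the window start in epoch ms, or None for full history."""
    if all_history:
        return None
    days = DEFAULT_SEARCH_WINDOW_DAYS if since_days is None else since_days
    now = int(time.time() * 1000) if now_ms is None else now_ms
    return now - days * MS_PER_DAY
```

A route handler would catch `SearchRequestError` and return status 400 with the carried code, leaving the happy-path response as a plain list.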

Part C — Search UI (templates/search.html / search JS)

  • Helper text near search input / result banner:
    • Default: last 30 days of indexed sessions; undated sessions may still appear.
    • “Search all history” checkbox → all_history=1.
    • Results capped at {limit}; ?limit=N (max 500) documented.
  • Truncation warning when results.length === limit.
  • Error paragraph: data-error-code attribute; display error body from JSON on non-200.

Part D — Error codes + tests

  • models/error_codes.py / api/error_codes.py: add SEARCH_EMPTY_QUERY,
    SEARCH_QUERY_TOO_LONG, SEARCH_PROJECTS_UNAVAILABLE, SEARCH_INDEX_UNAVAILABLE.
  • tests/test_search_index.py (new): schema, fingerprint, FTS hit, window filter,
    all_history, NO_SEARCH_INDEX fallback, pointer swap.
  • tests/test_search.py / tests/test_api_routes.py: 400/503 cases; indexed search spy
    (no get_cached_session on warm index); all_history=1 includes old fixture.
  • tests/benchmarks/: indexed-search fixture alongside existing live-scan bench.
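
The pointer swap that tests/test_search_index.py is meant to cover might look like the sketch below. The file naming follows Part A (search_index.<uuid>.sqlite plus search_index.active); the function names and the choice to store only the index file name in the pointer are assumptions.

```python
import os
import tempfile
import uuid
from pathlib import Path
from typing import Optional

POINTER_NAME = "search_index.active"


def new_index_path(cache_dir: Path) -> Path:
    """Pick a fresh, unique file name for the next index build."""
    return cache_dir / f"search_index.{uuid.uuid4().hex}.sqlite"


def publish_index(cache_dir: Path, index_path: Path) -> None:
    """Point search_index.active at a finished index via write-then-rename."""
    fd, tmp_name = tempfile.mkstemp(
        dir=cache_dir, prefix=POINTER_NAME + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(index_path.name)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace overwrites an existing destination on POSIX and Windows.
        os.replace(tmp_name, cache_dir / POINTER_NAME)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def active_index(cache_dir: Path) -> Optional[Path]:
    """Return the published index file, or None if missing or dangling."""
    try:
        name = (cache_dir / POINTER_NAME).read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return None
    candidate = cache_dir / name
    return candidate if name and candidate.is_file() else None
```

Readers never see a half-built index under this scheme: the new database is written under its own name, and only the small pointer file changes when the build completes. Cleanup of superseded index files is left out of the sketch.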

Out of scope

  • Semantic / vector search (keyword FTS with prefix matching only, same as cppa).
  • Incremental per-line index updates (full rebuild on fingerprint change for v1).
  • workspace=<hash> post-filter (cppa-cursor-browser #117 bonus — N/A for claude).
  • Structured errors for all blueprints — search_bp only.
  • Wiring summary cache into live-scan fallback (optional bonus if time permits).

Follow-up (post-merge)

  • Incremental index update on appended JSONL lines.
  • Search benchmark regression gate in CI (cppa benchmark-suite pattern).

Acceptance Criteria

FTS index

  • utils/search_index.py builds FTS5 index under ~/.claude-code-chat-browser/.
  • Index rebuilds when any .jsonl (path, mtime, size) or rules fingerprint changes.
  • GET /api/search uses index when usable; live-scan fallback when disabled/missing.
  • Default 30-day window; all_history=1 searches full corpus; undated sessions in window.
  • Background index build on startup without blocking first HTTP response.
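
The rebuild trigger in the criteria above depends on the fingerprint from Part A. A minimal sketch, assuming exclusion rules can be represented as a list of strings and that SHA-256 over a JSON payload is an acceptable hash (neither is specified in this issue):

```python
import hashlib
import json
from pathlib import Path


def compute_fingerprint(projects_dir: Path, exclusion_rules: list[str]) -> str:
    """Hash the projects dir, every .jsonl (path, mtime_ns, size), and the rules."""
    manifest = []
    for path in sorted(projects_dir.rglob("*.jsonl")):
        st = path.stat()
        manifest.append(
            [path.relative_to(projects_dir).as_posix(), st.st_mtime_ns, st.st_size]
        )
    # Rules are serialised in their given order, then hashed separately.
    rules_hash = hashlib.sha256(
        json.dumps(exclusion_rules).encode("utf-8")
    ).hexdigest()
    payload = json.dumps(
        {
            "projects_dir": str(projects_dir.resolve()),
            "manifest": manifest,
            "rules": rules_hash,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Storing this value in index_meta at build time lets a usability check compare it with a freshly computed one; any mismatch means a file was added, removed, or modified, or the rules changed, and a full rebuild is due.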

UI

  • Search page explains 30-day default, undated sessions, all-history opt-in, result cap.
  • Truncation warning when results hit limit.
  • data-error-code on search error paragraph.

API errors

  • Empty q → 400 SEARCH_EMPTY_QUERY; long q → 400 SEARCH_QUERY_TOO_LONG.
  • Projects dir unavailable → 503 SEARCH_PROJECTS_UNAVAILABLE.
  • Index locked → 503 SEARCH_INDEX_UNAVAILABLE.
  • Per-session failures logged; no stack traces in JSON.
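
One way to keep these status codes distinct is a thin wrapper that maps exception types to (status, body) pairs. In the sketch below, the `{"error", "code"}` body shape, the `run_search` name, and the use of `OSError` to signal an inaccessible projects directory are assumptions; the error codes and the sqlite3.OperationalError mapping come from Part B.

```python
import logging
import sqlite3
from typing import Any, Callable

logger = logging.getLogger("search")


def error_body(code: str, message: str) -> dict[str, str]:
    """Assumed JSON error shape: human message plus machine-readable code."""
    return {"error": message, "code": code}


def run_search(do_search: Callable[[], list]) -> tuple[int, Any]:
    """Run a search callable and translate failures into status codes."""
    try:
        return 200, do_search()
    except sqlite3.OperationalError:
        logger.warning("search index unavailable", exc_info=True)
        return 503, error_body(
            "SEARCH_INDEX_UNAVAILABLE", "Search index is rebuilding; retry shortly."
        )
    except OSError:
        logger.warning("projects directory unavailable", exc_info=True)
        return 503, error_body(
            "SEARCH_PROJECTS_UNAVAILABLE", "Projects directory is not accessible."
        )
    except Exception:
        # Full traceback goes to the server log only, never into the JSON body.
        logger.exception("unexpected search failure")
        return 500, {"error": "Internal server error."}
```

The frontend could then copy the `code` field into the data-error-code attribute so an operator can identify the failure class without reading server logs.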

General

  • tests/test_search_index.py passes; live-scan tests still pass.
  • mypy --strict, full pytest, and ruff pass.
  • PR approved by at least 1 reviewer.

Verification

cd C:\Users\Jasen\CppAliance\claude-code-chat-browser
.\.venv\Scripts\Activate.ps1
pytest tests/test_search_index.py -q
pytest tests/test_search.py tests/test_api_routes.py -q
pytest tests/benchmarks/test_search_bench.py -q
pytest -q
mypy --strict .
ruff check .

Manual:

  1. Open /search — confirm 30-day window + cap helper text and all-history checkbox.
  2. Start app with 50+ sessions; wait for background index build; search — sub-second on warm index.
  3. Restart server; repeat search — still fast (disk index, fingerprint match).
  4. curl ".../api/search?q=" → 400 SEARCH_EMPTY_QUERY.
  5. GET /api/search?q=test (no all_history) — old session outside 30 days absent.
  6. GET /api/search?q=test&all_history=1 — old session present.
  7. CLAUDE_CODE_CHAT_BROWSER_NO_SEARCH_INDEX=1 — live-scan still works.

References

  • cppa-cursor-browser #117 / PR #126: search UX + error codes (Parts B and C of this PR).
  • cppa-cursor-browser PR #113 / cppa-cursor-browser #95: FTS index + 30-day window (Part A of this PR).
  • cppa implementation: services/search_index.py, services/search.py,
    tests/test_search_index.py
  • cppa review: team-brain/2026-06/2026-06-23/brad/cppa-cursor-browser PR113 review 2026-06-23.md
  • cppa analog doc (merged #117 scope):
    Doc/Issues/chen-july-week1-monday-search-ux-and-api-errors-github-issue.md
  • Companion PR: chen-july-week1-friday-session-list-summary-cache-github-issue.md PR 1
  • Files: api/search.py, app.py, utils/search_index.py (new), utils/jsonl_helpers.py,
    models/error_codes.py, templates/search.html, tests/test_search_index.py
