claude-code-chat-browser: FTS search index, search UX, and distinct /api/search error codes #110

## Calendar Day

Monday, July 7, 2026 (**PR 2 of 2**)

## Planned Effort

**8 story points** (Medium–High)

**Companion PR:** Friday's PR 1 (session list summary cache). The two PRs touch independent files;
merge PR 1 first, since both share the cache dir under `~/.claude-code-chat-browser/`.

## Problem

This PR closes three search gaps, the same bundle cppa shipped in
[cppa-cursor-browser PR #113](https://github.com/cppalliance/cppa-cursor-browser/pull/113) (the FTS
half, which was out of scope for
[cppa-cursor-browser #95](https://github.com/cppalliance/cppa-cursor-browser/issues/95)) and
[cppa-cursor-browser #117](https://github.com/cppalliance/cppa-cursor-browser/issues/117).


---

**Part A — Search throughput** ([cppa-cursor-browser PR #113](https://github.com/cppalliance/cppa-cursor-browser/pull/113) FTS analog).

`GET /api/search` is a brute-force substring scan: for every project, every `.jsonl` session
file, it calls `get_cached_session()` (full JSONL parse) and walks every message in memory.
Complexity is **O(projects × sessions × messages)** per query.
[claude-code-chat-browser #82](https://github.com/cppalliance/claude-code-chat-browser/issues/82)
only caches parsed sessions in memory — it does not reduce work on the first search after restart. On a power-user
install (20+ projects, 300+ sessions), a single search can take tens of seconds.
`tests/benchmarks/test_search_bench.py` exercises live-scan only; there is no indexed path.

cppa solved this with a derived `search_index.sqlite` (SQLite **FTS5**), mtime-keyed fingerprint
invalidation, background rebuild, and live-scan fallback. Claude has flat `.jsonl` files under
`~/.claude/projects/` — the index must be designed for line-by-line JSONL extraction, not
ported verbatim from cppa's `services/search_index.py`.
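
The line-by-line extraction mentioned above could look roughly like the sketch below. It is an illustration, not the project's code. It assumes each JSONL record carries a `message` object with `role` and `content`, where `content` is either a string or a list of `{"type": "text", "text": ...}` blocks; the real record shape may differ. The function name `iter_message_text` is hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator, Tuple


def iter_message_text(jsonl_path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (role, text) pairs from one session file, one line at a time."""
    with jsonl_path.open("r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a corrupt or truncated line, keep the rest
            if not isinstance(record, dict):
                continue
            message = record.get("message")  # assumed field name
            if not isinstance(message, dict):
                continue
            role = str(message.get("role", ""))
            content = message.get("content")
            if isinstance(content, str):
                text = content
            elif isinstance(content, list):
                # Assumed block shape: [{"type": "text", "text": "..."}, ...]
                text = "\n".join(
                    block["text"]
                    for block in content
                    if isinstance(block, dict) and isinstance(block.get("text"), str)
                )
            else:
                text = ""
            if text:
                yield role, text
```

Because the file is streamed, memory use stays proportional to the longest line rather than the whole session, and one bad line does not discard the rest of the file.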

---

**Part B — Search behavior invisible to users** ([cppa-cursor-browser #117](https://github.com/cppalliance/cppa-cursor-browser/issues/117) item 12).

Results are capped at 50 by default (up to 500 via `?limit=N`). There is no default time window
(cppa defaults to 30 days + `all_history` opt-in). Neither the cap nor the search scope is explained
in the UI, so users cannot tell whether results are truncated or how to search full history.

---

**Part C — Weak error handling** ([cppa-cursor-browser #117](https://github.com/cppalliance/cppa-cursor-browser/issues/117) item 13).

If `CLAUDE_PROJECTS_DIR` is inaccessible, `list_projects()` returns `[]` silently — search returns
200 with empty results. Empty `q` returns `[]` with 200. No max query-length check. No
`503` when infrastructure fails. Frontend has no `data-error-code` for operator debugging.

## Goal

One merged PR that:
1. Adds a local FTS5 search index with live-scan fallback and 30-day default window +
   `all_history` opt-in ([cppa-cursor-browser PR #113](https://github.com/cppalliance/cppa-cursor-browser/pull/113) parity).
2. Makes search scope, window, and result cap discoverable in the UI
   ([cppa-cursor-browser #117](https://github.com/cppalliance/cppa-cursor-browser/issues/117) Part A).
3. Returns distinct HTTP status codes from `/api/search`
   ([cppa-cursor-browser #117](https://github.com/cppalliance/cppa-cursor-browser/issues/117) Part B).

## Scope

### Part A — FTS search index (`utils/search_index.py`, new)

- Cache directory: `~/.claude-code-chat-browser/` (same root as PR 1 `session_summary_cache.sqlite`).
- Index files: `search_index.<uuid>.sqlite` + `search_index.active` pointer (atomic swap — cppa
  pattern; safe on Windows).
- **Fingerprint:** `projects_dir` + sorted manifest of every `.jsonl` as
  `(relative_path, mtime_ns, size_bytes)` + hash of serialised exclusion rules.
- **Schema:** `index_meta`; `sessions(session_id, project_name, title, first_ms, last_ms,
  file_path)`; `messages_fts` FTS5 `(session_id, project_name, role, timestamp_ms, text,
  tokenize='unicode61')`, with `text` indexed and metadata columns marked `UNINDEXED` where noted.
- **Build:** walk projects via `list_sessions()`; line-by-line JSON decode (no full
  `parse_session()`); extract text via `utils/jsonl_helpers.py` / `session_peek.py` helpers.
- **Query:** FTS prefix match → `SearchHitDict` with ±80-char snippets; apply `since_ms` window;
  include undated sessions in windowed search (cppa `_INCLUDE_UNKNOWN_TIMESTAMPS_IN_WINDOW`
  parity); apply exclusion rules on hit candidates (prefer PR 1 summary cache when warm).
- **Lifecycle:** `ensure_search_index`, `start_search_index_background` (daemon on `app.py`
  startup), `index_is_usable`; bypass via `CLAUDE_CODE_CHAT_BROWSER_NO_SEARCH_INDEX=1`.
- `threading.Lock` for build; readers use `?mode=ro`.
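
As a rough illustration of the schema and query bullets above, the sketch below builds an in-memory `messages_fts` table and runs a windowed prefix query. It assumes the local `sqlite3` build includes FTS5 and that all metadata columns are `UNINDEXED`. The helper names (`build_demo_index`, `to_prefix_query`, `search`) are hypothetical. SQLite's `snippet()` counts tokens rather than characters, so the ±80-char snippet described above would need separate handling.

```python
import sqlite3
from typing import List, Optional, Sequence, Tuple


def build_demo_index(conn: sqlite3.Connection, rows: Sequence[Tuple]) -> None:
    """Create the FTS5 table and load (session_id, project, role, ts_ms, text) rows."""
    conn.execute(
        """
        CREATE VIRTUAL TABLE messages_fts USING fts5(
            session_id UNINDEXED,
            project_name UNINDEXED,
            role UNINDEXED,
            timestamp_ms UNINDEXED,
            text,
            tokenize='unicode61'
        )
        """
    )
    conn.executemany("INSERT INTO messages_fts VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def to_prefix_query(user_query: str) -> str:
    """Quote each whitespace-separated term and append * for prefix matching."""
    terms = [term.replace('"', '""') for term in user_query.split()]
    return " ".join(f'"{term}"*' for term in terms)


def search(
    conn: sqlite3.Connection,
    user_query: str,
    since_ms: Optional[int],
    limit: int = 50,
) -> List[Tuple]:
    """Return hits inside the window; rows with a NULL timestamp are kept."""
    sql = """
        SELECT session_id, project_name, role, timestamp_ms,
               snippet(messages_fts, 4, '', '', '...', 16) AS snip
        FROM messages_fts
        WHERE messages_fts MATCH ?
          AND (? IS NULL OR timestamp_ms IS NULL OR timestamp_ms >= ?)
        ORDER BY rank
        LIMIT ?
    """
    params = (to_prefix_query(user_query), since_ms, since_ms, limit)
    return conn.execute(sql, params).fetchall()
```

Passing `since_ms=None` corresponds to `all_history`. Quoting each term keeps user input from being parsed as FTS5 query syntax. The caller is expected to reject empty queries before reaching `search`.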

### Part B — Search API (`api/search.py`)

- `DEFAULT_SEARCH_WINDOW_DAYS = 30`, `resolve_search_since_ms(all_history, since_days)`.
- Query params: `all_history=1` / `true`; `since_days=N` (validated → 400).
- Flow: validate `q` / `limit` / `since_days` → try FTS index → fallback to live-scan loop.
- **400** — empty/whitespace `q` (`SEARCH_EMPTY_QUERY`); query > 500 chars
  (`SEARCH_QUERY_TOO_LONG`); invalid `since_days` / `limit` (existing + new).
- **503** — projects dir inaccessible (`SEARCH_PROJECTS_UNAVAILABLE`); index locked during
  rebuild (`SEARCH_INDEX_UNAVAILABLE` on `sqlite3.OperationalError`).
- **500** — unexpected errors only; log traceback server-side; no raw exception in JSON.
- Per-session `except Exception: continue` stays; add `logger.warning` for broken files.
- Happy-path JSON shape unchanged: `list[SearchHitDict]`.
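
The window resolution and the two new 400 checks could be sketched as below. This is a minimal sketch under stated assumptions. The extra `now_ms` parameter is added here only to make the function testable. Letting `all_history` take precedence over `since_days` is an assumption. `validate_query` and `MAX_QUERY_CHARS` are hypothetical names.

```python
import time
from typing import Optional, Tuple

DEFAULT_SEARCH_WINDOW_DAYS = 30
MAX_QUERY_CHARS = 500  # limit taken from the 400 rule above
MS_PER_DAY = 24 * 60 * 60 * 1000


def resolve_search_since_ms(
    all_history: bool,
    since_days: Optional[int],
    now_ms: Optional[int] = None,
) -> Optional[int]:
    """Return the window start in epoch ms, or None to search full history."""
    if all_history:
        return None  # assumption: all_history wins over since_days
    days = since_days if since_days is not None else DEFAULT_SEARCH_WINDOW_DAYS
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - days * MS_PER_DAY


def validate_query(q: Optional[str]) -> Optional[Tuple[int, str]]:
    """Return (status, error_code) for a bad query, or None when it is valid."""
    if q is None or not q.strip():
        return 400, "SEARCH_EMPTY_QUERY"
    if len(q) > MAX_QUERY_CHARS:
        return 400, "SEARCH_QUERY_TOO_LONG"
    return None
```

Keeping validation as a pure function that returns a status and code lets the route handler stay thin, and the same checks can be unit-tested without an HTTP client.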

### Part C — Search UI (`templates/search.html` / search JS)

- Helper text near search input / result banner:
  - Default: last **30 days** of indexed sessions; undated sessions may still appear.
  - “Search all history” checkbox → `all_history=1`.
  - Results capped at `{limit}` — `?limit=N` (max 500) documented.
- Truncation warning when `results.length === limit`.
- Error paragraph: `data-error-code` attribute; display `error` body from JSON on non-200.

### Part D — Error codes + tests

- **`models/error_codes.py` / `api/error_codes.py`:** add `SEARCH_EMPTY_QUERY`,
  `SEARCH_QUERY_TOO_LONG`, `SEARCH_PROJECTS_UNAVAILABLE`, `SEARCH_INDEX_UNAVAILABLE`.
- **`tests/test_search_index.py` (new):** schema, fingerprint, FTS hit, window filter,
  `all_history`, `NO_SEARCH_INDEX` fallback, pointer swap.
- **`tests/test_search.py` / `tests/test_api_routes.py`:** 400/503 cases; indexed search spy
  (no `get_cached_session` on warm index); `all_history=1` includes old fixture.
- **`tests/benchmarks/`:** indexed-search fixture alongside existing live-scan bench.
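
The pointer swap case listed for `tests/test_search_index.py` might exercise a publish step like the one below. It is a sketch that assumes the `search_index.<uuid>.sqlite` + `search_index.active` layout from Part A. The function names `publish_index` and `active_index` are hypothetical, as is storing only the file name in the pointer.

```python
import os
import uuid
from pathlib import Path
from typing import Callable, Optional


def publish_index(cache_dir: Path, build: Callable[[Path], None]) -> Path:
    """Build into a fresh uuid-named file, then repoint search_index.active."""
    index_path = cache_dir / f"search_index.{uuid.uuid4().hex}.sqlite"
    build(index_path)  # caller-supplied builder writes the complete index
    pointer = cache_dir / "search_index.active"
    tmp = cache_dir / f"search_index.active.{uuid.uuid4().hex}.tmp"
    tmp.write_text(index_path.name, encoding="utf-8")
    # os.replace overwrites an existing destination on POSIX and Windows,
    # so readers see either the old pointer or the new one.
    os.replace(tmp, pointer)
    return index_path


def active_index(cache_dir: Path) -> Optional[Path]:
    """Resolve the pointer; return None if it is missing or stale."""
    pointer = cache_dir / "search_index.active"
    try:
        name = pointer.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return None
    candidate = cache_dir / name
    return candidate if candidate.is_file() else None
```

Readers that opened the previous index file keep a valid handle while the new one is published. Cleanup of superseded files is left out of this sketch.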

### Out of scope

- Semantic / vector search (keyword and prefix FTS only, same as cppa).
- Incremental per-line index updates (full rebuild on fingerprint change for v1).
- `workspace=<hash>` post-filter ([cppa-cursor-browser #117](https://github.com/cppalliance/cppa-cursor-browser/issues/117) bonus — N/A for claude).
- Structured errors for all blueprints — `search_bp` only.
- Wiring summary cache into live-scan fallback (optional bonus if time permits).

### Follow-up (post-merge)

- Incremental index update on appended JSONL lines.
- Search benchmark regression gate in CI (cppa benchmark-suite pattern).

## Acceptance Criteria

### FTS index

- [ ] `utils/search_index.py` builds FTS5 index under `~/.claude-code-chat-browser/`.
- [ ] Index rebuilds when any `.jsonl` `(path, mtime, size)` or rules fingerprint changes.
- [ ] `GET /api/search` uses index when usable; live-scan fallback when disabled/missing.
- [ ] Default 30-day window; `all_history=1` searches full corpus; undated sessions are included in windowed searches.
- [ ] Background index build on startup without blocking first HTTP response.
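
The rebuild criterion in this checklist depends on a stable fingerprint. One possible shape is sketched below, assuming SHA-256 and JSON serialisation (neither is specified in this issue). The function name `compute_fingerprint` and the rules argument are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, List


def compute_fingerprint(projects_dir: Path, exclusion_rules: List[Any]) -> str:
    """Hash projects_dir, the sorted .jsonl manifest, and the exclusion rules."""
    manifest = []
    for path in projects_dir.rglob("*.jsonl"):
        st = path.stat()
        manifest.append(
            [path.relative_to(projects_dir).as_posix(), st.st_mtime_ns, st.st_size]
        )
    manifest.sort()  # walk order is not guaranteed, so sort for determinism
    rules_hash = hashlib.sha256(
        json.dumps(exclusion_rules, sort_keys=True).encode("utf-8")
    ).hexdigest()
    payload = json.dumps(
        {
            "projects_dir": str(projects_dir.resolve()),
            "manifest": manifest,
            "rules": rules_hash,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Comparing the stored fingerprint with a freshly computed one only requires a `stat()` per file, so the check stays cheap even when the index itself is large.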

### UI

- [ ] Search page explains 30-day default, undated sessions, all-history opt-in, result cap.
- [ ] Truncation warning when results hit limit.
- [ ] `data-error-code` on search error paragraph.

### API errors

- [ ] Empty `q` → **400** `SEARCH_EMPTY_QUERY`; long `q` → **400** `SEARCH_QUERY_TOO_LONG`.
- [ ] Projects dir unavailable → **503** `SEARCH_PROJECTS_UNAVAILABLE`.
- [ ] Index locked → **503** `SEARCH_INDEX_UNAVAILABLE`.
- [ ] Per-session failures logged; no stack traces in JSON.

### General

- [ ] `tests/test_search_index.py` passes; live-scan tests still pass.
- [ ] `mypy --strict`, full `pytest`, and `ruff` pass.
- [ ] PR approved by at least 1 reviewer.

## Verification

```powershell
cd C:\Users\Jasen\CppAliance\claude-code-chat-browser
.\.venv\Scripts\Activate.ps1
pytest tests/test_search_index.py -q
pytest tests/test_search.py tests/test_api_routes.py -q
pytest tests/benchmarks/test_search_bench.py -q
pytest -q
mypy --strict .
ruff check .
```

Manual:

1. Open `/search` — confirm 30-day window + cap helper text and all-history checkbox.
2. Start app with 50+ sessions; wait for background index build; search — sub-second on warm index.
3. Restart server; repeat search — still fast (disk index, fingerprint match).
4. `curl ".../api/search?q="` → 400 `SEARCH_EMPTY_QUERY`.
5. `GET /api/search?q=test` (no `all_history`) — old session outside 30 days absent.
6. `GET /api/search?q=test&all_history=1` — old session present.
7. `CLAUDE_CODE_CHAT_BROWSER_NO_SEARCH_INDEX=1` — live-scan still works.

## References

- [cppa-cursor-browser #117](https://github.com/cppalliance/cppa-cursor-browser/issues/117) /
  [PR #126](https://github.com/cppalliance/cppa-cursor-browser/pull/126): search UX + error codes
  (Parts B and C of the Problem section above).
- [cppa-cursor-browser PR #113](https://github.com/cppalliance/cppa-cursor-browser/pull/113) /
  [cppa-cursor-browser #95](https://github.com/cppalliance/cppa-cursor-browser/issues/95): FTS
  index + 30-day window — Part A of this PR.
- **cppa implementation:** `services/search_index.py`, `services/search.py`,
  `tests/test_search_index.py`
- **cppa review:** `team-brain/2026-06/2026-06-23/brad/cppa-cursor-browser PR113 review 2026-06-23.md`
- **cppa analog doc (merged #117 scope):**
  `Doc/Issues/chen-july-week1-monday-search-ux-and-api-errors-github-issue.md`
- **Companion PR:** `chen-july-week1-friday-session-list-summary-cache-github-issue.md` [PR 1](https://github.com/cppalliance/claude-code-chat-browser/issues/109)
- Files: `api/search.py`, `app.py`, `utils/search_index.py` (new), `utils/jsonl_helpers.py`,
  `models/error_codes.py`, `templates/search.html`, `tests/test_search_index.py`

