Maintenance: Langfuse SDK by AkhileshNegi · Pull Request #985 · ProjectTech4DevAI/kaapi-backend

AkhileshNegi · 2026-06-30T03:03:13Z

Issue

Closes #407

Checklist

Langfuse keys live in the DB (per org/project), not in env files. Make sure the test org has valid Langfuse keys (public_key, secret_key, host) set up before testing.

1. Chat / Response tracing

Does every AI response still show up in Langfuse?

Send a normal chat request (POST /responses, async) — a trace appears with input, output, and token/cost numbers, and ends cleanly
Send a synchronous request — same trace shows up with input/output
Send a follow-up message — it joins the same session as the first (not a new one)
Force an error (bad LLM call) — error is logged on the trace, nothing left hanging (no orphan span)
Trace details set correctly — name, input, output, session_id, tags, metadata (uses OTel shim set_trace_attributes)
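
The "nothing left hanging" check above can be reasoned about with a small stand-in. This is a minimal sketch, assuming a hypothetical `FakeSpan` with `update` and `end` methods in place of the real SDK span type; `traced` is an illustrative helper and is not taken from this PR.

```python
from contextlib import contextmanager


class FakeSpan:
    """Hypothetical stand-in for an SDK span; not the Langfuse class."""

    def __init__(self, name):
        self.name = name
        self.ended = False
        self.level = None
        self.status_message = None

    def update(self, **fields):
        for key, value in fields.items():
            setattr(self, key, value)

    def end(self):
        self.ended = True


@contextmanager
def traced(span):
    """Record any exception on the span and always end it."""
    try:
        yield span
    except Exception as exc:
        # Log the failure on the span, then let the caller see the error.
        span.update(level="ERROR", status_message=str(exc))
        raise
    finally:
        # Runs on success and on failure, so no span is left open.
        span.end()


span = FakeSpan("bad-llm-call")
try:
    with traced(span):
        raise RuntimeError("provider rejected the request")
except RuntimeError:
    pass
```

After the failing call, the span carries the error message and is marked as ended, which is the property the forced-error check is looking for.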

2. Background jobs (Celery)

Do traces still work when responses run as background jobs?

Trigger a background response job (run_response_job, priority 9) — trace shows up via observe_llm_execution
Run a multi-step chain job — langfuse_credentials threaded through ChainContext, each step logged
Cost numbers correct — usage_details are int-only (v4 dropped the unit field), no type errors on token counts
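
To illustrate the int-only `usage_details` point, here is a rough sketch of a normalizer that keeps integer token counts and drops everything else, including a legacy `unit` field. The helper name `to_usage_details` and the key names `input`, `output`, and `total` are assumptions for illustration, not the code in this PR.

```python
def to_usage_details(usage):
    """Keep only integer token counts; drop non-numeric fields such as 'unit'."""
    details = {}
    for key, value in usage.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or value is None:
            continue
        if isinstance(value, int):
            details[key] = value
        elif isinstance(value, float) and value.is_integer():
            # Whole-number floats (e.g. 30.0) are safe to convert.
            details[key] = int(value)
    return details


legacy = {"input": 12, "output": 30.0, "total": 42, "unit": "TOKENS"}
normalized = to_usage_details(legacy)
```

Non-integer floats are dropped rather than rounded here; whether that is the right policy depends on what the provider actually returns.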

3. Upload a dataset for evals

Can we still push eval datasets to Langfuse?

Upload a dataset (upload_dataset) — appears in Langfuse with all rows (input, expected output, metadata)
Parallel upload (ThreadPoolExecutor) — concurrent create_dataset_item calls succeed, no thread-safety errors
Upload a big dataset — every row lands, none dropped
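
For the parallel-upload and "none dropped" checks, one way to make dropped rows visible is to count successes and collect failures per row. The sketch below assumes a hypothetical `create_item` callable standing in for the SDK call; `upload_rows` and `fake_create_item` are illustrative names only.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_rows(rows, create_item, max_workers=4):
    """Upload rows concurrently; return (uploaded_count, failed_rows)."""
    failed = []
    uploaded = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its row so failures can be reported.
        futures = {pool.submit(create_item, row): row for row in rows}
        for future in as_completed(futures):
            row = futures[future]
            try:
                future.result()
                uploaded += 1
            except Exception as exc:
                failed.append((row, str(exc)))
    return uploaded, failed


def fake_create_item(row):
    """Hypothetical stand-in for the SDK call; rejects rows without input."""
    if row["input"] is None:
        raise ValueError("missing input")
    return row


rows = [{"input": f"q{i}", "expected_output": f"a{i}"} for i in range(50)]
rows.append({"input": None, "expected_output": "a"})
uploaded, failed = upload_rows(rows, fake_create_item)
```

Checking that `uploaded + len(failed)` equals `len(rows)` gives a quick signal that every row was either written or reported.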

4. Run an eval + write scores

Do eval runs still create traces and write scores?

Run a batch eval (via cron evaluation_cron_job) — each row gets a trace, cosine score written via create_score
Run a fast eval (run_evaluation_fast, Celery) — dataset-run + scores written
Trace↔dataset-run linkage via api.dataset_run_items.create (replaces v2 dataset_item.observe()) — run items show in Langfuse UI
Cosine scores appear on traces with correct value/name/comment
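
For reference when checking score values by hand, cosine similarity is the dot product of two vectors divided by the product of their norms. The sketch below is the standard definition only; how this project computes embeddings, and the score name and comment used here, are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity; returns 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical score payload with the three fields the checklist names.
score = {
    "name": "cosine_similarity",
    "value": cosine_similarity([1.0, 0.0], [0.0, 1.0]),
    "comment": "orthogonal vectors",
}
```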

5. Read scores back

Can we fetch eval scores from Langfuse?

Open an eval run's status (get_evaluation_run_status) — scores load via api.datasets.get_run + api.trace.get(fields="core,io,scores")
Hit refresh/resync (resync_score=true / force=true) — scores re-fetch, fields parsed (name/value/comment/data_type)
Concurrent trace.get (ThreadPoolExecutor) — no thread-safety errors
Cached merge path (crud/evaluations/core.py) returns consistent scores
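
The field-parsing check (name/value/comment/data_type) could look roughly like the sketch below. It assumes the trace payload is a plain dict with a `scores` list, which is a simplification; the real SDK response type may differ, and `ParsedScore` and `parse_scores` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ParsedScore:
    name: str
    value: Any
    comment: Optional[str]
    data_type: Optional[str]


def parse_scores(trace):
    """Extract scores from a dict-shaped trace, skipping malformed entries."""
    parsed = []
    for raw in trace.get("scores") or []:
        if "name" not in raw or "value" not in raw:
            continue  # a score without a name or value is not usable
        parsed.append(
            ParsedScore(
                name=raw["name"],
                value=raw["value"],
                comment=raw.get("comment"),
                data_type=raw.get("data_type"),
            )
        )
    return parsed


trace = {
    "id": "trace-1",
    "scores": [
        {"name": "cosine_similarity", "value": 0.91, "data_type": "NUMERIC"},
        {"comment": "no name or value"},
    ],
}
scores = parse_scores(trace)
```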

6. Cron / scheduled jobs

evaluation_cron_job + pending_jobs_cron_job run start-to-finish, no Langfuse errors
Nothing lost when a worker shuts down — flush() called on all paths
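
"flush() called on all paths" usually comes down to a `finally` block around the job body. A minimal sketch, assuming a hypothetical `FakeClient` that only counts `flush()` calls; the `flush_after` decorator is illustrative and not part of this PR.

```python
import functools


class FakeClient:
    """Hypothetical stand-in that records how often flush() is called."""

    def __init__(self):
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1


def flush_after(client):
    """Decorator: flush the client after the job, even if the job raises."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                client.flush()

        return wrapper

    return decorator


client = FakeClient()


@flush_after(client)
def good_job():
    return "done"


@flush_after(client)
def bad_job():
    raise RuntimeError("job failed")


result = good_job()
try:
    bad_job()
except RuntimeError:
    pass
```

Both the successful and the failing job trigger one flush each, so buffered events are sent before the worker moves on.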

Staging-Specific

Multi-tenant: 2+ orgs with different Langfuse keys — traces route to the correct project
Worker logs — format_langfuse_error keeps ApiError logs compact (status_code + body, no full HTTP header dump)
No latency regression on /responses from the new OTel-based client
Test against both Langfuse Pro plan and Hobby plan orgs — confirm tracing/scoring/datasets work on each (plan-tier limits or feature gating don't break the flow)

coderabbitai · 2026-06-30T03:03:22Z

Important

Review skipped

Auto reviews are limited based on label configuration.

🏷️ Required labels (at least one) (1)

ready-for-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2109e151-3f20-4750-bfdc-f0fe0d20799a


Port the Langfuse integration from the v2 SDK (2.60.3) to the OTel-based v4 SDK (4.7.1). Scope is the SDK upgrade only; evaluation feature work (score-fetching rewrite, dataset dedup, sample-index fan-out) is deferred to a separate PR.

- core/langfuse: rewrite tracer + observe_llm_execution for v4: explicit per-key clients (multi-tenant), start_observation/LangfuseSpan/LangfuseGeneration, usage_details, set_trace_attributes via OTel keys, format_langfuse_error for concise ApiError logs.
- crud/evaluations/langfuse: port the v2 calls that v4 removed: dataset_item.observe()/trace()/generation() -> start_observation + set_trace_attributes + dataset_run_items.create; langfuse.score -> create_score; trace.get(fields="core,io,scores") to avoid full-trace timeouts.
- tests: v4-adapt tracer + crud langfuse suites; add format_error tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
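
The "explicit per-key clients (multi-tenant)" idea can be pictured as a small registry that hands out one client per credential set. This is a sketch under assumptions: `ClientRegistry` and the `factory` callable are hypothetical, and the PR's actual client management may be structured differently.

```python
import threading


class ClientRegistry:
    """Cache one client per (public_key, host) so tenants never share one."""

    def __init__(self, factory):
        self._factory = factory  # callable that builds a client
        self._clients = {}
        self._lock = threading.Lock()

    def get(self, public_key, secret_key, host):
        key = (public_key, host)
        with self._lock:
            client = self._clients.get(key)
            if client is None:
                client = self._factory(public_key, secret_key, host)
                self._clients[key] = client
            return client


# Hypothetical factory; a real one would construct the SDK client here.
def fake_factory(public_key, secret_key, host):
    return {"public_key": public_key, "host": host}


registry = ClientRegistry(fake_factory)
org_a = registry.get("pk-a", "sk-a", "https://langfuse.example")
org_a_again = registry.get("pk-a", "sk-a", "https://langfuse.example")
org_b = registry.get("pk-b", "sk-b", "https://langfuse.example")
```

Keying on the public key and host keeps traces for different orgs on separate clients while reusing a client for repeated requests from the same org.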

github-actions · 2026-06-30T04:10:38Z

OpenAPI changes ⚪ No API surface changes

Note

This PR does not modify the API contract.

_{main ↔ fba54265 · generated by oasdiff}

codecov · 2026-06-30T04:20:43Z

Codecov Report

❌ Patch coverage is 97.96748% with 5 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
backend/app/crud/evaluations/langfuse.py	88.46%	3 Missing ⚠️
backend/app/core/langfuse/langfuse.py	96.61%	2 Missing ⚠️


The format_langfuse_error helper was defined in the SDK v4 upgrade but had no production callers, so Langfuse exception logs still rendered the full ApiError string (every HTTP response header). Wire the helper into all Langfuse-related except blocks across the tracer, observe_llm_execution, and the evaluations CRUD so logs keep only status_code and body.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
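
As an illustration of the log-compaction idea described in this commit, the sketch below formats an exception using only its `status_code` and `body`. `FakeApiError` is a hypothetical stand-in for the SDK error type and `format_error_compact` is not the PR's `format_langfuse_error`; both exist only to show the shape of the technique.

```python
class FakeApiError(Exception):
    """Hypothetical error whose str() includes every response header."""

    def __init__(self, status_code, body, headers):
        super().__init__(
            f"headers: {headers}, status_code: {status_code}, body: {body}"
        )
        self.status_code = status_code
        self.body = body
        self.headers = headers


def format_error_compact(exc):
    """Return a short log string: status_code and body only, no headers."""
    status_code = getattr(exc, "status_code", None)
    body = getattr(exc, "body", None)
    if status_code is None and body is None:
        # Not an API-style error; fall back to the normal message.
        return f"{type(exc).__name__}: {exc}"
    return f"{type(exc).__name__}(status_code={status_code}, body={body})"


err = FakeApiError(
    401,
    {"message": "Invalid credentials"},
    {"date": "Tue, 30 Jun 2026 04:00:00 GMT", "x-request-id": "abc123"},
)
compact = format_error_compact(err)
plain = format_error_compact(ValueError("bad input"))
```

Using `getattr` with a default lets the same helper sit in any `except Exception` block without assuming the error type.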

AkhileshNegi marked this pull request as ready for review June 30, 2026 03:05

AkhileshNegi force-pushed the upgrade/langfuse-sdk branch from 9dd478b to f7e6f30 Compare June 30, 2026 04:09

AkhileshNegi linked an issue Jun 30, 2026 that may be closed by this pull request

Langfuse : Update to latest #407

Open

AkhileshNegi self-assigned this Jun 30, 2026

AkhileshNegi added the maintenance label Jun 30, 2026

AkhileshNegi added 3 commits June 30, 2026 12:34

fixing async traces issues

edf1397

Merge branch 'main' into upgrade/langfuse-sdk

88774d7

Merge branch 'main' into upgrade/langfuse-sdk

e03611a

AkhileshNegi wants to merge 5 commits into main from upgrade/langfuse-sdk
