feat: packages tb enrichment #4243

Open
epipav wants to merge 2 commits into main from feat/packages-tb-enrichment

@epipav epipav commented Jun 19, 2026

Collaborator

No description provided.

epipav added 2 commits June 18, 2026 17:12
Signed-off-by: anilb <epipav@gmail.com>
Signed-off-by: anilb <epipav@gmail.com>
Copilot AI review requested due to automatic review settings June 19, 2026 14:55
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions

Contributor

⚠️ Jira Issue Key Missing

Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.

Example:

  • feat: add user authentication (CM-123)
  • feat: add user authentication (IN-123)

Projects:

  • CM: Community Data Platform
  • IN: Insights

Please add a Jira issue key to your PR title.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit e1c746d. Configure here.

coalesce(r.archived, 0) = 1,
'archived',
coalesce(snap.hasSnapshot, 0) = 0,
'active',

Missing snapshot forces active lifecycle

Medium Severity

When no repoActivitySnapshot row exists for the linked repo, lifecycleLabel is set to active before abandoned, declining, or stable rules run. Repos with stale lastCommitAt or other repo-level signals can be mislabeled until snapshot replication catches up, contradicting the datasource note that a missing snapshot is “no signal.”
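The ordering problem can be modeled outside SQL. The sketch below is a simplified Python stand-in for a first-match conditional chain, not the pipe's actual logic: the function names, the single `abandoned` rule, and the `STALE_AFTER` cut-off are all assumptions made for illustration, and the `declining` rule is omitted.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical cut-off; the pipe's real thresholds are not shown in this thread.
STALE_AFTER = timedelta(days=365)


def label_snapshot_first(archived: bool, has_snapshot: bool,
                         last_commit_at: Optional[datetime], now: datetime) -> str:
    """Branch order described in the review: a missing snapshot short-circuits."""
    if archived:
        return "archived"
    if not has_snapshot:
        return "active"  # returned before any repo-level rule is evaluated
    if last_commit_at is not None and now - last_commit_at > STALE_AFTER:
        return "abandoned"
    return "stable"


def label_repo_signals_first(archived: bool, has_snapshot: bool,
                             last_commit_at: Optional[datetime], now: datetime) -> str:
    """One possible reordering: repo-level rules run before the snapshot fallback."""
    if archived:
        return "archived"
    if last_commit_at is not None and now - last_commit_at > STALE_AFTER:
        return "abandoned"
    if not has_snapshot:
        return "active"  # fallback only when no repo-level signal fired
    return "stable"
```

For a repo with no snapshot row and a last commit several years old, the first function returns `active` while the second returns `abandoned`.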

3,
coalesce(dv.vulnerableDeps, 0) <= 5,
1,
0

No deps get dependency credit

Medium Severity

dependencyHealth awards the maximum five points whenever vulnerableDeps coalesces to zero, including when the package has no packageDependencies join row. That conflates “no dependency data” with “zero vulnerable direct deps,” inflating securitySupplyChainScore while signalCoverageHealth marks dependency_health as blocked.
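A small Python model of the distinction, under stated assumptions: the five-point maximum and the `<= 5` tier come from the comment and snippet above, while the function names, the `<= 2` tier boundary, and the choice to return `None` for missing data are hypothetical. It is one way to keep "no data" separate from "zero vulnerable deps", not the pipe's implementation.

```python
from typing import Optional


def dependency_health(vulnerable_deps: Optional[int]) -> Optional[int]:
    """Score direct-dependency health; None means no dependency data was joined."""
    if vulnerable_deps is None:
        return None  # no join row: report "no signal" instead of awarding points
    if vulnerable_deps == 0:
        return 5     # maximum, per the review comment
    if vulnerable_deps <= 2:  # hypothetical tier boundary
        return 3
    if vulnerable_deps <= 5:
        return 1
    return 0


def dependency_health_coalesced(vulnerable_deps: Optional[int]) -> int:
    """Behavior described in the review: a missing value is coalesced to 0 first."""
    score = dependency_health(0 if vulnerable_deps is None else vulnerable_deps)
    assert score is not None  # the input is never None after coalescing
    return score
```

With no join row, `dependency_health(None)` yields `None`, whereas `dependency_health_coalesced(None)` yields the full 5 points, which is the inflation the comment describes.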

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces a Tinybird-based enrichment layer for OSS packages, producing a new materialized datasource (ossPackages_enriched_ds) that augments ossPackages with derived lifecycle and health scoring signals sourced from repo metadata, activity snapshots, maintainers, releases, vulnerabilities, and dependencies. It also updates the packages-db schema/replication to support the new snapshot feed and to persist enriched fields back into Postgres.

Changes:

  • Add a new Tinybird pipe (ossPackages_enriched.pipe) that computes lifecycle + composite health scoring for packages and materializes results via a scheduled COPY.
  • Add new Tinybird datasources for repoActivitySnapshot and the resulting ossPackages_enriched_ds.
  • Add a packages-db migration to improve indexing, add sequin publication replication for repo_activity_snapshot, and add new enrichment columns to packages.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| services/libs/tinybird/pipes/ossPackages_enriched.pipe | Builds package lifecycle/health scoring and signal coverage JSON; materializes into Tinybird on a schedule. |
| services/libs/tinybird/datasources/repoActivitySnapshot.datasource | Defines the repo activity snapshot datasource schema and storage engine settings used by the enrichment pipe. |
| services/libs/tinybird/datasources/ossPackages_enriched_ds.datasource | Defines the enriched OSS packages datasource schema that the pipe writes into. |
| backend/src/osspckgs/migrations/V1781539311__packages_tables_sequin_updates.sql | Adds an index, ensures sequin publication includes repo activity snapshots, and adds enriched columns to packages. |

Comment on lines +39 to +42
ENGINE ReplacingMergeTree
ENGINE_PARTITION_KEY toYear(snapshotAt)
ENGINE_SORTING_KEY repoId
ENGINE_VER snapshotAt
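As background for the engine settings quoted above: ClickHouse's ReplacingMergeTree collapses rows that share a sorting key during background merges, keeping the row with the highest version column, and merges only combine parts within the same partition. The Python sketch below is a rough model of the fully merged state under these settings; the `Row` shape and the `merged_rows` helper are illustrative assumptions, not part of the PR.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Row = Tuple[str, datetime]  # (repoId, snapshotAt)


def merged_rows(rows: List[Row]) -> List[Row]:
    """Rows remaining once every partition has been fully merged."""
    latest: Dict[Tuple[int, str], datetime] = {}
    for repo_id, snapshot_at in rows:
        # Partition = toYear(snapshotAt), sorting key = repoId, version = snapshotAt.
        key = (snapshot_at.year, repo_id)
        if key not in latest or snapshot_at >= latest[key]:
            latest[key] = snapshot_at
    return sorted((repo_id, ts) for (_year, repo_id), ts in latest.items())
```

Under this model, a repo with snapshots in two different years keeps one row per year, so a query that needs exactly one latest row per `repoId` still has to pick the newest one at read time.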
Comment on lines +355 to +357
coalesce(mh.maintainersCount, 0) > 0,
'partial',
'blocked'
Comment on lines +373 to +374
'security_practices',
if(r.branchProtectionEnabled IS NULL, 'partial', 'available'),
END IF;
END$$;

ALTER TABLE public.repo_activity_snapshot REPLICA IDENTITY FULL;
ADD COLUMN IF NOT EXISTS maintainer_health_score smallint,
ADD COLUMN IF NOT EXISTS security_supply_chain_score smallint,
ADD COLUMN IF NOT EXISTS development_activity_score smallint,
ADD COLUMN IF NOT EXISTS signal_coverage_health jsonb;

Contributor

@epipav what exactly is going to be in signal_coverage_health? Do we know that?

4 participants