ground-or-propose-metpo: residual worklist + Tier-0 env-factor groundings #187
Merged
Scan of all communities → reports/ungrounded_community_terms.tsv (60 recurring concepts, freq ≥ 2; the freq-1 tail is skipped). biological_processes are all already GO-grounded (prior id-label cleanup), so that surface is empty.

Tier-0 grounded the clear environmental-factor condition qualities/referents to PATO/CHEBI (Temperature→PATO:0000146, pH→PATO:0001842, Light→PATO:0015013, O2→CHEBI:15379, H2→CHEBI:18276, sulfate→CHEBI:16189, iron→CHEBI:24875) in mappings/community_term_grounding.tsv; 3 non-ontological design/provenance labels were skipped.

env_factor/downstream_target have no binding slot yet, so the groundings are recorded in the TSV with an "add EnvironmentalFactor.term slot" follow-up.

Deferred (scoped in reports/ground_or_propose_metpo_run.md): the ambiguous interaction-predicate middle and a possible METPO v2 proposal cohort.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
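The frequency cut described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual scan: the function name `build_worklist`, the input shape (community id mapped to a list of `(surface, term)` pairs), and the choice to count a term once per community are all assumptions.

```python
from collections import Counter


def build_worklist(terms_by_community, min_freq=2):
    """Return (surface, term, freq) rows for terms seen in >= min_freq communities.

    Assumption: freq counts communities, so repeats inside one community
    are de-duplicated before counting.
    """
    counts = Counter()
    for community_terms in terms_by_community.values():
        counts.update(set(community_terms))
    rows = [
        (surface, term, n)
        for (surface, term), n in counts.items()
        if n >= min_freq  # the freq-1 long tail is skipped
    ]
    # Most frequent first, then alphabetical for a stable TSV order.
    return sorted(rows, key=lambda r: (-r[2], r[0], r[1]))


if __name__ == "__main__":
    # Hypothetical input, for illustration only.
    example = {
        "community_a": [("env_factor", "Temperature"), ("env_factor", "pH")],
        "community_b": [("env_factor", "Temperature")],
    }
    for row in build_worklist(example):
        print("\t".join(str(x) for x in row))
```

If the real worklist counts raw occurrences rather than communities, the `set(...)` de-duplication would simply be dropped.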
Pull request overview
Adds the first “Tier-0” output artifacts from a ground-or-propose-metpo run: a frequency-filtered worklist of recurring free-text community terms, a small set of clear environmental-factor groundings to PATO/CHEBI, and a short run summary documenting scope/deferrals and a proposed schema follow-up.
Changes:
- Add reports/ungrounded_community_terms.tsv enumerating recurring ungrounded terms (freq ≥ 2) across interaction/downstream/env-factor surfaces.
- Add mappings/community_term_grounding.tsv capturing Tier-0 environmental-factor groundings (and explicitly skipped non-ontological labels).
- Add reports/ground_or_propose_metpo_run.md summarizing the run, grounded set, and deferred areas.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| reports/ungrounded_community_terms.tsv | New TSV worklist of recurring ungrounded community terms (freq ≥ 2). |
| reports/ground_or_propose_metpo_run.md | New markdown run summary documenting what was grounded vs deferred and why. |
| mappings/community_term_grounding.tsv | New TSV mapping for Tier-0 env-factor groundings to PATO/CHEBI (plus skipped items). |
Comment on lines +12 to +14
env_factor defined synthetic community design - - - - - skipped: non-ontological (design descriptor, not an environmental factor)
env_factor Source environment - - - - - skipped: non-ontological (provenance note)
env_factor Agricultural application - - - - - skipped: non-ontological (use-case note)
…he rest

Per the #182 decision (drop-generic + mint-rest):

DROP: remove the low-value generic obsolete-GO biological_process annotations that the id-label cleanup had remapped to broad/mismatched terms: redox process (-> the MF oxidoreductase activity), organic-substance metabolic/catabolic (-> broad parents). 131 entries across 48 files (removal-only; empty biological_processes headers cleaned). KEEP the meaningful remaps (multi-organism -> interspecies interaction GO:0044419, metal-ion sequestering GO:0140487, anion transporter GO:0008509). scripts/drop_obsolete_go_bp.py matches on both the remapped id AND the obsolete-origin preferred_term, so legitimate uses are safe.

MINT: proposals/term_requests_communitymech.md contains CHEBI requests (lead/zinc sulfide, chromium(III) hydroxide, 3-OH-C14-HSL) and an ENVO request (phyllosphere) for the concepts with no exact existing term (currently on nearest-valid stopgaps), with definitions/parents and the source communities.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
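The two-key match described for scripts/drop_obsolete_go_bp.py could look something like the sketch below. It is not the script itself: the entry layout (`preferred_term` plus a nested `term.id`), the `DROP_PAIRS` table, and the function names are assumptions made for illustration.

```python
# Hypothetical drop table: (remapped id, obsolete-origin preferred_term).
# GO:0016491 is the MF "oxidoreductase activity"; pairing it with the
# label "redox process" here is illustrative.
DROP_PAIRS = {
    ("GO:0016491", "redox process"),
}


def should_drop(entry, drop_pairs=DROP_PAIRS):
    """True only when BOTH the remapped id and the obsolete-origin label match."""
    term_id = (entry.get("term") or {}).get("id")
    label = (entry.get("preferred_term") or "").strip().lower()
    return (term_id, label) in drop_pairs


def drop_obsolete(entries, drop_pairs=DROP_PAIRS):
    """Removal-only pass: keep every entry that does not match a drop pair."""
    return [e for e in entries if not should_drop(e, drop_pairs)]
```

Requiring both keys means an annotation that deliberately uses the same id under its own label is kept, while the remapped generic one is removed.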
First slice of the ground-or-propose-metpo skill run (Tier-0 mechanical head).

Worklist (reports/ungrounded_community_terms.tsv): scanned all communities → 60 recurring free-text concepts (freq≥2; the freq-1 long tail is skipped). biological_processes are all already GO-grounded (from the recent id↔label cleanup), so that surface is empty.

Tier-0 groundings (mappings/community_term_grounding.tsv): the clear environmental-factor condition qualities/referents → PATO/CHEBI: Temperature→PATO:0000146, pH & Extreme Acidity→PATO:0001842, Light→PATO:0015013, O₂→CHEBI:15379, H₂→CHEBI:18276, sulfate→CHEBI:16189, iron→CHEBI:24875; 3 narrative/design labels skipped as non-ontological.

Note: environmental_factors[].name and downstream[].target have no ontology binding slot in the schema, so these groundings live in the mapping TSV. Follow-up: add EnvironmentalFactor.term (and optionally InteractionDownstream.target_term) so they can be written into the YAMLs and enforced by the id↔label gate.

Deferred (scoped in reports/ground_or_propose_metpo_run.md): the ambiguous interaction-predicate middle (compound narrative names; primary semantics already in interaction_type → v1 METPO cohort) and a possible METPO v2 proposal cohort. This is a bounded next batch, not run here.
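To make the proposed follow-up concrete, here is a rough sketch of what an optional `term` slot and an id↔label check might look like. Everything in it is hypothetical: the dataclasses, the `gate_errors` function, and the small `KNOWN_LABELS` table stand in for the real schema and for whatever ontology lookup the gate uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Term:
    id: str
    label: str


@dataclass
class EnvironmentalFactor:
    name: str
    term: Optional[Term] = None  # proposed slot; not in the schema yet


# Stand-in for an ontology label lookup (illustrative subset).
KNOWN_LABELS = {
    "PATO:0000146": "temperature",
    "CHEBI:15379": "dioxygen",
}


def gate_errors(factors, labels=KNOWN_LABELS):
    """Report groundings whose id is unknown or whose label disagrees with the id."""
    errors = []
    for factor in factors:
        if factor.term is None:
            continue  # ungrounded factors are allowed; the slot is optional
        expected = labels.get(factor.term.id)
        if expected is None:
            errors.append(f"{factor.name}: unknown id {factor.term.id}")
        elif expected != factor.term.label:
            errors.append(
                f"{factor.name}: {factor.term.id} is labelled "
                f"{expected!r}, not {factor.term.label!r}"
            )
    return errors
```

Keeping the slot optional lets skipped non-ontological labels such as "Source environment" stay in the YAMLs without tripping the gate.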