Translator Curated Query Service

karafecho edited this page Aug 6, 2024 · 7 revisions

Description

The Translator Curated Query Service (CQS) was conceptualized by the Translator Clinical Data Committee (TCDC) in Fall 2022. The goal is to create a skeletal ARA that initially will support the TCDC's MVP1 workflow on rare pulmonary disease, e.g., MVP1 Template 1 (clinical-kps), but the intent is for the CQS to provide a general model and approach for other teams, committees, working groups, and external users who wish to contribute to the Translator ecosystem. The development and implementation work is being supported by the Translator Standards and Reference Implementation (SRI) core, with Jason Reilly serving as lead developer. Plans for long-term maintenance are TBD.

What It Does

  1. An SRI Service that provides ARA-like capabilities:

    • generation of ‘predicted’ edges in response to creative queries - based on customizable inference rules

    • linking predictions to their supporting aux graphs

    • attachment of provenance metadata and scores to results

  2. Inference specifications are defined as TRAPI templates, which serve as config files for a custom reasoning service / workflow

    • The specifications include required fields that identify the primary and aggregator knowledge sources (e.g., "resource_id": "infores:biothings-explorer", "resource_role": "primary_knowledge_source") and optional fields that specify, for example, workflow parameters such as an "allowlist"
  3. Scoring of individual workflow templates can be customized

    • e.g., ARAGORN’s scoring/ranking algorithm, OpenPredict’s prediction score

    • Analyses within a result are ordered by analysis score, in descending order. Results are currently ordered by their maximum analysis score, also in descending order
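
The ordering described above can be sketched in a few lines of Python. This is only an illustration, not the CQS implementation: it assumes TRAPI-style results in which each result has an "analyses" list and each analysis has a numeric "score". The function name rank_results, the "label" key, and the choice to treat a missing score as 0.0 are assumptions made for this example.

```python
from typing import Any


def analysis_score(analysis: dict[str, Any]) -> float:
    # Assumption: a missing or null score is treated as 0.0 for ordering.
    return analysis.get("score") or 0.0


def rank_results(results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Order analyses within each result, then order results by max score."""
    ranked = []
    for result in results:
        # Within a result: analyses in descending order of analysis score.
        analyses = sorted(result.get("analyses", []), key=analysis_score, reverse=True)
        ranked.append({**result, "analyses": analyses})
    # Across results: descending order of each result's maximum analysis score.
    ranked.sort(
        key=lambda r: max((analysis_score(a) for a in r["analyses"]), default=0.0),
        reverse=True,
    )
    return ranked


# Hypothetical results; "label" is not a TRAPI field and is used only for readability.
sample_results = [
    {"label": "r1", "analyses": [{"score": 0.2}, {"score": 0.5}]},
    {"label": "r2", "analyses": [{"score": 0.9}, {"score": 0.1}]},
    {"label": "r3", "analyses": []},
]
ranked_results = rank_results(sample_results)
print([r["label"] for r in ranked_results])
```

The input list is not modified; each result is copied with its analyses reordered.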

What It Enables

  1. Allows manually defined, SMuRF- and SME-evaluated inference workflows to be contributed by any team or working group, or even by external groups; each workflow is structured as a valid TRAPI query and serves as a CQS template

  2. Provides a simple mechanism through which KPs can apply their expertise and resources to specify how their data are to be used for inference

    • This can enable a "conservative ingest" paradigm, in which KPs ingest only what sources directly assert and rely on CQS services to generate desired inferences based on this more foundational knowledge
    • For example, the CQS supports the Biolink Model "treats" refactoring effort. A KP (e.g., the Multiomics Clinical Trials KP) can report the precise entity relationship asserted by a given source (e.g., clinicaltrials.gov, biolink:in_clinical_trials_for). The CQS can then generate a "treats" edge based on a set of rules, defined by the KP team, for when such edges can be elevated to "treats" status, and it points to the original edge as an aux graph. In this example, the CQS-predicted "treats" edge carries biolink:knowledge_level prediction and biolink:agent_type computational_model, while the primary edge from the KP carries biolink:knowledge_level knowledge_assertion, biolink:agent_type manual_agent, and biolink:max_research_phase clinical_trial_phase_4
  3. Allows KP teams such as OpenPredict or Multiomics to avoid dealing with ARA functions such as aux graphs, ARS registration, merging, scoring, normalizing, and adding literature co-occurrence

  4. Facilitates consistent specification and implementation of inference rules, by providing a centralized and transparent place to define, align, and collaborate on inference rules
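
The "treats" example above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CQS code. The elevation rule used here (elevate only biolink:in_clinical_trials_for edges whose biolink:max_research_phase is clinical_trial_phase_4) is hypothetical, since the actual rules are defined by each KP team. The function names, the aux graph ID format, and the EXAMPLE: CURIEs are also invented for this sketch, and the "sources" block of the predicted edge is omitted.

```python
from typing import Any, Optional


def attribute_value(edge: dict[str, Any], type_id: str) -> Any:
    """Return the value of the first edge attribute with the given type ID."""
    for attr in edge.get("attributes", []):
        if attr.get("attribute_type_id") == type_id:
            return attr.get("value")
    return None


def elevate_to_treats(edge_id: str, edge: dict[str, Any]) -> Optional[tuple[dict, dict]]:
    """Return (predicted edge, auxiliary graphs), or None if the rule does not apply."""
    # Hypothetical rule: only phase 4 clinical-trial edges are elevated.
    if edge.get("predicate") != "biolink:in_clinical_trials_for":
        return None
    if attribute_value(edge, "biolink:max_research_phase") != "clinical_trial_phase_4":
        return None
    aux_id = f"aux_{edge_id}"  # hypothetical ID scheme
    predicted_edge = {
        "subject": edge["subject"],
        "predicate": "biolink:treats",
        "object": edge["object"],
        "attributes": [
            {"attribute_type_id": "biolink:knowledge_level", "value": "prediction"},
            {"attribute_type_id": "biolink:agent_type", "value": "computational_model"},
            # The predicted edge points to the original KP edge via an aux graph.
            {"attribute_type_id": "biolink:support_graphs", "value": [aux_id]},
        ],
        # "sources" is intentionally omitted from this sketch.
    }
    auxiliary_graphs = {aux_id: {"edges": [edge_id], "attributes": []}}
    return predicted_edge, auxiliary_graphs


# Hypothetical KP edge; the CURIEs are placeholders, not real identifiers.
kp_edge = {
    "subject": "EXAMPLE:drug",
    "predicate": "biolink:in_clinical_trials_for",
    "object": "EXAMPLE:disease",
    "attributes": [
        {"attribute_type_id": "biolink:knowledge_level", "value": "knowledge_assertion"},
        {"attribute_type_id": "biolink:agent_type", "value": "manual_agent"},
        {"attribute_type_id": "biolink:max_research_phase", "value": "clinical_trial_phase_4"},
    ],
}
elevated = elevate_to_treats("e0", kp_edge)
```

Keeping the KP edge untouched and attaching it through an aux graph mirrors the "conservative ingest" idea: the asserted knowledge and the prediction remain separate and traceable.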

How to Submit a New Template into the Translator Pipeline

  1. Develop a set of "rules" specifying when a particular KP can contribute to an inferred MVP query.
  2. Apply the rules in (1) via a valid TRAPI query that can serve as a CQS template.
    • Include required specifications for primary and aggregator knowledge sources
    • Add any additional specifications, such as attribute constraints or workflow parameters (e.g., an "allowlist")
  3. Test the CQS template by direct query of the Workflow Runner.
  4. Create a branch in the CQS repo.
    • Add a new template folder within CQS/templates.
    • Within that folder, add a thoroughly descriptive README that names a POC and lists select CURIEs to be used for development and testing. The CURIEs should be associated with test assets that the POC has contributed to the test assets repo, using this G-sheet.
    • Also add the new CQS template, structured as a valid TRAPI query.
    • Create a PR.
  5. The new CQS template will then be deployed to DEV, thus entering the Translator pipeline.
  6. After the CQS template is deployed to CI, it will be picked up by the Information Radiator for automated testing. The POC for a given CQS template is responsible for monitoring the testing results.
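
Before opening a PR, a template author might run a quick local check that the required knowledge-source specifications from step 2 are present. The sketch below is one possible approach, not an official CQS tool. Because this page does not say where in the template the source entries live, the check searches the whole template for any object that has both "resource_id" and "resource_role". The function names, the top-level "sources" key in the sample, and "infores:example-aggregator" are hypothetical.

```python
from typing import Any

# Role names follow the TRAPI resource role vocabulary.
REQUIRED_ROLES = {"primary_knowledge_source", "aggregator_knowledge_source"}


def find_source_roles(node: Any) -> set[str]:
    """Collect every resource_role found anywhere in a nested template."""
    roles: set[str] = set()
    if isinstance(node, dict):
        if "resource_id" in node and "resource_role" in node:
            roles.add(node["resource_role"])
        for value in node.values():
            roles |= find_source_roles(value)
    elif isinstance(node, list):
        for item in node:
            roles |= find_source_roles(item)
    return roles


def missing_source_roles(template: dict[str, Any]) -> set[str]:
    """Return the required roles that the template does not declare."""
    return REQUIRED_ROLES - find_source_roles(template)


# Hypothetical template fragment; the placement of "sources" is an assumption.
sample_template = {
    "message": {"query_graph": {"nodes": {}, "edges": {}}},
    "sources": [
        {"resource_id": "infores:biothings-explorer",
         "resource_role": "primary_knowledge_source"},
        {"resource_id": "infores:example-aggregator",
         "resource_role": "aggregator_knowledge_source"},
    ],
}
print(missing_source_roles(sample_template))
```

An empty set means both roles were found. This check does not validate the template against the TRAPI schema, so it complements rather than replaces testing against the Workflow Runner in step 3.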

Architectural Overview

[Image: CQS architecture diagram]

Team contacts

Jason Reilly (Exposures Provider)

Kara Fecho (Exposures Provider, SRI)

Max Wang (Exposures Provider, Ranking Agent, SRI)

Source code

https://github.com/TranslatorSRI/CQS/