DOI License (LGPL version 3) GitHub version

Matthias König and Jan Grzegorzewski

AnnotateDB

Overview

AnnotateDB (pronounced "annotated bee") is a database with a web frontend for mapping annotations found in computational models in biology. AnnotateDB is accessible via https://annotatedb.com.

  • Our mission is to provide mapped annotation resources which simplify annotation of computational models and mapping of entities in such models.
  • Our vision is to provide a single integrated knowledge resource which simplifies mapping between commonly occurring annotations in biological models and data.

AnnotateDB provides high-quality mappings between annotations based on existing resources. Features include:

  • annotation mappings from multiple sources
  • support for custom annotation mappings
  • support for qualifiers, i.e., more detailed relationships between annotations
  • support for evidence of annotations, i.e., provenance about the source and method with which the mapping was inferred
  • direct access to the postgres database
  • docker and docker-compose scripts for easy local setup and deployment
  • REST based web interface
  • elasticsearch based indexing and search (the elasticsearch endpoints are still in development and will be part of v0.2.0)

AnnotateDB is accessible under the following licenses

To cite the project use DOI

Installation

AnnotateDB is distributed as docker containers and requires a working docker and docker-compose installation.
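
Before running the installation, it can help to confirm that both prerequisites are on the PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and not part of AnnotateDB, and the check only looks for the executables, it does not verify that the docker daemon is running.

```python
import shutil


def missing_tools(tools=("docker", "docker-compose")):
    """Return the names of required executables that are not found on PATH.

    Hypothetical helper for illustration; not part of AnnotateDB.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("docker and docker-compose found")
```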

To install AnnotateDB locally use

# clone or pull the latest source code
git clone https://github.com/matthiaskoenig/annotatedb.git
cd annotatedb

# set environment variables
set -a && source .env.local 

# create/rebuild all docker containers
./docker-purge.sh

# restore database
./adb_restore.sh

# elasticsearch indexing
./elasticsearch.sh

This creates the following services

In later releases the installation will be simplified, i.e., prebuilt docker containers will be available from Docker Hub (see #32).

REST webservice

AnnotateDB provides REST endpoints for querying the database at https://annotatedb.com/api/v1.

AnnotateDB API

Some examples

This will return the information on the collection, in this example for sbo

{
  "namespace":"sbo",
  "miriam":true,
  "name":"Systems Biology Ontology",
  "idpattern":"^SBO:\\d{7}$",
  "urlpattern":"https://identifiers.org/sbo/{$id}"
}
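
As a rough illustration of how the `idpattern` and `urlpattern` fields of such a response could be used on the client side, the sketch below validates a term against the pattern and fills in the URL template. It works on the JSON shown above rather than calling the API; the helper `resolve` is hypothetical, and treating `{$id}` as a plain placeholder for the full term is an assumption based only on this example response.

```python
import json
import re

# Response body copied from the sbo example above (no network call is made).
response_text = r'''{
  "namespace":"sbo",
  "miriam":true,
  "name":"Systems Biology Ontology",
  "idpattern":"^SBO:\\d{7}$",
  "urlpattern":"https://identifiers.org/sbo/{$id}"
}'''

collection = json.loads(response_text)


def resolve(collection, term):
    """Check term against the collection's idpattern and fill its urlpattern.

    Hypothetical helper; assumes "{$id}" is replaced by the term as-is.
    """
    if not re.match(collection["idpattern"], term):
        raise ValueError(
            f"{term!r} does not match {collection['idpattern']!r} "
            f"for namespace {collection['namespace']!r}"
        )
    return collection["urlpattern"].replace("{$id}", term)


if __name__ == "__main__":
    print(resolve(collection, "SBO:0000001"))
```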

Currently only basic REST endpoints are available. With the introduction of the elasticsearch endpoints in v0.3.0 the REST based search will improve considerably. For now, users should query the postgres database directly to work with the mappings (see information below).

Postgres database

The postgres database is accessible via

HOST: localhost
PORT: 5434
DB: adb
USER: adb
PASSWORD: adb
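
Many clients accept these settings as a single connection URI. The sketch below assembles one from the values listed above using only the standard library; the function name `postgres_uri` is hypothetical. The resulting string can be passed to libpq-based clients such as `psql` or, assuming the third-party package is installed, `psycopg2.connect`.

```python
from urllib.parse import quote


def postgres_uri(host="localhost", port=5434, db="adb", user="adb", password="adb"):
    """Build a postgres connection URI from the local AnnotateDB settings.

    Hypothetical helper; defaults are the values listed above.
    User and password are percent-encoded in case they contain special characters.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )


if __name__ == "__main__":
    print(postgres_uri())
```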

The database contains the following main tables (see schema below):

  • adb_collection: A data source or miriam collection for annotation or xref information
  • adb_annotation: The combination of a term from a collection and the given collection
  • adb_mapping: Mapping between annotations, from source annotation to target annotation. The kind of mapping is defined by the qualifier. E.g. the qualifier BQM_IS encodes that the source annotation is the target annotation.
  • adb_evidence: Evidence for the given mapping between annotations.

In addition, the materialized view mapping_view is provided, which allows easy filtering and search of mapped annotations and annotation synonyms. For most use cases mapping_view is the table to work with.
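
To make the relationship between the four tables more concrete, here is a conceptual sketch in Python dataclasses. It is not the actual database schema: all class and field names are assumptions chosen for illustration, and only the overall shape (collection, annotation, mapping with qualifier, evidence) follows the descriptions above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Collection:
    """Conceptual stand-in for adb_collection (field names are assumptions)."""
    namespace: str


@dataclass(frozen=True)
class Annotation:
    """Conceptual stand-in for adb_annotation: a term within a collection."""
    term: str
    collection: Collection


@dataclass(frozen=True)
class Evidence:
    """Conceptual stand-in for adb_evidence: provenance of a mapping."""
    source: str
    method: str


@dataclass(frozen=True)
class Mapping:
    """Conceptual stand-in for adb_mapping: source -> target with a qualifier."""
    source: Annotation
    target: Annotation
    qualifier: str
    evidence: Tuple[Evidence, ...] = ()


# Example using identifiers that appear elsewhere in this README.
bigg = Collection("bigg.metabolite")
chebi = Collection("chebi")
mapping = Mapping(
    source=Annotation("10fthf", bigg),
    target=Annotation("CHEBI:698", chebi),
    qualifier="BQM_IS",
)
```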

Database schema

SQL queries

For instance, to look up the bigg.metabolite term for a given chebi identifier, use

SELECT source_term FROM mapping_view 
    WHERE (target_term = 'CHEBI:698' AND
           target_namespace = 'chebi' AND 
           source_namespace = 'bigg.metabolite' AND
           qualifier = 'IS');

which results in

('10fthf',)
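
When running such a lookup from a script, passing the values as query parameters avoids string formatting of SQL. The sketch below shows the pattern against an in-memory SQLite table that stands in for mapping_view, so it runs without the docker setup; the stand-in table, its column list, and the helper `lookup` are assumptions for illustration. Against the real postgres database a driver such as psycopg2 would be used instead, with a cursor and `%s` placeholders in place of `?`.

```python
import sqlite3

# Same query as above, with the literal values replaced by placeholders.
QUERY = """
SELECT source_term FROM mapping_view
    WHERE (target_term = ? AND
           target_namespace = ? AND
           source_namespace = ? AND
           qualifier = ?)
"""


def lookup(conn, target_term, target_namespace, source_namespace, qualifier="IS"):
    """Return matching source terms as a list of one-element tuples."""
    cur = conn.execute(
        QUERY, (target_term, target_namespace, source_namespace, qualifier)
    )
    return cur.fetchall()


# In-memory stand-in for mapping_view holding the single row from the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mapping_view ("
    "source_term TEXT, source_namespace TEXT, qualifier TEXT, "
    "target_term TEXT, target_namespace TEXT)"
)
conn.execute(
    "INSERT INTO mapping_view VALUES (?, ?, ?, ?, ?)",
    ("10fthf", "bigg.metabolite", "IS", "CHEBI:698", "chebi"),
)

rows = lookup(conn, "CHEBI:698", "chebi", "bigg.metabolite")
print(rows)  # [('10fthf',)]
```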

A more comprehensive list of SQL queries and use cases is provided here with output here.

Data sources

AnnotateDB uses the following data sources:

Collections

identifiers.org

Information on collections is based mainly on identifiers.org. Collections were parsed with sbmlutils.

Mappings

BiGG

A major source of annotation mappings is the BiGG Database, using information from the latest database release. AnnotateDB currently includes BiGG-v1.5.

Release notes

This section provides an overview of major changes and releases.

0.2.0

  • security fixes
  • django update (>3.0), elasticsearch update (7.7.1), postgres update (12.3)
  • replacing deprecated django-rest-swagger with drf-yasg

0.1.1

  • bug fixes in admin interface
  • bug fixes in frontend server
  • enforcing uniqueness of mappings & removing duplicates
  • materialized views
  • detailed postgres examples
  • updated documentation

0.1.0

  • vue frontend
  • bigg mappings import
  • database release files

0.0.1

  • django development server
  • first database schema
  • docker-compose files for backend, database and elasticsearch

Acknowledgements

We acknowledge

for their input and discussions.

© 2019-2020 Matthias König & Jan Grzegorzewski