Releases: PavlidisLab/Gemma

1.31.9

10 Jul 05:34
Tag hotfix

1.31.8

26 Jun 02:41

Changeset

  • fix various issues with platform and vector merging
  • new endpoint exposing batch information and effect (reserved for curators)
  • quantitation type can be retrieved by name in the REST API
  • improved batch creation and deletion of vectors
  • improve serialization of interaction and continuous factors when producing result sets in TSV

Improved encoding of interactions and continuous factors in result sets TSV output

Although rarely used, Gemma's linear model can handle continuous factors. The TSV output now fully supports this.

When we produce a TSV output for a result set, we need to encode three types of contrasts: single factor, interaction of two factors and continuous factors. Those are encoded as follows:

  • contrast_{fv_id}_{key} for a single factor
  • contrast_{fv_id1}_{fv_id2}_{key} for an interaction between two factors
  • contrast_{key} for a continuous factor

where {key} is one of coefficient, log2fc, tstat or pvalue.

Gemma is inherently limited to a single continuous factor per result set. If that were to change, we would have to account for this by adjusting the encoding.
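
As a rough illustration of the encoding described above, the following Python sketch classifies the column names of a result set TSV. It assumes that factor value IDs are purely numeric, which these notes do not state, and `parse_contrast_column` is a hypothetical helper rather than anything shipped with Gemma.

```python
import re

# Keys listed in the release notes.
KEYS = ("coefficient", "log2fc", "tstat", "pvalue")

# Assumption: factor value IDs are numeric; adjust the pattern if they are not.
_PATTERN = re.compile(
    r"^contrast(?:_(?P<fv1>\d+)(?:_(?P<fv2>\d+))?)?_(?P<key>%s)$" % "|".join(KEYS)
)


def parse_contrast_column(name):
    """Return (kind, factor_value_ids, key), or None if not a contrast column."""
    m = _PATTERN.match(name)
    if m is None:
        return None
    ids = tuple(int(g) for g in (m.group("fv1"), m.group("fv2")) if g is not None)
    # Zero IDs: continuous factor, one ID: single factor, two IDs: interaction.
    kind = ("continuous", "single", "interaction")[len(ids)]
    return kind, ids, m.group("key")
```

Because a continuous factor carries no factor value ID, a column with no ID can only be told apart while there is at most one continuous factor per result set, which is the limitation noted above.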

Retrieve differential expression results across datasets

The RESTful API has been bumped to 2.8.0 and features a new endpoint for retrieving DE results for a given gene across all datasets, subsets and result sets curated in Gemma.

Results can be filtered at the dataset level with the usual query and filter parameters, and paginated with offset and limit. They can also be filtered by corrected P-value using threshold, which rejects results with a poor fit for the given gene.

GET /datasets/analyses/differential/results/taxa/human/genes/BRCA1 HTTP/1.1

The endpoint can also be requested to produce a tabular output by passing Accept: text/tab-separated-values.

GET /datasets/analyses/differential/results/taxa/{taxon}/genes/{gene} HTTP/1.1
Accept: text/tab-separated-values
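
For scripted access, a minimal Python sketch of the same request could look like the following. The base URL is an assumption about where the REST API is deployed, and `build_de_results_request` is a hypothetical helper; only the path, the threshold, offset and limit parameters and the Accept header come from the notes above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: base URL of the Gemma REST API; adjust for your deployment.
BASE_URL = "https://gemma.msl.ubc.ca/rest/v2"


def build_de_results_request(taxon, gene, threshold=None, offset=None, limit=None, tsv=False):
    """Build (but do not send) a request for DE results of a gene across datasets."""
    path = "/datasets/analyses/differential/results/taxa/%s/genes/%s" % (
        quote(str(taxon), safe=""),
        quote(str(gene), safe=""),
    )
    # Only include the parameters that were actually provided.
    candidates = (("threshold", threshold), ("offset", offset), ("limit", limit))
    params = {k: v for k, v in candidates if v is not None}
    url = BASE_URL + path + ("?" + urlencode(params) if params else "")
    # Ask for tabular output only when requested; JSON is the default otherwise.
    headers = {"Accept": "text/tab-separated-values"} if tsv else {}
    return Request(url, headers=headers)


req = build_de_results_request("human", "BRCA1", threshold=0.05, limit=20, tsv=True)
print(req.full_url)
```

The request object can then be passed to `urllib.request.urlopen` to actually perform the call.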

Retrieve raw vectors with quantitation type names

It is now possible to retrieve vectors for a given experiment by quantitation type name.

GET /datasets/{dataset}/data/raw?quantitationType={name}

Common quantitation type names for raw data vectors are:

  • log2cpm
  • counts
  • rpkm
  • rma value
  • value

The first three are used for RNA-Seq data.
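
Some of these names, such as "rma value", contain a space and must be percent-encoded when placed in the query string. A minimal Python sketch, assuming the base URL below and using a hypothetical `raw_vectors_url` helper (the dataset identifier is only a placeholder):

```python
from urllib.parse import quote, urlencode

# Assumption: base URL of the Gemma REST API; adjust for your deployment.
BASE_URL = "https://gemma.msl.ubc.ca/rest/v2"


def raw_vectors_url(dataset, quantitation_type_name):
    """Build the URL for raw vectors of a dataset, selected by quantitation type name."""
    path = "/datasets/%s/data/raw" % quote(str(dataset), safe="")
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"quantitationType": quantitation_type_name}, quote_via=quote)
    return BASE_URL + path + "?" + query


print(raw_vectors_url("GSE10721", "rma value"))
```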

1.31.6

16 May 21:02
Tag hotfix

1.31.3

03 Apr 23:04

This patch release brings substantial performance improvements for GemBrow and much more!

  • advanced search syntax with Lucene
  • numerous improvements to the search backend
  • batch loading and parameter padding to reuse prepared statements as much as possible
  • query optimization tailored for GemBrow
  • monitoring of the local task and database connection pools with Micrometer and JMX
  • improved batch confound detection for small sample sizes by @ppavlidis

Advanced search syntax

We used to have this feature before migrating to Hibernate Search. It's now fully back on! It can be used in the search interface, in GemBrow or via the REST API.

Query optimization

We've introduced a bunch of query utilities for batching and padding Hibernate parameter lists. This is a temporary solution until we migrate to Hibernate 5+, which supports this feature natively. Parameter padding and batching reduce the number of prepared statements that need to be managed by Hibernate.
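
The general idea can be sketched in a few lines of Python. This is not Gemma's implementation (its utilities are written in Java); the power-of-two padding strategy and the function names are assumptions made for illustration.

```python
def pad_to_power_of_two(values):
    """Pad a non-empty list to the next power-of-two length by repeating its last element."""
    values = list(values)
    if not values:
        raise ValueError("cannot pad an empty parameter list")
    size = 1
    while size < len(values):
        size *= 2
    # Repeating an element is harmless in an IN (...) clause: duplicates match nothing new.
    return values + [values[-1]] * (size - len(values))


def batch_and_pad(values, batch_size):
    """Split values into batches of at most batch_size, then pad each batch."""
    values = list(values)
    batches = [values[i:i + batch_size] for i in range(0, len(values), batch_size)]
    return [pad_to_power_of_two(b) for b in batches]


# 11 IDs with a batch size of 8 yield one list of 8 and one list of 4 parameters.
print([len(b) for b in batch_and_pad(range(1, 12), 8)])
```

With padding, an IN clause only ever sees 1, 2, 4, 8, ... parameters, so a handful of statement shapes covers every list length up to the batch size.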

1.31.2

12 Mar 23:22
Tag hotfix

1.31.0

23 Jan 18:38
  • introduce statements in FactorValues by @arteymix
  • improved parsing of GEO metadata for populating sample characteristics by @ppavlidis
  • replace Compass with Hibernate Search
  • new tool for finding obsolete terms by @ppavlidis
  • remove Guava
  • improve permission masking and joins in AclQueryUtils
  • migrated the CI to Jenkins Pipeline
  • gradually replacing Apache Configuration 2 with Spring's built-in support for property placeholders

FactorValue semantics

The main feature this release brings is the introduction of semantics in factor values. Previously, a factor value was annotated with a simple bag of ontology terms. In some cases, annotations were ambiguous and made it difficult to interpret the experimental design.

An example would be a treatment with two compounds and two doses. Which dose applies to which compound? This is resolved by creating two statements: "compound A delivered at dose B", "compound C delivered at dose D".
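
The difference can be sketched with a small Python data structure. The subject/predicate/object layout and the field names are assumptions made for illustration; they are not taken from Gemma's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical statement shape: a subject related to an object by a predicate."""
    subject: str
    predicate: str
    object_: str

    def __str__(self):
        return "%s %s %s" % (self.subject, self.predicate, self.object_)


# Old style: an unordered bag of terms, where the compound/dose pairing is lost.
bag_of_terms = {"compound A", "compound C", "dose B", "dose D"}

# New style: each statement ties one dose to one compound.
statements = [
    Statement("compound A", "delivered at", "dose B"),
    Statement("compound C", "delivered at", "dose D"),
]

for s in statements:
    print(s)
```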

REST API-wise, we now display statements alongside old-style characteristics.

Our factor values are also made available in OWL/RDF. For example, the FactorValue #138393 from GSE10721 can be retrieved with:

curl -H Accept:application/rdf+xml https://gemma.msl.ubc.ca/ont/TGFVO/138393

Full Changelog: 1.30.6...1.31.0

1.30.6

30 Nov 22:44

This is a small release that ensures database compatibility with the 1.31 series.

1.30.5

30 Nov 22:44
Tag hotfix

1.30.4

02 Nov 23:30
Tag hotfix

1.30.3

19 Oct 00:12
Tag hotfix