Commit

documentation
ppavlidis committed Aug 26, 2022
1 parent c1fd962 commit c120447
Showing 2 changed files with 30 additions and 11 deletions.
32 changes: 24 additions & 8 deletions R/convenience.R
@@ -16,7 +16,7 @@ setGemmaUser <- function(username = NULL, password = NULL) {

#' Gemma platform annotations
#'
#' Gets Gemma's platform annotation files that can be accessed from https://gemma.msl.ubc.ca/annots/
#' Gets Gemma's platform annotations including mappings of microarray probes to genes.
#'
#' @param platform A platform identifier @seealso getPlatforms
#' @param annotType Which GO terms should the output include
@@ -131,9 +131,9 @@ memgetPlatformAnnotation <- function(platform,
}


#' Dataset expression and design
#' Access gene expression data and metadata
#'
#' Combines various endpoint calls to return an annotated Bioconductor-compatible
#' Return an annotated Bioconductor-compatible
#' data structure of the queried dataset, including expression data and
#' the experimental design.
#'
@@ -251,14 +251,30 @@ getDatasetTidy <- function(dataset, filter = FALSE, memoised = getOption("gemma
dplyr::rename(sample = .data$Sample, probe = .data$Probe)
}

#' Dataset differential expression
#' Retrieve differential expression results
#'
#' Retrieves the differential expression resultSet(s) associated with the dataset.
#' If there is more than one resultSet, use [getDatasetResultSets()] or [getDatasetDEA()] to see
#' the options and get the ID you want. Alternatively, you can query the resultSet
#' Retrieves the differential expression result set(s) associated with the dataset.
#' If there is more than one result set, use [getDatasetResultSets()] or [getDatasetDEA()] to see
#' the options and get the ID you want. Alternatively, you can query the result set
#' directly if you know its ID beforehand.
#'
#' In Gemma, each result set corresponds to
#' the estimated effects associated with a single factor in the design, and each can have multiple contrasts (one for each level compared to baseline).
#' Thus a dataset with a 2x3 factorial design will have two result sets: one with a single contrast and one with two contrasts.
#'
#' Methodology for differential expression is explained in \href{https://doi.org/10.1093/database/baab006}{Curation of over 10000 transcriptomic studies to enable data reuse}. Specifically, "differential expression analysis is performed on the dataset based on the annotated experimental design. In cases where certain terms are used (e.g. ‘reference substance role’ (OBI_0000025), ‘reference subject role’ (OBI_0000220), ‘initial time point’ (EFO_0004425), ‘wild type genotype’ (EFO_0005168), ‘control’ (EFO_0001461), etc.), Gemma automatically assigns these conditions as the baseline control group; in absence of a clear control condition, a baseline is arbitrarily selected. To perform the analysis, a generalized linear model is fit to the data for each platform element (probe/gene). For RNA-seq data, we use weighted regression, using an in-house implementation of the voom algorithm to compute weights from the mean–variance relationship of the data. Contrasts of each condition are then compared to the selected baseline. In datasets where the ‘batch’ factor is confounded with another factor, separate differential expression analyses are performed on subsets of the data; the subsets being determined by the levels of the confounding factor."
#' The methodology for differential expression is explained in \href{https://doi.org/10.1093/database/baab006}{Curation of over 10000 transcriptomic studies to enable data reuse}.
#' Briefly, differential expression analysis is performed on the dataset based on the annotated
#' experimental design with up to three potentially nested factors.
#' Gemma attempts to automatically assign baseline conditions for each factor.
#' In the absence of a clear control condition, a baseline is arbitrarily selected.
#' A generalized linear model with empirical Bayes shrinkage of t-statistics is fit to the data
#' for each platform element (probe/gene) using an implementation of the limma algorithm. For RNA-seq data,
#' we use weighted regression, applying the
#' voom algorithm to compute weights from the mean–variance relationship of the data.
#' Contrasts of each condition against the selected baseline are then computed.
#' In some situations, Gemma will split the data into subsets for analysis.
#' This typically happens when a ‘batch’ factor is confounded with another factor,
#' in which case the subsets are determined by the levels of the confounding factor.
#'
#' @param dataset A dataset identifier.
#' @param resultSet A resultSet identifier.
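The relationship between factors, result sets and contrasts described in the documentation above can be illustrated with a small base R sketch. This is not Gemma's implementation: the factor names (`genotype`, `treatment`), their levels, and the main-effects-only model are assumptions chosen for illustration. With treatment coding against a baseline level, a two-level factor yields one contrast and a three-level factor yields two.

```r
# Hypothetical 2x3 factorial design. The first level of each factor
# plays the role of the baseline (control) condition.
design <- expand.grid(
  genotype  = factor(c("wild_type", "mutant"),
                     levels = c("wild_type", "mutant")),
  treatment = factor(c("control", "drug_a", "drug_b"),
                     levels = c("control", "drug_a", "drug_b"))
)

# Main-effects model matrix using R's default treatment contrasts.
model_formula <- ~ genotype + treatment
mm <- model.matrix(model_formula, data = design)

# The "assign" attribute maps each column to a model term
# (0 = intercept, 1 = first term, 2 = second term).
assign_idx  <- attr(mm, "assign")
term_labels <- attr(terms(model_formula), "term.labels")
is_contrast <- assign_idx > 0

# Group the non-intercept coefficients by the factor they belong to.
# Each group is analogous to one result set; each coefficient is
# analogous to one contrast against the baseline level.
contrasts_by_factor <- split(
  colnames(mm)[is_contrast],
  term_labels[assign_idx[is_contrast]]
)

print(lengths(contrasts_by_factor))
```

Each element of `contrasts_by_factor` lists the non-baseline levels of one factor, which mirrors the "one result set per factor, one contrast per non-baseline level" structure described in the text.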
9 changes: 6 additions & 3 deletions R/gemma.R.R
@@ -1,8 +1,8 @@
#' gemma.R package: Access curated gene expression data
#' gemma.R package: Access curated gene expression data and differential expression analyses
#'
#' This package containts low- and high-level wrappers for Gemma's RESTful API
#' This package contains wrappers and convenience functions for Gemma's RESTful API
#' that enable access to curated expression and differential expression data
#' from over 10,000 published studies. Gemma is a web site, database and a set
#' from over 15,000 published studies (as of mid-2022). Gemma (https://gemma.msl.ubc.ca) is a web site, database and a set
#' of tools for the meta-analysis, re-use and sharing of genomics data,
#' currently primarily targeted at the analysis of gene expression profiles.
#'
@@ -13,6 +13,9 @@
#' \item Gene endpoints: Access information about specific genes.
#' }
#'
#' Most users will want to start with the high-level functions getDataset, getDatasetDE and getPlatformAnnotation.
#' Additional lower-level methods are available that directly map to the Gemma RESTful API methods.
#'
#' For more information and detailed usage instructions check the
#' \href{https://pavlidislab.github.io/gemma.R/index.html}{README}, the
#' \href{https://pavlidislab.github.io/gemma.R/reference/index.html}{function reference}