Update README to include list of popular software included in docker image #8745

Merged: 5 commits, Mar 19, 2024
29 changes: 29 additions & 0 deletions README.md
@@ -19,6 +19,7 @@ releases of the toolkit.
* [Requirements](#requirements)
* [Quick Start Guide](#quickstart)
* [Downloading GATK4](#downloading)
* [Tools Included in Docker Image](#dockerSoftware)
* [Building GATK4](#building)
* [Running GATK4](#running)
* [Passing JVM options to gatk](#jvmoptions)
@@ -115,6 +116,34 @@ You can download and run pre-built versions of GATK4 from the following places:
* You can download a GATK4 docker image from [our dockerhub repository](https://hub.docker.com/r/broadinstitute/gatk/). We also host unstable nightly development builds on [this dockerhub repository](https://hub.docker.com/r/broadinstitute/gatk-nightly/).
* Within the docker image, run gatk commands as usual from the default startup directory (/gatk).

### <a name="dockerSoftware">Tools Included in Docker Image</a>

Our docker image contains the following bioinformatics tools, which can be run by invoking the tool name from the command line:
* bedtools (v2.30.0)
* samtools (1.13)
* bcftools (1.13)
* tabix (1.13+ds)
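
As a quick sanity check, the following minimal sketch looks up each of the tools listed above on the `PATH`. It assumes it is run inside the docker image; the helper name `find_tools` is hypothetical and not part of GATK.

```python
import shutil

# Tool names exactly as listed in the README section above.
TOOLS = ["bedtools", "samtools", "bcftools", "tabix"]


def find_tools(names=TOOLS):
    """Map each tool name to its resolved path on PATH, or None if absent.

    Hypothetical helper for illustration; it only checks that an executable
    with that name is discoverable, not which version is installed.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

This checks presence only. To confirm the versions quoted above, run each tool's own version command by hand.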

We also include an installation of Python3 (3.6.10) with the following popular packages included:
* numpy
* scipy
* tensorflow
* pymc3
* keras
* scikit-learn
* matplotlib
* pandas
* biopython
* pyvcf
* pysam
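
The packages above can be checked for importability with a sketch like the one below. It assumes the image's Python environment is the active interpreter. Note that a few import names differ from the package names (`sklearn`, `Bio`, `vcf`); the helper `missing_packages` is hypothetical and not part of GATK.

```python
import importlib.util

# Package name (as listed in the README) -> top-level module name to import.
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "tensorflow": "tensorflow",
    "pymc3": "pymc3",
    "keras": "keras",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "biopython": "Bio",
    "pyvcf": "vcf",
    "pysam": "pysam",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the sorted package names whose module cannot be located.

    Uses importlib.util.find_spec, which locates a top-level module
    without actually importing it.
    """
    return sorted(
        pkg
        for pkg, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    print("all packages found" if not missing else f"missing: {missing}")
```

Using `find_spec` rather than a real import keeps the check fast, since heavy packages such as tensorflow are never loaded.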

We also include an installation of R (3.6.2) with the following popular packages included:
* data.table
* dplyr
* ggplot2

For details on system packages, see the GATK [Base Dockerfile](scripts/docker/gatkbase/Dockerfile). For details on the Python3/R packages, including their versions, see the [Conda environment setup file](scripts/gatkcondaenv.yml.template).

## <a name="building">Building GATK4</a>

* **To do a full build of GATK4, first clone the GATK repository using "git clone", then run:**
1 change: 1 addition & 0 deletions scripts/docker/gatkbase/Dockerfile
@@ -1,5 +1,6 @@
# Using OpenJDK 17
# This Dockerfile does not require any files that are in the GATK4 repo.
# NOTE: If you update the ubuntu version make sure to update the samtools/bcftools/bedtools versions in the README
FROM ubuntu:22.04

# Avoid interactive prompts during apt installs/upgrades
2 changes: 2 additions & 0 deletions scripts/gatkcondaenv.yml.template
@@ -10,6 +10,8 @@
# used by the testGATKPythonEnvironmentPackagePresent test in PythonEnvironmentIntegrationTest needs to be updated
# to reflect the changes.
#
# NOTE: If you update any of the packages below, please make sure the main README is up to date with the latest package install information.
#
name: $condaEnvName
channels:
# if channels other than conda-forge are added and the channel order is changed (note that conda channel_priority is currently set to flexible),