Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the GATK base image to a newer LTS ubuntu release #8610

Merged
merged 13 commits into from
Dec 12, 2023
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 1 addition & 3 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
ARG BASE_DOCKER=broadinstitute/gatk:gatkbase-3.1.0
ARG BASE_DOCKER=us.gcr.io/broad-dsde-methods/gatk-base-image-staging-area:3.2.0rc7

# stage 1 for constructing the GATK zip
FROM ${BASE_DOCKER} AS gradleBuild
Expand Down Expand Up @@ -93,8 +93,6 @@ RUN conda env create -n gatk -f /gatk/gatkcondaenv.yml && \
echo "source activate gatk" >> /gatk/gatkenv.rc && \
echo "source /gatk/gatk-completion.sh" >> /gatk/gatkenv.rc && \
conda clean -afy && \
find /opt/miniconda/ -follow -type f -name '*.a' -delete && \
find /opt/miniconda/ -follow -type f -name '*.pyc' -delete && \
rm -rf /root/.cache/pip

CMD ["bash", "--init-file", "/gatk/gatkenv.rc"]
Expand Down
43 changes: 25 additions & 18 deletions scripts/docker/gatkbase/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,12 +1,16 @@
# Using OpenJDK 17
# This Dockerfile does not require any files that are in the GATK4 repo.
FROM ubuntu:18.04
FROM ubuntu:22.04

# Avoid interactive prompts during apt installs/upgrades
ENV DEBIAN_FRONTEND noninteractive

#### Basic image utilities
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y --no-install-recommends \
RUN apt update && \
apt full-upgrade -y && \
apt install -y --no-install-recommends \
python3 \
python3-pip \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this system pip required for anything?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it might be useful to have a pip in the base environment, but I can remove it if you think it might cause problems?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fine, just curious!

wget \
curl \
bc \
Expand All @@ -15,14 +19,17 @@ RUN apt-get update && \
less \
bedtools \
samtools \
bcftools \
tabix \
git \
gpg-agent \
build-essential \
openjdk-17-jdk \
vim \
software-properties-common && \
apt-get -y clean && \
apt-get -y autoclean && \
apt-get -y autoremove && \
apt -y clean && \
apt -y autoclean && \
apt -y autoremove && \
rm -rf /var/lib/apt/lists/*

RUN java -version
Expand All @@ -31,11 +38,11 @@ RUN java -version
RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" \
| tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg \
| apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - && \
apt-get update -y && \
apt-get install -y --no-install-recommends google-cloud-cli && \
apt-get -y clean && \
apt-get -y autoclean && \
apt-get -y autoremove && \
apt update -y && \
apt install -y --no-install-recommends google-cloud-cli && \
apt -y clean && \
apt -y autoclean && \
apt -y autoremove && \
rm -rf /var/lib/apt/lists/* && \
# Remove the anthos cli tool and related files since they are very large and we / anyone using the docker are unlikely to use them
# Remove the bundled python because we have python installed separately
Expand All @@ -55,16 +62,16 @@ ENV JAVA_LIBRARY_PATH /usr/lib/jni

# Install miniconda
ENV DOWNLOAD_DIR /downloads
ENV CONDA_URL https://repo.anaconda.com/miniconda/Miniconda3-py310_23.1.0-1-Linux-x86_64.sh
ENV CONDA_MD5 = "32d73e1bc33fda089d7cd9ef4c1be542616bd8e437d1f77afeeaf7afdb019787"
ENV CONDA_URL https://repo.anaconda.com/miniconda/Miniconda3-py310_23.10.0-1-Linux-x86_64.sh
ENV CONDA_SHA256 "c7a34df472feb69805b64df6e8db58363c5ccab41cd3b40b07e3e6dfb924359a"
ENV CONDA_PATH /opt/miniconda
ENV PATH $CONDA_PATH/bin:$PATH
RUN mkdir $DOWNLOAD_DIR && \
wget -nv -O $DOWNLOAD_DIR/miniconda.sh $CONDA_URL && \
test "`md5sum $DOWNLOAD_DIR/miniconda.sh | awk -v FS=' ' '{print $1}'` = $CONDA_MD5" && \
test "$(sha256sum $DOWNLOAD_DIR/miniconda.sh | awk -v FS=' ' -v ORS='' '{print $1}')" = "$CONDA_SHA256" && \
bash $DOWNLOAD_DIR/miniconda.sh -p $CONDA_PATH -b && \
rm $DOWNLOAD_DIR/miniconda.sh && \
conda clean -afy && \
find /opt/miniconda/ -follow -type f -name '*.a' -delete && \
find /opt/miniconda/ -follow -type f -name '*.pyc' -delete && \
rm -rf /root/.cache/pip
conda config --set auto_update_conda false && \
conda config --set solver libmamba && \
rm -rf /root/.cache/pip
20 changes: 20 additions & 0 deletions scripts/docker/gatkbase/stage_docker_base_in_cloud.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
#!/bin/bash

if [ $# -ne 1 ]; then
echo "Usage: $0 docker_image_version"
exit 1
fi

IMAGE_VERSION=$1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build_docker_base.sh script currently has an (outdated) image version hardcoded in---maybe parametrize it there as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am definitely going to update the old build script before merging this, and also add a separate script to release a pre-staged base image to the official location

IMAGE_NAME="us.gcr.io/broad-dsde-methods/gatk-base-image-staging-area"
DOCKER_IMAGE_TAG="${IMAGE_NAME}:${IMAGE_VERSION}"

gcloud builds submit --tag ${DOCKER_IMAGE_TAG} --timeout=24h --machine-type n1_highcpu_32

if [ $? -ne 0 ]; then
echo "gcloud builds submit failed"
exit 1
fi

echo "Successfully published image to staging area at ${DOCKER_IMAGE_TAG}"
exit 0
2 changes: 1 addition & 1 deletion scripts/gatkcondaenv.yml.template
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ dependencies:

# core python dependencies
- conda-forge::python=3.6.10 # do not update
- pip=20.0.2 # specifying channel may cause a warning to be emitted by conda
- conda-forge::pip=21.3.1
- conda-forge::mkl=2019.5 # MKL typically provides dramatic performance increases for theano, tensorflow, and other key dependencies
- conda-forge::mkl-service=2.3.0
- conda-forge::joblib=1.1.1 # must pin joblib - versions after 1.1.1 no longer support python 3.6
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -158,17 +158,18 @@ public void testValidInputs(final String tag,
private static void assertOutputs(final String tag,
final String outputPrefix) {
// vcf.idx files are not reproducible
SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s/%s %s",
EXPECTED_TEST_FILES_DIR,
tag + LabeledVariantAnnotationsWalker.ANNOTATIONS_HDF5_SUFFIX,
outputPrefix + LabeledVariantAnnotationsWalker.ANNOTATIONS_HDF5_SUFFIX));
SystemCommandUtilsTest.runSystemCommand(String.format("diff %s/%s.vcf %s.vcf",
EXPECTED_TEST_FILES_DIR, tag, outputPrefix));
SystemCommandUtilsTest.runH5Diff(
String.format("%s/%s", EXPECTED_TEST_FILES_DIR, tag + LabeledVariantAnnotationsWalker.ANNOTATIONS_HDF5_SUFFIX),
String.format("%s", outputPrefix + LabeledVariantAnnotationsWalker.ANNOTATIONS_HDF5_SUFFIX));

SystemCommandUtilsTest.runDiff(
String.format("%s/%s.vcf", EXPECTED_TEST_FILES_DIR, tag),
String.format("%s.vcf", outputPrefix));

if (tag.contains("posUn")) {
SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s/%s %s",
EXPECTED_TEST_FILES_DIR,
tag + ExtractVariantAnnotations.UNLABELED_TAG + ExtractVariantAnnotations.ANNOTATIONS_HDF5_SUFFIX,
outputPrefix + ExtractVariantAnnotations.UNLABELED_TAG + ExtractVariantAnnotations.ANNOTATIONS_HDF5_SUFFIX));
SystemCommandUtilsTest.runH5Diff(
String.format("%s/%s", EXPECTED_TEST_FILES_DIR, tag + ExtractVariantAnnotations.UNLABELED_TAG + ExtractVariantAnnotations.ANNOTATIONS_HDF5_SUFFIX),
String.format("%s", outputPrefix + ExtractVariantAnnotations.UNLABELED_TAG + ExtractVariantAnnotations.ANNOTATIONS_HDF5_SUFFIX));
} else {
Assert.assertFalse(new File(outputPrefix + ExtractVariantAnnotations.UNLABELED_TAG + ExtractVariantAnnotations.ANNOTATIONS_HDF5_SUFFIX).exists());
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
import org.broadinstitute.hellbender.utils.io.IOUtils;
import org.broadinstitute.hellbender.utils.io.Resource;
import org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException;
import org.broadinstitute.hellbender.utils.runtime.ProcessController;
import org.testng.Assert;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
Expand Down Expand Up @@ -163,12 +164,17 @@ public void testValidInputs(final String tag,
private static void assertExpectedOutputs(final String tag,
final String outputPrefix) {
// vcf.idx files are not reproducible
SystemCommandUtilsTest.runSystemCommand(String.format("diff %s/%s.vcf %s.vcf",
EXPECTED_TEST_FILES_DIR, tag, outputPrefix));
SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s/%s.annot.hdf5 %s.annot.hdf5",
EXPECTED_TEST_FILES_DIR, tag, outputPrefix));
SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s/%s.scores.hdf5 %s.scores.hdf5",
EXPECTED_TEST_FILES_DIR, tag, outputPrefix));
SystemCommandUtilsTest.runDiff(
String.format("%s/%s.vcf", EXPECTED_TEST_FILES_DIR, tag),
String.format("%s.vcf", outputPrefix));

SystemCommandUtilsTest.runH5Diff(
String.format("%s/%s.annot.hdf5", EXPECTED_TEST_FILES_DIR, tag),
String.format("%s.annot.hdf5", outputPrefix));

SystemCommandUtilsTest.runH5Diff(
String.format("%s/%s.scores.hdf5", EXPECTED_TEST_FILES_DIR, tag),
String.format("%s.scores.hdf5", outputPrefix));
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

import org.broadinstitute.hellbender.GATKBaseTest;
import org.broadinstitute.hellbender.exceptions.GATKException;
import org.broadinstitute.hellbender.utils.runtime.ProcessController;
import org.testng.Assert;
import org.testng.annotations.Test;

Expand All @@ -16,47 +17,50 @@
"org/broadinstitute/hellbender/tools/walkers/vqsr/scalable/extract");
private static final File EXPECTED_TEST_FILES_DIR = new File(TEST_FILES_DIR, "expected");

static void runSystemCommand(final String command) {
logger.debug(String.format("Testing command: %s", command));
try {
final ProcessBuilder processBuilder = new ProcessBuilder("sh", "-c", command).redirectErrorStream(true);
final Process process = processBuilder.start();

final BufferedReader stdInReader = new BufferedReader(new InputStreamReader(process.getInputStream()));
String stdInLine;
while ((stdInLine = stdInReader.readLine()) != null) {
Assert.fail(String.format("The command \"%s\" resulted in: %s", command, stdInLine));
}
stdInReader.close();

} catch (final IOException e) {
throw new GATKException.ShouldNeverReachHereException(e.getMessage());
}
static void runH5Diff(final String expected, final String actual) {
final ProcessController controller = ProcessController.getThreadLocal();

// -r: Report mode. Print the differences.
// --use-system-epsilon: Return a difference if and only if the difference between two data values exceeds
// the system value for epsilon.
final String[] command = new String[] { "h5diff", "-r", "--use-system-epsilon", expected, actual };

runProcessAndCaptureOutputInExceptionMessage(controller, command);
}

static void runDiff(final String expected, final String actual) {
final ProcessController controller = ProcessController.getThreadLocal();
final String[] command = new String[] { "diff", expected, actual };
runProcessAndCaptureOutputInExceptionMessage(controller, command);
}

@Test(groups = {"python"}) // python environment is required to use h5diff
public void testRunSystemCommand() {
runSystemCommand(String.format("h5diff %s/extract.AS.indel.pos.annot.hdf5 %s/extract.AS.indel.pos.annot.hdf5",
EXPECTED_TEST_FILES_DIR, EXPECTED_TEST_FILES_DIR));
runSystemCommand(String.format("diff %s/extract.AS.indel.pos.vcf %s/extract.AS.indel.pos.vcf",
EXPECTED_TEST_FILES_DIR, EXPECTED_TEST_FILES_DIR));
public void testRunH5DiffIdenticalFile() {
runH5Diff(String.format("%s/extract.AS.indel.pos.annot.hdf5", EXPECTED_TEST_FILES_DIR),
String.format("%s/extract.AS.indel.pos.annot.hdf5", EXPECTED_TEST_FILES_DIR));
}

@Test
public void testRunDiffIdenticalFile() {
runDiff(String.format("%s/extract.AS.indel.pos.vcf", EXPECTED_TEST_FILES_DIR),
String.format("%s/extract.AS.indel.pos.vcf", EXPECTED_TEST_FILES_DIR));

Check warning on line 46 in src/test/java/org/broadinstitute/hellbender/tools/walkers/vqsr/scalable/SystemCommandUtilsTest.java

View check run for this annotation

Codecov / codecov/patch

src/test/java/org/broadinstitute/hellbender/tools/walkers/vqsr/scalable/SystemCommandUtilsTest.java#L45-L46

Added lines #L45 - L46 were not covered by tests
}

@Test(expectedExceptions = AssertionError.class, groups = {"python"}) // python environment is required to use h5diff
public void testRunSystemCommandH5diffException() {
runSystemCommand(String.format("h5diff %s/extract.AS.indel.pos.annot.hdf5 %s/extract.AS.snp.pos.annot.hdf5",
EXPECTED_TEST_FILES_DIR, EXPECTED_TEST_FILES_DIR));
public void testRunH5DiffException() {
runH5Diff(String.format("%s/extract.AS.indel.pos.annot.hdf5", EXPECTED_TEST_FILES_DIR),
String.format("%s/extract.AS.snp.pos.annot.hdf5", EXPECTED_TEST_FILES_DIR));
}

@Test(expectedExceptions = AssertionError.class)
public void testRunSystemCommandDiffException() {
runSystemCommand(String.format("diff %s/extract.AS.indel.pos.vcf %s/extract.AS.snp.pos.vcf",
EXPECTED_TEST_FILES_DIR, EXPECTED_TEST_FILES_DIR));
public void testRunDiffException() {
runDiff(String.format("%s/extract.AS.indel.pos.vcf", EXPECTED_TEST_FILES_DIR),
String.format("%s/extract.AS.snp.pos.vcf", EXPECTED_TEST_FILES_DIR));

Check warning on line 58 in src/test/java/org/broadinstitute/hellbender/tools/walkers/vqsr/scalable/SystemCommandUtilsTest.java

View check run for this annotation

Codecov / codecov/patch

src/test/java/org/broadinstitute/hellbender/tools/walkers/vqsr/scalable/SystemCommandUtilsTest.java#L57-L58

Added lines #L57 - L58 were not covered by tests
}

@Test(expectedExceptions = AssertionError.class)
public void testRunSystemCommandDiffNoSuchFileException() {
runSystemCommand(String.format("diff %s/blahblah %s/extract.AS.snp.pos.vcf",
EXPECTED_TEST_FILES_DIR, EXPECTED_TEST_FILES_DIR));
public void testRunDiffNoSuchFile() {
runDiff(String.format("%s/blahblah", EXPECTED_TEST_FILES_DIR),
String.format("%s/extract.AS.snp.pos.vcf", EXPECTED_TEST_FILES_DIR));

Check warning on line 64 in src/test/java/org/broadinstitute/hellbender/tools/walkers/vqsr/scalable/SystemCommandUtilsTest.java

View check run for this annotation

Codecov / codecov/patch

src/test/java/org/broadinstitute/hellbender/tools/walkers/vqsr/scalable/SystemCommandUtilsTest.java#L63-L64

Added lines #L63 - L64 were not covered by tests
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
import org.broadinstitute.hellbender.tools.walkers.vqsr.scalable.modeling.VariantAnnotationsModelBackend;
import org.broadinstitute.hellbender.utils.io.IOUtils;
import org.broadinstitute.hellbender.utils.io.Resource;
import org.broadinstitute.hellbender.utils.runtime.ProcessController;
import org.testng.Assert;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
Expand Down Expand Up @@ -179,14 +180,13 @@ private static void assertExpectedOutputsForVariantType(final String tag,
final String tagAndVariantType = String.format("%s.%s", tag, variantType);
final String outputPrefixAndVariantType = String.format("%s.%s", outputPrefix, variantType);

SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s/%s %s",
EXPECTED_TEST_FILES_DIR,
tagAndVariantType + TrainVariantAnnotationsModel.TRAINING_SCORES_HDF5_SUFFIX,
outputPrefixAndVariantType + TrainVariantAnnotationsModel.TRAINING_SCORES_HDF5_SUFFIX));
SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s/%s %s",
EXPECTED_TEST_FILES_DIR,
tagAndVariantType + TrainVariantAnnotationsModel.CALIBRATION_SCORES_HDF5_SUFFIX,
outputPrefixAndVariantType + TrainVariantAnnotationsModel.CALIBRATION_SCORES_HDF5_SUFFIX));
SystemCommandUtilsTest.runH5Diff(
String.format("%s/%s", EXPECTED_TEST_FILES_DIR, tagAndVariantType + TrainVariantAnnotationsModel.TRAINING_SCORES_HDF5_SUFFIX),
String.format("%s", outputPrefixAndVariantType + TrainVariantAnnotationsModel.TRAINING_SCORES_HDF5_SUFFIX));

SystemCommandUtilsTest.runH5Diff(
String.format("%s/%s", EXPECTED_TEST_FILES_DIR, tagAndVariantType + TrainVariantAnnotationsModel.CALIBRATION_SCORES_HDF5_SUFFIX),
String.format("%s", outputPrefixAndVariantType + TrainVariantAnnotationsModel.CALIBRATION_SCORES_HDF5_SUFFIX));

assertScorerExpectedOutputs(tagAndVariantType, outputPrefixAndVariantType);

Expand Down Expand Up @@ -252,12 +252,13 @@ public void testSNPOnlyModelsFromSNPOnlyAndSNPPlusIndelAnnotationsAreIdentical()
.apply(argsBuilderSNPPlusIndel);
runCommandLine(argsBuilderSNPPlusIndel);

SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s %s",
SystemCommandUtilsTest.runH5Diff(
outputPrefixSNPOnly + ".snp" + TrainVariantAnnotationsModel.TRAINING_SCORES_HDF5_SUFFIX,
outputPrefixSNPPlusIndel + ".snp" + TrainVariantAnnotationsModel.TRAINING_SCORES_HDF5_SUFFIX));
SystemCommandUtilsTest.runSystemCommand(String.format("h5diff %s %s",
outputPrefixSNPPlusIndel + ".snp" + TrainVariantAnnotationsModel.TRAINING_SCORES_HDF5_SUFFIX);

SystemCommandUtilsTest.runH5Diff(
outputPrefixSNPOnly + ".snp" + TrainVariantAnnotationsModel.CALIBRATION_SCORES_HDF5_SUFFIX,
outputPrefixSNPPlusIndel + ".snp" + TrainVariantAnnotationsModel.CALIBRATION_SCORES_HDF5_SUFFIX));
outputPrefixSNPPlusIndel + ".snp" + TrainVariantAnnotationsModel.CALIBRATION_SCORES_HDF5_SUFFIX);
}

@Test(expectedExceptions = IllegalArgumentException.class) // python environment is required to run tool
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,17 @@ public static void runProcess(final ProcessController processController, final S
runProcess(processController, command, "Process exited with non-zero value. Command: "+ Arrays.toString(command) + "\n");
}

/**
* run an external process and assert that it finishes with exit code 0
* if it exits with non-zero, include the process stdout/stderr in the exception message
*
* @param processController a ProcessController to use
* @param command command to run, the 0th element must be the executable name
*/
public static void runProcessAndCaptureOutputInExceptionMessage(final ProcessController processController, final String[] command) {
runProcess(processController, command, null,"Process exited with non-zero value. Command: "+ Arrays.toString(command) + "\n", true);
}

/**
* run an external process and assert that it finishes with exit code 0
* @param processController a ProcessController to use
Expand All @@ -71,12 +82,37 @@ public static void runProcess(final ProcessController processController, final S
* @param message error message to display on failure
*/
public static void runProcess(final ProcessController processController, final String[] command, final Map<String, String> environment, final String message) {
runProcess(processController, command, environment, message, false);
}

/**
* run an external process and assert that it finishes with exit code 0
* @param processController a ProcessController to use
* @param command command to run, the 0th element must be the executable name
* @param environment what to use as the process environment variables
* @param message error message to display on failure
* @param includeProcessOutputInExceptionMessage if true, include the process's stdout/stderr (if any) in the exception
* message thrown upon exit with a non-zero value
*/
public static void runProcess(final ProcessController processController, final String[] command, final Map<String, String> environment, final String message, final boolean includeProcessOutputInExceptionMessage) {
final ProcessSettings prs = new ProcessSettings(command);
prs.getStderrSettings().printStandard(true);
prs.getStdoutSettings().printStandard(true);

// Capture stdout/stderr to a buffer if we need to include it in the exception message
if ( includeProcessOutputInExceptionMessage ) {
prs.getStderrSettings().setBufferSize(65536);
prs.getStdoutSettings().setBufferSize(65536);
}

prs.setEnvironment(environment);
final ProcessOutput output = processController.exec(prs);
Assert.assertEquals(output.getExitValue(), 0, message);

final String fullMessage = includeProcessOutputInExceptionMessage ?
message + "\n***Process stdout:***\n" + output.getStdout() + "\n***Process stderr:***\n" + output.getStderr() :
message;

Assert.assertEquals(output.getExitValue(), 0, fullMessage);
}

/**
Expand Down
Loading