Store additional metadata for ExpressionExperiment and BioAssay in the database #668

Draft
wants to merge 1 commit into
base: development
from

arteymix
Member

@arteymix arteymix commented Apr 27, 2023

We currently store ExpressionExperiment (EE) metadata under metadata/{shortName} and detect a limited number of file formats organized in subdirectories.

This approach has limitations, however: we cannot associate EEs with arbitrary metadata files, and we need to adjust the code whenever a new format is to be supported. We also currently have no way to attach metadata, such as FastQC reports, to individual samples.

This PR proposes storing blobs in the database with more flexible metadata, which allows us to attach them at both the dataset and individual sample levels.

  • update ExpressionExperimentDataFetchController to also list metadata found in the database
  • fully implement CLI arguments for importing metadata
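
To make the proposal concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes a blob record that carries an owner level (dataset or sample), an owner ID, a free-form type label and a media type. All class, field and method names (OwnerLevel, MetadataBlob, InMemoryMetadataStore, attach, findByOwner) are hypothetical, and an in-memory list stands in for the database table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical: the level at which a metadata blob is attached.
enum OwnerLevel { EXPRESSION_EXPERIMENT, BIO_ASSAY }

// Hypothetical blob record; in a real schema this would be a database row.
final class MetadataBlob {
    final OwnerLevel ownerLevel;
    final long ownerId;
    final String type;       // free-form label, e.g. "SEQUENCING_OVERALL_REPORT"
    final String mediaType;  // e.g. "text/html"
    private final byte[] contents;

    MetadataBlob(OwnerLevel ownerLevel, long ownerId, String type,
                 String mediaType, byte[] contents) {
        this.ownerLevel = ownerLevel;
        this.ownerId = ownerId;
        this.type = type;
        this.mediaType = mediaType;
        this.contents = contents.clone(); // defensive copy
    }

    byte[] getContents() {
        return contents.clone();
    }
}

// Hypothetical store; a list stands in for the database here.
final class InMemoryMetadataStore {
    private final List<MetadataBlob> blobs = new ArrayList<>();

    void attach(MetadataBlob blob) {
        blobs.add(blob);
    }

    // Returns every blob attached to the given owner at the given level.
    List<MetadataBlob> findByOwner(OwnerLevel level, long ownerId) {
        List<MetadataBlob> result = new ArrayList<>();
        for (MetadataBlob blob : blobs) {
            if (blob.ownerLevel == level && blob.ownerId == ownerId) {
                result.add(blob);
            }
        }
        return result;
    }
}

class MetadataSketchDemo {
    public static void main(String[] args) {
        InMemoryMetadataStore store = new InMemoryMetadataStore();
        // A dataset-level report and a sample-level report (IDs are made up).
        store.attach(new MetadataBlob(OwnerLevel.EXPRESSION_EXPERIMENT, 1L,
                "SEQUENCING_OVERALL_REPORT", "text/html", new byte[] {1, 2}));
        store.attach(new MetadataBlob(OwnerLevel.BIO_ASSAY, 7L,
                "fastqc-report", "text/html", new byte[] {3}));
        System.out.println(store.findByOwner(OwnerLevel.BIO_ASSAY, 7L).size());
    }
}
```

Because the type is only a label on the record, supporting a new kind of file in this sketch means adding a new label rather than a new subdirectory convention, which is the flexibility the description above is after.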

@arteymix arteymix requested a review from ppavlidis April 27, 2023 13:19
@arteymix arteymix force-pushed the feature-additional-metadata-in-database branch from 6675042 to 105187e Compare April 27, 2023 13:23
Metadata can be attached at both ExpressionExperiment and BioAssay
levels.
@arteymix arteymix force-pushed the feature-additional-metadata-in-database branch from 105187e to 026e3c8 Compare April 27, 2023 13:23
* <p>
* Example: a MultiQC report on a {@link ExpressionExperiment}
*/
SEQUENCING_OVERALL_REPORT,
Member Author

@ppavlidis I'm looking for feedback on possible values to put here, for example anything that might make sense to include for microarray platforms, such as the output of APT tools.

Collaborator

There is already some infrastructure to support this (surfaced on the diagnostics tab), but it is a low priority right now.

Member Author

I reviewed it this morning and I synchronized all the MultiQC reports we have generated from the RNA-Seq pipeline.

I also finished adding a new argument to rnaseqDataAdd that copies the report into the Gemma data directory, so that it can be fully integrated into the pipeline.

@ppavlidis ppavlidis added this to the 1.31.0 milestone Apr 27, 2023
@arteymix arteymix force-pushed the development branch 2 times, most recently from 84692c7 to e0ec3da Compare December 4, 2023 19:58
@arteymix arteymix modified the milestones: 1.31.0, 1.32.0 Dec 5, 2023