allow backfilling of counts and rpkm without triggering preprocessing #771

Open
wants to merge 8 commits into
base: development
from

Conversation

ppavlidis
Collaborator

@ppavlidis ppavlidis commented Jul 17, 2023

To address curation issue 362.

Using the noLog2cpm option for RNASeqDataAddCli should allow us to backfill counts and rpkm for old data sets in a minimally invasive way.

Needs testing.

TODO

  • unit test for the CLI tool and argument parsing
  • unit test for skipping log2cpm recomputation and post-processing
  • test case when library size differs

@arteymix
Member

Ideally we would just compare the existing log2cpm with the newly computed one and trigger post-processing only if they differ.
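A minimal sketch of that comparison, assuming both log2cpm matrices are available as plain `double[][]` arrays with rows and columns in the same order. The class name `Log2cpmDiff`, the method `differs` and the tolerance value are hypothetical and not part of this PR or of the existing code base.

```java
final class Log2cpmDiff {

    // Hypothetical tolerance; the PR does not specify one.
    static final double TOLERANCE = 1e-6;

    /**
     * Returns true if the two matrices differ in shape, or if any pair of
     * values differs by more than TOLERANCE. Two NaN values at the same
     * position are treated as equal (both missing).
     */
    static boolean differs(double[][] existing, double[][] recomputed) {
        if (existing.length != recomputed.length) {
            return true;
        }
        for (int i = 0; i < existing.length; i++) {
            if (existing[i].length != recomputed[i].length) {
                return true;
            }
            for (int j = 0; j < existing[i].length; j++) {
                double a = existing[i][j];
                double b = recomputed[i][j];
                if (Double.isNaN(a) && Double.isNaN(b)) {
                    continue; // missing on both sides
                }
                if (Double.isNaN(a) || Double.isNaN(b)) {
                    return true; // missing on one side only
                }
                if (Math.abs(a - b) > TOLERANCE) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        double[][] stored = {{1.0, 2.5}, {Double.NaN, 4.0}};
        double[][] same = {{1.0, 2.5 + 1e-9}, {Double.NaN, 4.0}};
        double[][] changed = {{1.0, 2.6}, {Double.NaN, 4.0}};
        System.out.println(differs(stored, same));    // false
        System.out.println(differs(stored, changed)); // true
    }
}
```

With a check like this, post-processing would be triggered only when `differs` returns true. A tolerance is used rather than exact equality because recomputed floating-point values may not match the stored ones bit for bit.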

I'll write some unit tests for the CLI.

@arteymix arteymix self-assigned this Jul 24, 2023
@arteymix
Member

I'll write unit tests instead of adding an integration test because it's impractical to check if the preprocessing service has been interacted with.

verify( expressionExperimentService, times( 2 ) ).addRawVectors( same( ee ), any() );
dataUpdater.addCountData( ee, ad, countMatrix, rpkmMatrix, 30, false, false, true );
verify( expressionExperimentService ).replaceRawVectors( same( ee ), any() );
verifyNoMoreInteractions( preprocessorService );
Member

@ppavlidis I've added the unit test here. It checks that the preprocessor service is only invoked once.

I'll add a test to see how differing library sizes are handled. Let me know if you want other cases covered.

ba.setSequencePairedReads( isPairedReads );
}

if ( ba.getSequenceReadCount() != null && ba.getSequenceReadCount() != librarySize && requireExistingLibrarySizesMatch ) {
Member

We need a test case for this specific condition.
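As a starting point, the cases such a test would need to cover can be enumerated with a small stand-alone sketch of the condition. It assumes the existing read count is a nullable `Integer` and the new library size is an `int`; the real types are not visible in this diff. The class `LibrarySizeCheck` and the method `isMismatch` are hypothetical, and the sketch only evaluates the condition, since the code that runs when it holds is not shown here.

```java
final class LibrarySizeCheck {

    /**
     * Mirrors the condition under review: a mismatch is reported only when a
     * read count is already stored, it differs from the new library size, and
     * matching is required. The Integer is unboxed for the comparison, so
     * values are compared numerically rather than by reference.
     */
    static boolean isMismatch(Integer existingReadCount, int librarySize,
                              boolean requireExistingLibrarySizesMatch) {
        return existingReadCount != null
                && existingReadCount != librarySize
                && requireExistingLibrarySizesMatch;
    }

    public static void main(String[] args) {
        System.out.println(isMismatch(null, 1000, true));   // false: nothing stored yet
        System.out.println(isMismatch(1000, 1000, true));   // false: sizes agree
        System.out.println(isMismatch(1000, 2000, true));   // true: sizes differ, match required
        System.out.println(isMismatch(1000, 2000, false));  // false: sizes differ, match not required
    }
}
```

If both sides of the real comparison turn out to be boxed types, `!=` would compare references instead of values, so an actual test should include read counts outside the small-integer cache range (for example, values in the millions).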

@arteymix arteymix added this to the 1.30.1 milestone Aug 9, 2023
@arteymix arteymix modified the milestones: 1.30.1, 1.30.2, 1.30.3 Sep 14, 2023
@arteymix arteymix modified the milestones: 1.30.3, 1.31.0 Oct 6, 2023
@arteymix arteymix force-pushed the development branch 2 times, most recently from 84692c7 to e0ec3da Compare December 4, 2023 19:58
@arteymix arteymix modified the milestones: 1.31.0, 1.32.0 Dec 5, 2023