
FRASER maxing out memory on SGE cluster #29

Open
Jessen-Erik opened this issue Sep 24, 2021 · 3 comments

Comments

@Jessen-Erik

Hello!

We are attempting to use FRASER on a cohort of 400 samples. We've been having trouble completing FRASER when submitting the job to our SGE queue, even when providing 1.5 TB of memory. It seems the parallelized PSI calculation (fds <- calculatePSIValues(fds, BPPARAM=BPPARAM)) is pushing the job over our h_vmem allocation. We've attempted to force FRASER to run in serial (BPPARAM=SerialParam()), but we hit the same issue maxing out the memory.
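For reference, a minimal sketch of the serial call we are attempting (the surrounding qsub/h_vmem setup and file paths are omitted; fds is our already-counted FraserDataSet for the 400 samples):

```r
library(FRASER)
library(BiocParallel)

# Register the serial backend so no worker processes should be spawned,
# and additionally pass SerialParam() explicitly to the failing step.
register(SerialParam())
fds <- calculatePSIValues(fds, BPPARAM=SerialParam())
```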

Is it possible that FRASER is ignoring the serial setting?

Is there a quick fix for this, or does a solution similar to the one in the issue below need to be implemented?
gagneurlab/OUTRIDER#11

Thank you for the tool; we've really enjoyed running it on previous cohorts.

@ischeller
Collaborator

Hi @Jessen-Erik,
thanks for trying out FRASER!
Regarding your problem: can you check the dimensions of the fds object that you use as input for this step? Since this step is typically run before filtering, I suspect that you have a rather large fds object and that this, rather than the parallelization itself, is causing the problem. If that is indeed the case, you could apply the minExpressionInOneSample filter before the PSI calculation step (we provide the option to do this as part of the countRNAData function), as this typically reduces the number of junctions in the fds object a lot.
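As a minimal sketch (the threshold of 20 reads below is only illustrative, not a recommendation; argument names as in the FRASER documentation):

```r
library(FRASER)
library(BiocParallel)

# First check how large the object actually is:
# rows are junctions, columns are samples.
dim(fds)

# Re-run the counting step with the expression filter enabled, so that
# junctions not reaching the threshold in any sample are dropped before
# the PSI calculation.
fds <- countRNAData(fds, filter=TRUE, minExpressionInOneSample=20)
fds <- calculatePSIValues(fds, BPPARAM=SerialParam())
```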

@Jessen-Erik
Author

Jessen-Erik commented Sep 30, 2021 via email

@c-mertes
Collaborator

We are sorry for the delayed reply. Thanks for sharing the dimensions. 3 million junctions is large, but should not require 1.5 TB of memory. For the minExpressionInOneSample filter, a threshold of 1 read is enough, as it is only used to remove random alignments.
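As an illustration (same counting call as suggested above, just with the lower threshold):

```r
# The 1-read threshold is only meant to drop junctions stemming from
# random alignments, not to do any real expression filtering.
fds <- countRNAData(fds, filter=TRUE, minExpressionInOneSample=1)
```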

Since there was no further response, I assume that the minExpressionInOneSample filter step helped here.
