GeneQuant raw reads vs normalized #39

Open
davidsanin opened this issue May 4, 2023 · 2 comments

davidsanin commented May 4, 2023

Hi! Thanks for the cool tool!
When I start from read counts estimated by a different pipeline (STAR > featureCounts), should I provide Taiji with raw counts, or should I normalise them somehow before running?

My input looked like this and everything went smoothly.

ATAC-seq:
  - id: Naive_ATAC
    group: Naive
    replicates:
      - rep: 1
        files:
          - path: /ATAC/Naive_1.short.cleaned.bam
            tags: ['PairedEnd']

RNA-seq:
  - id: Naive_RNA
    group: Naive
    replicates:
      - rep: 1
        files:
          - path: /RNA_counts/Naive_1.tsv
            tags: ['GeneQuant']

Also, my ATAC BAM files were pre-filtered (duplicates and longer fragments removed). Should I add the "Filtered" tag to them?
I just wanted to make sure I was not missing something! Thanks!!

kaizhang (Member) commented May 4, 2023

Normalized gene expression should be used. Adding the "Filtered" tag won't affect the result but will speed up the processing.
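
The reply above does not say which normalization to use, so the following is only a sketch under assumptions: it converts a featureCounts table to TPM (an assumed choice; CPM or another length-aware measure may be equally acceptable to Taiji). It relies on the standard featureCounts layout (a leading `#` comment line, then `Geneid`, `Chr`, `Start`, `End`, `Strand`, `Length`, followed by one count column per BAM). The helper names `read_featurecounts` and `tpm` are made up for illustration, and the exact file layout Taiji expects for a `GeneQuant` input should be checked against the Taiji documentation.

```python
import csv
import io


def read_featurecounts(handle):
    """Parse featureCounts output into {gene_id: (length_bp, count)}.

    Only the first count column (the 7th field) is used.
    """
    # featureCounts writes a '#' comment line before the header row.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    count_col = reader.fieldnames[6]
    return {r["Geneid"]: (int(r["Length"]), float(r[count_col])) for r in reader}


def tpm(genes):
    """Convert {gene_id: (length_bp, count)} to {gene_id: TPM}."""
    # Reads per kilobase of gene length.
    rates = {g: c / (l / 1000.0) for g, (l, c) in genes.items() if l > 0}
    total = sum(rates.values())
    if total == 0:
        return {g: 0.0 for g in rates}
    # Scale so that all values sum to one million.
    return {g: r / total * 1e6 for g, r in rates.items()}


# Tiny made-up table, only to show the expected input layout.
example = (
    "# Program:featureCounts\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tNaive_1.bam\n"
    "GeneA\tchr1\t1\t1000\t+\t1000\t10\n"
    "GeneB\tchr1\t2000\t3999\t+\t2000\t20\n"
)
values = tpm(read_featurecounts(io.StringIO(example)))
for gene, value in sorted(values.items()):
    print(gene, value)
```

In the toy table GeneB has twice the reads of GeneA but is also twice as long, so both genes end up with the same TPM; that length correction is the main difference from simply passing raw counts.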

davidsanin (Author) commented

Thanks so much for your quick response!
Also, would you mind clarifying the meaning of some of the columns in the network output files?

In edges_binding, what is "affinity"?

# A tibble: 3,478,330 × 8
   `:START_ID`           `:END_ID` chr   `start:int` `end:int` annotation affinity `:TYPE`
   <chr>                 <chr>     <chr>       <dbl>     <dbl> <chr>         <dbl> <chr>  
 1 NFAC4_MOUSE.H11MO.0.C UBL5      chr9     20642715  20642725 promoter      0.543 BIND   
 2 ELK3_MOUSE.H11MO.0.D  UBL5      chr9     20642721  20642733 promoter      0.540 BIND   
 3 RUNX3_MOUSE.H11MO.0.A UBL5      chr9     20642725  20642737 promoter      0.532 BIND 

In edges_combined, how is the weight calculated?

# A tibble: 2,164,825 × 4
   `:START_ID`           `:END_ID` weight `:TYPE`          
   <chr>                 <chr>      <dbl> <chr>            
 1 COT2_MOUSE.H11MO.2.B  UBL5      0.0592 COMBINED_REGULATE
 2 CTCF_MOUSE.H11MO.0.A  UBL5      0.0612 COMBINED_REGULATE
 3 CTCFL_MOUSE.H11MO.0.A UBL5      0.0615 COMBINED_REGULATE

In nodes, how is expressionZScore centered?

# A tibble: 25,321 × 3
   `geneName:ID` expression expressionZScore
   <chr>              <dbl>            <dbl>
 1 0610005C13RIK      0.898            4.98 
 2 0610006L08RIK      0.333            0.316

Thanks again!!!
