scRNA seq data processing

Hyunsoo Kim edited this page Feb 9, 2023 · 1 revision

scRNA-seq data processing

Step 1: Align the reads in the scRNA-seq FASTQ files to the GRCh38 reference transcriptome with 10x Genomics cellranger count. This produces one filtered_feature_bc_matrix.h5 file per sample, i.e. two files for the two samples used here.
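A minimal sketch of Step 1, assuming that the FASTQ file names are prefixed with the sample names (Tumor5, Tumor5_TAM). The function name run_cellranger_count and all paths are hypothetical and are not part of the original workflow; only the cellranger count options --id, --transcriptome, --fastqs and --sample are used. Check the documentation of the installed Cell Ranger version for any additional required options.

```shell
# Sketch only: the reference and FASTQ paths are placeholders (assumptions).
CELLRANGER="${CELLRANGER:-cellranger}"   # cellranger executable; override if it is not on PATH

run_cellranger_count() (
    # usage: run_cellranger_count <transcriptome_dir> <fastq_dir> <sample>...
    set -eu
    transcriptome=$1
    fastqs=$2
    shift 2
    for sample in "$@"; do
        # Results are written to ./<sample>/outs/, which includes
        # filtered_feature_bc_matrix.h5
        "$CELLRANGER" count \
            --id="$sample" \
            --transcriptome="$transcriptome" \
            --fastqs="$fastqs" \
            --sample="$sample"
    done
)

# Example call (hypothetical paths):
# run_cellranger_count /path/to/GRCh38_reference /path/to/fastqs Tumor5 Tumor5_TAM
```

Using the sample name as --id keeps the output directory names identical to the sample names expected in the later steps.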

Step 2: Create the following directory structure by copying or linking the files.

../count_er+bc-pairs
├── Tumor5
│   └── outs
│       └── filtered_feature_bc_matrix.h5
└── Tumor5_TAM
    └── outs
        └── filtered_feature_bc_matrix.h5
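The layout above could be created with symbolic links roughly as follows. This is a sketch under the assumption that the Cell Ranger results live under <cellranger_root>/<sample>/outs/; the function name link_count_matrices and the example source path are hypothetical.

```shell
link_count_matrices() (
    # usage: link_count_matrices <cellranger_root> <dest_root> <sample>...
    set -eu
    src_root=$1
    dest_root=$2
    shift 2
    for sample in "$@"; do
        src="$src_root/$sample/outs/filtered_feature_bc_matrix.h5"
        # Stop early if a matrix is missing instead of creating a dangling link
        [ -f "$src" ] || { echo "missing: $src" >&2; exit 1; }
        mkdir -p "$dest_root/$sample/outs"
        # Link by absolute path so the link resolves from any working directory
        abs_src="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"
        ln -sf "$abs_src" "$dest_root/$sample/outs/filtered_feature_bc_matrix.h5"
    done
)

# Example call (hypothetical source path):
# link_count_matrices /path/to/cellranger_runs ../count_er+bc-pairs Tumor5 Tumor5_TAM
```

Replacing ln -sf with cp gives the copy variant of the same layout.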

Step 3: Create a Seurat object for each sample with the following command:

./make_sc-rna-seq_seurat_obj.R --dir_count ../count_er+bc-pairs --dir_output ./output_er+bc-pairs --dir_seurat_obj ./output_er+bc-pairs/rds_er+bc-pairs --type_qc arguments --min_ncount_rna 5000 --min_nfeature_rna 2000 --th_percent.mt 25 --max_dimstouse 30 --seurat_resolution 0.8 --method_to_update_cell_types epithelial_cell_types --method_to_identify_subtypes none --type_infercnv_argset vignettes --infercnv_pos_notpos er+bc-pairs Tumor5

The above example processes only Tumor5. To create the Seurat object for Tumor5_TAM, run the same command with the last argument changed to Tumor5_TAM. After both samples have been processed, the output directory ./output_er+bc-pairs contains the following:

output_er+bc-pairs/
├── infercnv
│   ├── er+bc-pairs_Tumor5_cnv_postdoublet
│   └── er+bc-pairs_Tumor5_TAM_cnv_postdoublet
├── log
├── rds_er+bc-pairs
│   ├── er+bc-pairs_Tumor5_sc-rna-seq_sample_seurat_obj.rds
│   ├── er+bc-pairs_Tumor5_TAM_sc-rna-seq_sample_seurat_obj.rds
│   └── wilcox_degs
├── tsv
│   ├── infercnv_input_barcode_group_er+bc-pairs_Tumor5.tsv
│   └── infercnv_input_barcode_group_er+bc-pairs_Tumor5_TAM.tsv
└── xlsx
    ├── er+bc-pairs_Tumor5_sc-rna-seq_pipeline_summary.xlsx
    └── er+bc-pairs_Tumor5_TAM_sc-rna-seq_pipeline_summary.xlsx
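To produce the per-sample outputs for both samples in one go, the Step 3 command can be wrapped in a loop. The sketch below reuses the options from the command above unchanged and varies only the last argument; the function name make_sample_seurat_objs and the SEURAT_SCRIPT variable are hypothetical helpers.

```shell
# Path to the Step 3 script; override if it lives elsewhere
SEURAT_SCRIPT="${SEURAT_SCRIPT:-./make_sc-rna-seq_seurat_obj.R}"

make_sample_seurat_objs() (
    # usage: make_sample_seurat_objs <sample>...
    set -eu
    for sample in "$@"; do
        # Same options as the single-sample command; only the sample name changes
        "$SEURAT_SCRIPT" \
            --dir_count ../count_er+bc-pairs \
            --dir_output ./output_er+bc-pairs \
            --dir_seurat_obj ./output_er+bc-pairs/rds_er+bc-pairs \
            --type_qc arguments \
            --min_ncount_rna 5000 \
            --min_nfeature_rna 2000 \
            --th_percent.mt 25 \
            --max_dimstouse 30 \
            --seurat_resolution 0.8 \
            --method_to_update_cell_types epithelial_cell_types \
            --method_to_identify_subtypes none \
            --type_infercnv_argset vignettes \
            --infercnv_pos_notpos er+bc-pairs "$sample"
    done
)

# Example call:
# make_sample_seurat_objs Tumor5 Tumor5_TAM
```

Because the function runs with set -e in a subshell, the loop stops at the first sample whose run fails.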

Step 4: Merge the per-sample Seurat objects into a single merged Seurat object with the following command:

./make_sc-rna-seq_merged_seurat_obj.R --dir_output ./output_er+bc-pairs --dir_seurat_obj ./output_er+bc-pairs/rds_er+bc-pairs --k.anchor 5 --max_dimstouse 30 --seurat_resolution 0.8 --cancer_type_for_parsing_rds_filename er+bc-pairs --type_parsing_rds_filename_for_donor 2nd_item_after_parsing_with_underbar --harmony_theta 0 er+bc-pairs

The merged Seurat object is written to ./output_er+bc-pairs/rds_er+bc-pairs, the directory specified by the --dir_seurat_obj argument:

output_er+bc-pairs/
│   ...
├── rds_er+bc-pairs
│   ├── er+bc-pairs_Tumor5_sc-rna-seq_sample_seurat_obj.rds
│   ├── er+bc-pairs_Tumor5_TAM_sc-rna-seq_sample_seurat_obj.rds
│   ├── er+bc-pairs_sc-rna-seq_merged_seurat_obj.rds
│   └── wilcox_degs
...
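A quick way to confirm that the per-sample and merged objects were written is to test for the expected files. This sketch assumes the file naming pattern visible in the listing above (<cancer_type>_<sample>_sc-rna-seq_sample_seurat_obj.rds and <cancer_type>_sc-rna-seq_merged_seurat_obj.rds); the function name check_rds_outputs is hypothetical.

```shell
check_rds_outputs() (
    # usage: check_rds_outputs <dir_seurat_obj> <cancer_type> <sample>...
    set -eu
    rds_dir=$1
    cancer_type=$2
    shift 2
    status=0
    # One .rds file per sample from Step 3
    for sample in "$@"; do
        f="$rds_dir/${cancer_type}_${sample}_sc-rna-seq_sample_seurat_obj.rds"
        [ -s "$f" ] || { echo "missing or empty: $f" >&2; status=1; }
    done
    # The merged object from Step 4
    f="$rds_dir/${cancer_type}_sc-rna-seq_merged_seurat_obj.rds"
    [ -s "$f" ] || { echo "missing or empty: $f" >&2; status=1; }
    exit "$status"
)

# Example call:
# check_rds_outputs ./output_er+bc-pairs/rds_er+bc-pairs er+bc-pairs Tumor5 Tumor5_TAM
```

The function returns a non-zero status if any expected file is missing or empty, so it can gate downstream analysis scripts.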