Fall 2020 - Computational Medicine course project - Biomarker Discovery on Immunological Data (HW 2)
This project uses feature selection to identify inflammatory biomarkers that can distinguish among three conditions in children:
- SARS-CoV-2
- Multi-system Inflammatory Syndrome in Children (MIS-C)
- Kawasaki disease
The project experiments with the three major categories of feature selection (filter-based, wrapper-based, and embedded) and applies standard preprocessing and training techniques (standardization, encoding, and cross-validation). Biomarkers are identified using Mutual Information, Recursive Feature Elimination, a Random Forest Classifier, and SVM. A permutation test is also used to identify and discard spurious correlations.
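As a rough illustration of the three feature selection categories and the permutation test, here is a minimal sketch using scikit-learn. It is not the project's actual notebook code. The synthetic dataset, the choice of `k = 5` features, the linear SVM kernel, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, permutation_test_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for the immunological data
# (150 samples, 20 candidate markers, 3 classes).
X, y = make_classification(
    n_samples=150, n_features=20, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_std = StandardScaler().fit_transform(X)  # standardization
k = 5  # hypothetical number of markers to keep

# Filter-based: rank features by mutual information with the label.
mi = mutual_info_classif(X_std, y, random_state=0)
filter_top = sorted(int(i) for i in np.argsort(mi)[-k:])

# Wrapper-based: Recursive Feature Elimination around a linear SVM.
rfe = RFE(SVC(kernel="linear"), n_features_to_select=k).fit(X_std, y)
wrapper_top = sorted(int(i) for i in np.flatnonzero(rfe.support_))

# Embedded: importances learned while fitting a random forest.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_std, y)
embedded_top = sorted(int(i) for i in np.argsort(forest.feature_importances_)[-k:])

print("filter:  ", filter_top)
print("wrapper: ", wrapper_top)
print("embedded:", embedded_top)

# Permutation test: compare the cross-validated score on the real labels
# with scores obtained after shuffling the labels. The scaler sits inside
# the pipeline so it is refit on each training fold.
n_permutations = 30
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
score, perm_scores, p_value = permutation_test_score(
    make_pipeline(StandardScaler(), SVC(kernel="linear")),
    X, y, cv=cv, n_permutations=n_permutations, random_state=0,
)
print(f"CV accuracy: {score:.2f}, permutation p-value: {p_value:.3f}")
```

Features that appear in more than one of the three lists are more plausible biomarker candidates than those picked by a single method. A small p-value from the permutation test suggests the classifier is picking up real structure rather than a chance association with the labels.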
Analysis was performed using Jupyter Notebook and Python.