This repository contains the codebase and the annotated set of relevant trials for each of the 25 queries, for the paper "Towards an Aspect-based Ranking Model for Clinical Trial Search", accepted at the 8th International Conference on Computational Data and Social Networks (CSoNet 2019).
Paper link: [https://link.springer.com/chapter/10.1007/978-3-030-34980-6_25]
-
Download the dump of the clinical trials from the following link: [https://clinicaltrials.gov/AllPublicXML.zip]
-
Set up the QuickUMLS tool: [https://github.com/Georgetown-IR-Lab/QuickUMLS]
-
Download the reported adverse events data from the site: [https://aact.ctti-clinicaltrials.org/pipe_files]
-
Download Elasticsearch for the baseline.
-
Baseline code [https://github.com/ajinkyathorve/TREC-2017-PM-CDS-Track].
-
All required environment packages are listed in the [requirements.txt] file.
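As an illustration of the first setup step, the sketch below downloads the ClinicalTrials.gov dump and walks over the trial files in the archive. It is not taken from this codebase: the function names `download_dump` and `iter_trials` are hypothetical, and the XML layout (one file per trial with `<id_info><nct_id>` and `<brief_title>` elements) is an assumption that should be checked against the actual dump.

```python
import urllib.request
import xml.etree.ElementTree as ET
import zipfile

DUMP_URL = "https://clinicaltrials.gov/AllPublicXML.zip"  # link given in the setup steps


def download_dump(dest_path="AllPublicXML.zip", url=DUMP_URL):
    """Fetch the zipped XML dump (a large download) and return the local path."""
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


def iter_trials(zip_path):
    """Yield (nct_id, brief_title) for every XML file inside the archive.

    Assumption: each trial is a single XML file whose root contains
    <id_info><nct_id> and <brief_title>; adjust the paths if the dump differs.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue  # skip any non-trial files in the archive
            with archive.open(name) as handle:
                root = ET.parse(handle).getroot()
            yield root.findtext("id_info/nct_id"), root.findtext("brief_title")
```

Reading directly from the zip avoids unpacking the whole archive to disk; missing elements come back as `None` rather than raising an error.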
- It contains the 25 annotated queries.
- 5 queries for each disease class.
baselineSetup [https://github.com/ajinkyathorve/TREC-2017-PM-CDS-Track]
- Scripts for indexing trials for each disease class, and a script for retrieving and ranking trials for a given query.
- Calculates the precision, Spearman's rank-order correlation, and overlap across the trials retrieved for the 25 final queries and ranked on the basis of relevancy (5 methods).
- QuickUMLS tool applied over 1440 lexicons of MedDRA common patient terms.
- Finds the problematic queries.
- Get PubMedIds for the clinical trials.
- Maps clinical trials linked with PubMed IDs to the different disease classes (26), with all extra fields (adversity, popularity) appended.
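Two of the measures named in the list above, Spearman's rank-order correlation and overlap, can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function names are hypothetical, each ranking is assumed to be a list of unique trial IDs, the correlation is computed only over trials present in both lists, and overlap is taken to mean the shared fraction of the top-k trials.

```python
def rank_positions(ranking):
    """Map each trial ID to its 1-based position in a ranked list."""
    return {trial_id: pos for pos, trial_id in enumerate(ranking, start=1)}


def spearman_rho(ranking_a, ranking_b):
    """Spearman's rank-order correlation over the trials common to both lists.

    Uses the tie-free formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    after re-ranking the shared trials within each list.
    """
    shared = set(ranking_a) & set(ranking_b)
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared trials")
    # Keep the original order but drop trials missing from the other list.
    pos_a = rank_positions([t for t in ranking_a if t in shared])
    pos_b = rank_positions([t for t in ranking_b if t in shared])
    d_squared = sum((pos_a[t] - pos_b[t]) ** 2 for t in shared)
    return 1 - 6 * d_squared / (n * (n * n - 1))


def overlap(ranking_a, ranking_b, k=10):
    """Fraction of the top-k trials that the two rankings have in common."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k
```

Identical rankings give a correlation of 1.0 and fully reversed rankings give -1.0, which makes the function easy to sanity-check before comparing the outputs of two ranking methods for the same query.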
Robustness study
- Precision@10
- Recall
- Contains trials in ranked order according to the different relevancy-based approaches (PageRank, Exact Match, SynSet).
- Ranks the retrieved trials for the 25 queries on the basis of different relevancy (25) methods.
- CSV files for the 5 disease classes, with all fields appended.
- Pickle file of UMLS concepts for each trial across the 5 disease classes.
- Ranked trials on the basis of 5 relevancy based methodologies.
- Dump files.
- Applies different clustering algorithms, such as DBSCAN and affinity-based algorithms, to different variations of the data.
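For the Precision@10 and Recall measures listed under the robustness study, a minimal sketch is given here. The function names are hypothetical and the code is not taken from this repository; it assumes the retrieved trials arrive as a ranked list of trial IDs and the annotated relevant trials as a collection of IDs.

```python
def precision_at_k(ranked_trials, relevant_trials, k=10):
    """Fraction of the top-k retrieved trials that are annotated as relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_trials)
    hits = sum(1 for trial_id in ranked_trials[:k] if trial_id in relevant)
    return hits / k


def recall(ranked_trials, relevant_trials):
    """Fraction of the annotated relevant trials that were retrieved at all."""
    relevant = set(relevant_trials)
    if not relevant:
        return 0.0  # no relevant trials annotated for this query
    return len(set(ranked_trials) & relevant) / len(relevant)
```

Note that `precision_at_k` divides by `k` even when fewer than `k` trials were retrieved, so a short result list is penalized; dividing by the number of retrieved trials instead is an equally common convention.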
If you use the code or the dataset, please cite the paper.
@InProceedings{10.1007/978-3-030-34980-6_25,
author="Roy, Soumyadeep
and Rudra, Koustav
and Agrawal, Nikhil
and Sural, Shamik
and Ganguly, Niloy",
editor="Tagarelli, Andrea
and Tong, Hanghang",
title="Towards an Aspect-Based Ranking Model for Clinical Trial Search",
booktitle="Computational Data and Social Networks",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="209--222",
doi="10.1007/978-3-030-34980-6_25",
isbn="978-3-030-34980-6"
}