Implementation of: Clustering of the structures by using "snakes & dragons" approach, or correlation matrix as a signal - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0223267

MrinalJain17/drake

Clustering by using "snakes and dragons"

Implementation of the paper Clustering of the structures by using "snakes & dragons" approach, or correlation matrix as a signal

Datasets

  1. Macroeconomic development indicators from the World Bank - Link

Requirements

  • NumPy

  • Pandas

  • Scikit-learn

  • Tqdm (for displaying a progress bar)

  • Yellowbrick (provides a mechanism for selecting the best number of clusters k, as described in the paper)

    To install using the conda package manager (recommended):

    conda install -c districtdatalabs yellowbrick

Optional requirements

The algorithm internally runs K-Means many times on random partitions of the entire dataset. scikit-learn's K-Means implementation is widely used, but it is not the fastest available. In initial benchmarks, the implementation in Intel-backed DAAL was roughly 8-12x faster. If DAAL is not installed, the code falls back to scikit-learn's implementation.
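To make the "K-Means on random partitions" idea concrete, here is a minimal sketch using scikit-learn. It is not taken from this repository: the function name `kmeans_on_random_partitions`, its parameters, and the synthetic data are all assumptions for illustration, and it interprets "random partitions" as disjoint random subsets of the rows.

```python
import numpy as np
from sklearn.cluster import KMeans


def kmeans_on_random_partitions(X, n_partitions=4, n_clusters=3, seed=0):
    """Run K-Means independently on each random, disjoint partition of X.

    Hypothetical helper for illustration only.
    Returns a list of (row_indices, labels) pairs, one per partition.
    """
    rng = np.random.default_rng(seed)
    # Shuffle the row indices, then split them into disjoint chunks.
    shuffled = rng.permutation(X.shape[0])
    results = []
    for part in np.array_split(shuffled, n_partitions):
        model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        labels = model.fit_predict(X[part])
        results.append((part, labels))
    return results


# Synthetic data standing in for the real dataset (an assumption).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 5))
results = kmeans_on_random_partitions(X)
```

Each partition is clustered on its own, so the label values are not comparable across partitions; how the per-partition results are combined is described in the paper and is not reproduced here.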

The recommended way to install DAAL for Python is with the conda package manager:

conda install -c intel daal4py
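The fallback behaviour described above can be sketched as a small backend check. This is an illustration under assumptions, not the repository's actual code: the helper name `pick_kmeans_backend` is hypothetical, and it only detects which package is importable without importing it.

```python
import importlib.util


def pick_kmeans_backend(preferred=("daal4py", "sklearn")):
    """Return the first importable top-level package name from `preferred`.

    Hypothetical helper: returns None if none of the packages is installed.
    """
    for name in preferred:
        # find_spec returns None when a top-level package cannot be found.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


backend = pick_kmeans_backend()
print(f"K-Means backend: {backend}")
```

Listing `daal4py` first means it is chosen whenever it is available, and scikit-learn is used otherwise. A plain `try: import daal4py ... except ImportError:` block would achieve the same effect.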

References

  1. Consensus Clustering (paper): https://link.springer.com/article/10.1023/A:1023949509487
  2. Consensus Clustering (blog): https://towardsdatascience.com/consensus-clustering-f5d25c98eaf2
