Fast keyword extraction from text using graph degeneracy-based approaches

Authors: Romain Avouac, Jaime Costa Centena

This is our final project for the ELTDM (software guidelines to process massive data) course at ENSAE. Our goal was to find computationally efficient ways to extract keywords from text using graph degeneracy criteria, as described in Tixier, Malliaros & Vazirgiannis (2016).

We focused on two major steps of the data processing pipeline: k-core decomposition, which identifies dense subgraphs (notebook), and computation of the elbow criterion, which selects the relevant keywords (notebook). For each step, we provide an extensive performance comparison of all the approaches we implemented (including Cythonization, multithreading, and multiprocessing). An in-depth discussion of our results (in French) is available in a report.
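
To make the two steps concrete, here is a minimal pure-Python sketch. It is not the code from the notebooks: the function names `core_numbers` and `elbow_index` and the toy graph are hypothetical. The elbow is assumed to be the point farthest from the straight line joining the first and last points of the score curve, which is one common definition and may differ from the variant we benchmarked. The k-core part uses the simple peeling approach (repeatedly remove a node of minimum remaining degree) on an unweighted, undirected graph.

```python
def core_numbers(adj):
    """Core number of every node in an undirected, unweighted graph.

    `adj` maps each node to the set of its neighbours.
    Simple peeling: repeatedly remove a node of minimum remaining degree.
    Quadratic in the number of nodes, which is fine for illustration only.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        # The core number never decreases as peeling proceeds.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def elbow_index(values):
    """Index of the elbow of a curve given as a non-empty list of scores.

    Assumed definition: the point farthest from the line through the first
    and last points. With fewer than three points, or when all points are
    aligned, fall back to the index of the largest value.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    top = max(range(n), key=values.__getitem__)
    if n < 3:
        return top
    y0, y1 = values[0], values[-1]

    def dist(i):
        # Unnormalised point-to-line distance; the normalising constant is
        # the same for every point, so it does not change the argmax.
        return abs((y1 - y0) * i - (n - 1) * values[i] + (n - 1) * y0)

    best = max(range(n), key=dist)
    return best if dist(best) > 0 else top


# Toy graph (hypothetical): a triangle a-b-c with a pendant node d on c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cores = core_numbers(toy_graph)
elbow = elbow_index([10, 9, 8, 2, 1])
```

On the toy graph, the triangle nodes end up in the 2-core and the pendant node only in the 1-core. The optimized variants compared in the notebooks (Cython, multithreading, multiprocessing) target these same two computations.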
