danitoribio/Rank-page-Wikipedia

Ranking Wikipedia pages with Spark

The objective of this project is to assign a rank to each Wikipedia page using Spark. The project first preprocesses the data and then applies the PageRank algorithm. I recommend running the code on Databricks and using a small sample of the Wikipedia pages.

This project is developed with Python and Spark.
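To give a feel for the ranking step, here is a minimal plain-Python sketch of iterative PageRank on a tiny hand-made link graph. It is not the code from this repository: the function name `pagerank`, the `sample_links` data, the damping factor of 0.85 (a conventional choice) and the fixed iteration count are all assumptions made for illustration. In a Spark version, the per-page contribution loop would typically become a `flatMap` followed by a `reduceByKey`, but that mapping is also an assumption about how one might structure it.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=20):
    """Rank pages given a dict mapping each page to its outgoing links.

    Hypothetical helper for illustration; not taken from the repository.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    n = len(pages)
    # Start from a uniform distribution over all pages.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        contributions = defaultdict(float)
        dangling = 0.0
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each page splits its current rank evenly among its links.
                share = ranks[page] / len(targets)
                for target in targets:
                    contributions[target] += share
            else:
                # Pages without outgoing links spread their rank over all pages.
                dangling += ranks[page]
        ranks = {
            page: (1.0 - damping) / n
            + damping * (contributions[page] + dangling / n)
            for page in pages
        }
    return ranks


# Made-up sample graph standing in for a small set of Wikipedia pages.
sample_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(sample_links)
for page, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {rank:.4f}")
```

The ranks sum to 1, so they can be read as the share of time a random reader would spend on each page. For real Wikipedia data, the preprocessing step would need to produce the `page -> outgoing links` mapping that this sketch takes as input.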
