~~ Project still in progress. Please check the project board to follow upcoming updates. ~~

This work is about playing with Twitter data in R!

Steps covered in this project:

The first step is to configure and connect to the Twitter API. Since this project does not rely on web scraping, we need to connect directly to Twitter's data streams through their API.

From there, we extract the tweet text and the other fields returned by the Twitter API.

To analyze the tweets, the text first needs to be cleaned and stored in a consistent form. Punctuation, whitespace, stop words, and similar noise each need to be removed or handled in a specific way.
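The cleaning step could look roughly like the sketch below. It uses base R only and is not taken from the project's code: the helper name `clean_tweet`, the tiny default stop-word list, and the choice to drop URLs and @mentions are all assumptions made for illustration.

```r
# Hypothetical helper (not from this project): normalise a vector of tweet texts.
# The default stop-word list is a tiny illustrative sample, not a real lexicon.
clean_tweet <- function(text,
                        stop_words = c("the", "a", "is", "in", "to", "and")) {
  text <- tolower(text)
  text <- gsub("https?://\\S+", " ", text, perl = TRUE)  # drop URLs
  text <- gsub("@\\w+", " ", text, perl = TRUE)          # drop @mentions
  text <- gsub("[[:punct:]]+", " ", text)                # punctuation -> space
  text <- trimws(gsub("\\s+", " ", text))                # collapse whitespace
  # Remove stop words token by token, then rebuild each tweet as one string.
  vapply(strsplit(text, " ", fixed = TRUE), function(tokens) {
    paste(tokens[!tokens %in% stop_words], collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

clean_tweet("Check THIS out: https://t.co/abc #rstats is great, @user!")
# "check this out rstats great"
```

URLs and mentions are removed before punctuation so that their pieces do not survive as stray tokens. A real analysis would likely swap in a fuller stop-word list.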

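A document-term matrix or term-document matrix is a mathematical matrix that describes the frequency of terms that occur in a collection of documents. In a document-term matrix, rows correspond to documents and columns to terms; a term-document matrix is its transpose.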
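As a rough illustration of that definition, the sketch below builds a small document-term matrix in base R. The function name `build_dtm` and the sample documents are hypothetical, and the input is assumed to be already cleaned, whitespace-separated text. Packages such as `tm` offer `DocumentTermMatrix()` for the same purpose; whether this project uses it is not stated here.

```r
# Hypothetical helper (not from this project): count term frequencies per document.
build_dtm <- function(docs) {
  tokens <- strsplit(docs, "\\s+")
  tokens <- lapply(tokens, function(tok) tok[nzchar(tok)])  # drop empty tokens
  vocab <- sort(unique(unlist(tokens)))
  # One row per document, one column per distinct term, initialised to zero.
  dtm <- matrix(0L, nrow = length(docs), ncol = length(vocab),
                dimnames = list(paste0("doc", seq_along(docs)), vocab))
  for (i in seq_along(tokens)) {
    counts <- table(tokens[[i]])
    dtm[i, names(counts)] <- as.integer(counts)
  }
  dtm
}

docs <- c("good morning twitter", "good night good", "twitter data")
dtm <- build_dtm(docs)
dtm["doc2", "good"]  # 2, because "good" appears twice in the second document
tdm <- t(dtm)        # the term-document matrix is the transpose
```

A dense matrix is fine for a handful of tweets. For a large corpus most entries are zero, so a sparse representation is usually the better choice.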