This project contains the back-end functionality of a web application that displays near real-time data over a socket. It can be used in combination with the Twitter dashboard client project to serve near real-time Twitter data to a web interface. Check the video demo for context.
Given that this project was originally created to work with the Twitter Streaming API as a data source, log-in must be done through the official Twitter page. This log-in process generates two values: oauth_token and oauth_verifier. The latter is used to obtain a token_secret, which is stored here in the back-end.
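The exact token exchange depends on Twitter's OAuth endpoints, but the values arriving at the callback URL can be extracted with the standard library. A minimal sketch (parse_oauth_callback is a hypothetical helper, not part of this project's API):

```python
from urllib.parse import urlparse, parse_qs

def parse_oauth_callback(callback_url):
    """Extract the oauth_token and oauth_verifier query parameters
    that Twitter appends to the callback URL after log-in."""
    params = parse_qs(urlparse(callback_url).query)
    return params["oauth_token"][0], params["oauth_verifier"][0]

token, verifier = parse_oauth_callback(
    "https://example.com/callback?oauth_token=abc&oauth_verifier=xyz"
)
```

The oauth_verifier would then be sent to Twitter's access-token endpoint to obtain the token_secret.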
Within the project there are certain entities called handlers, which manage several resources. Currently these are:
- Token secrets (accessible by user account - user token).
- Running streams.
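A minimal sketch of what the token-secret handler could look like (the class and method names are assumptions for illustration; the real implementation lives in handlers/secrets.py):

```python
class SecretsHandler:
    """Stores token secrets indexed by (user account, user token).
    Illustrative sketch, not the project's actual handler."""

    def __init__(self):
        self._secrets = {}

    def set_secret(self, account, token, secret):
        self._secrets[(account, token)] = secret

    def get_secret(self, account, token):
        # Returns None when no secret is stored for this pair
        return self._secrets.get((account, token))

handler = SecretsHandler()
handler.set_secret("alice", "tok-1", "s3cr3t")
```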
There are also entities called taggers, which add tags / labels to the originally provided data. Currently there is a single tagger: a hierarchical ML model that infers the sentiment of each piece of text and assigns a label to the resulting data structure.
If you want to learn more about how that ML model was trained, check my SentimentAI project.
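The tagger contract can be sketched as follows. This stub only illustrates the expected interface; the real sentiment inference is done by the hierarchical ML model, and the keyword rules below are placeholders:

```python
class SentimentTagger:
    """Illustrative tagger: assigns a sentiment label to a data point.
    The real tagger delegates to a trained ML model; this stub uses
    toy keyword rules just to show the shape of the interface."""

    def tag(self, data_point):
        text = data_point["text"].lower()
        if "love" in text or "great" in text:
            data_point["label"] = "positive"
        elif "hate" in text or "awful" in text:
            data_point["label"] = "negative"
        else:
            data_point["label"] = "neutral"
        return data_point

tagged = SentimentTagger().tag({"text": "This is just an example"})
```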
The data communication between this back-end and any front-end client was designed to be asynchronous: each time a data point (tweet) is retrieved by a data stream, it is sent over the open socket, allowing real-time visualization.
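That hand-off can be sketched with a thread-safe queue standing in for the data stream and a callback standing in for the socket emit (all names here are illustrative, not the project's API):

```python
import queue
import threading

def pump(tweets, emit):
    """Forward each tweet to the socket as soon as it arrives,
    instead of batching (emit stands in for Flask-SocketIO's emit)."""
    while True:
        tweet = tweets.get()
        if tweet is None:  # sentinel: the stream was closed
            break
        emit(tweet)

tweets = queue.Queue()
sent = []
worker = threading.Thread(target=pump, args=(tweets, sent.append))
worker.start()
tweets.put({"text": "first"})
tweets.put({"text": "second"})
tweets.put(None)
worker.join()
```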
The technologies involved in making it possible are:
- The Flask WSGI framework.
- The Flask-SocketIO adaptation package.
- The Google Geocoding API (transforms a region description into a pair of geo-located points).
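As a rough sketch, a Geocoding API request can be built like this. The endpoint and parameter names follow Google's documented REST interface; the API key is a placeholder, and the response (not fetched here) contains the region's corner coordinates:

```python
from urllib.parse import urlencode

def geocoding_url(region, api_key):
    """Build the Google Geocoding API request URL for a region
    description such as "California, USA"."""
    base = "https://maps.googleapis.com/maps/api/geocode/json"
    return base + "?" + urlencode({"address": region, "key": api_key})

url = geocoding_url("California, USA", "YOUR_API_KEY")
```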
Each data point interchanged between the back-end and any front-end has the following structure:
{
    "coords": [123, -75],
    "label": "neutral",
    "source": "android",
    "text": "This is just an example"
}
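A small sanity check of that structure, using the field names from the example above (validate_point is a hypothetical helper, not part of the project):

```python
def validate_point(point):
    """Verify a data point carries every field the front-end expects."""
    required = {"coords", "label", "source", "text"}
    missing = required - point.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return point

point = validate_point({
    "coords": [123, -75],
    "label": "neutral",
    "source": "android",
    "text": "This is just an example",
})
```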
The Python modules have been organized in the following structure:
/src:
# Files
app.py: Flask application entry point
# Folders
/handlers: contains server managed entities.
secrets.py: manages token secrets.
streams.py: manages Twitter data streams.
...
/taggers: contains data taggers.
/sentiment: sentiment inferrer.
...
/twitter: contains the Twitter connecting functionality.
...
/utils: contains utilities modules.
...
...
requirements.txt: project dependencies
When deploying this back-end server, execute the following commands:
pip install -r requirements.txt
# Optionally, install gevent for extra socket performance
# pip install gevent
python3 src/app.py