
🔥 Predictor of the political bias of German texts 🇩🇪


axenov/politik-news


Political Bias Classification of German Media

This project is the first attempt at political bias classification of German news.

Check out the paper: Fine-grained Classification of Political Bias in German News: A Data Set and Initial Experiments

We crawled our data from various German news sites using the news-please library. After that, we manually cleaned the data and labeled it using Medienkompass. The dataset was then preprocessed using the HuggingFace NLP library.

Due to copyright issues, we cannot publish the data, but we provide a list of URLs that you can use to build the dataset on your own. To download all the data, run:

from newsplease import NewsPlease
articles = NewsPlease.from_file('urls/urls.txt')
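
The call above only returns the articles in memory, so they still need to be written to disk before preprocessing. The following is a minimal sketch of one way to do that, not the project's own script. The helper names `download_articles` and `save_articles`, the JSON layout, and the output folder are assumptions for illustration; check what `preprocess.py` actually expects before relying on them.

```python
import json
from pathlib import Path


def download_articles(url_file):
    """Fetch every URL listed in url_file with news-please (needs network access)."""
    # Imported lazily so save_articles can be used without news-please installed.
    from newsplease import NewsPlease
    return NewsPlease.from_file(url_file)


def save_articles(articles, out_dir):
    """Write one JSON file per article and return the number of files written.

    `articles` maps each URL to an article object. The file layout used here
    is an assumption, not the format required by preprocess.py.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = 0
    for i, (url, article) in enumerate(sorted(articles.items())):
        text = getattr(article, "maintext", None)
        if not text:  # skip failed or empty downloads
            continue
        record = {"url": url, "title": getattr(article, "title", None), "text": text}
        path = out / f"article_{i:05d}.json"
        path.write_text(json.dumps(record, ensure_ascii=False), encoding="utf-8")
        written += 1
    return written
```

A typical call would be `save_articles(download_articles('urls/urls.txt'), 'path/to/your/downloaded/data')`.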

Then run the preprocessing script:

python preprocess.py -data_folder='path/to/your/downloaded/data'

We evaluated several classification models on the dataset, using Bag-of-Words, TF-IDF, and BERT features. To reproduce the former two, run the BOW_baseline.ipynb and TFIDF_baseline.ipynb notebooks. To train the BERT-based models, you need to fine-tune the HuggingFace implementation of German BERT:

python train.py -data_folder="data" -model_folder="models/BERT" -batch_size=8 -num_epochs=2

After that, run the BERT_baseline.ipynb notebook.
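
For readers unfamiliar with the Bag-of-Words and TF-IDF features mentioned above, here is a small plain-Python illustration of both. It is only a sketch for intuition: the notebooks remain the reference for the actual baselines, and the tokenizer, the smoothed IDF formula, and the function names are assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (umlauts included)."""
    return re.findall(r"\w+", text.lower())


def bag_of_words(docs):
    """Return one term-count mapping per document."""
    return [Counter(tokenize(d)) for d in docs]


def tfidf(docs):
    """Return one {term: tf-idf weight} mapping per document."""
    counts = bag_of_words(docs)
    n_docs = len(counts)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for c in counts for term in c)
    # Smoothed inverse document frequency (assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({t: (k / total) * idf[t] for t, k in c.items()})
    return vectors
```

Terms that occur in every document, such as articles, end up with a lower weight than terms specific to a few documents, which is what makes TF-IDF more informative than raw counts for a classifier.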

Based on our two models for TF-IDF and BERT features, we implemented a demo system that predicts the political bias of a single arbitrary text and lists the words that push the system toward its decision. The models can be downloaded from here. To use the system, run:

python predict.py -file_path="text_sample.txt" -method="tfidf" -explain=False

or call it from Python:

from BiasPredictor import biasPredictor
predictor = biasPredictor("bert")
prediction = predictor.predict(text = "Ein politischer Text", explain=True)
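
This README does not describe how the explanation is computed. As a purely illustrative sketch, one common approach for a linear classifier over TF-IDF features is to rank words by the product of their feature value and the weight the model assigns to the predicted class. The function `top_contributions` and its inputs are hypothetical and are not part of `BiasPredictor`.

```python
def top_contributions(features, weights, k=5):
    """Rank words by feature value times class weight, largest first.

    features: {word: tf-idf value} for one text
    weights:  {word: weight of the predicted class} from a linear model
    Only words with a positive contribution are returned.
    """
    scores = {w: v * weights.get(w, 0.0) for w, v in features.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(w, s) for w, s in ranked[:k] if s > 0]
```

Words the model has no weight for contribute nothing, and words with negative weights argue against the predicted class, so both are left out of the list.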

t-SNE on SVD of BOW representation of the dataset



Effect of Covid-19 on German news


