Search Relevance Surveys and Deep Learning: Turning Noisy, Crowd-sourced Opinions Into An Accurate Relevance Judgement (T175048)

wikimedia-research/Discovery-Search-Adhoc-RelevanceSurveys

Search Relevance Surveys

By EBernhardson

Analysis of the third run of the search relevance surveys (T175048).

Setup

Libraries

Now that T178096 is resolved, apply the role discovery::learner or discovery::allstar_cruncher to your instances on Wikimedia Cloud (formerly Wikimedia Labs).

Packages

# Essentials:
install.packages(c("tidyverse", "caret", "MLmetrics", "mlbench"))
# For bnclassify (Bioconductor dependencies; biocLite() has been superseded by BiocManager):
install.packages("BiocManager")
BiocManager::install(c("RBGL", "Rgraphviz"))
# Classifiers:
install.packages(c("xgboost", "C50", "klaR", "e1071", "randomForest", "bnclassify", "keras"))
# Meta-analysis:
install.packages("betareg")

TODO

  • Tune & train a bunch of classifiers (thanks, caret!)
  • Figure out which sets of features yield the best predictive performance
  • Investigate a multi-level approach based on Discernatron reliability (sort of?)
  • Investigate a stacking / super learning approach
  • Investigate how many responses & impressions we need to get reliable score estimates
  • Write-up

Scripts

  1. Data
    • pageviews.R uses the Wikimedia Analytics Pageviews API to fetch a month's worth of daily pageview counts for the relevant articles
    • events.R fetches the survey data from the EventLogging database
    • discernatron.R fetches relevance scores from Discernatron's API
    • data.R combines fetched pageviews, survey data, and Discernatron scores into complete datasets
  2. Model Tuning & Training via models.R
    • Outputs models/model-index.csv
    • keras.R has the code for training a deep neural network with Keras and outputs models/keras-index.csv
  3. Model Evaluation via evaluate.R
    • Outputs models/model-accuracy.csv
    • Note that keras.R computes accuracy as part of the training process
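
The tune, train, and evaluate steps listed above can be illustrated with a minimal caret sketch. This is not the code in models.R or evaluate.R: the built-in iris data stands in for the combined survey, pageview, and Discernatron dataset, the two classifiers (rpart and knn) are arbitrary choices, and the columns of the results data frame are hypothetical rather than the actual layout of models/model-accuracy.csv.

```r
library(caret)

set.seed(42)

# Stand-in data (assumption): the real pipeline uses the datasets built by data.R.
data(iris)
in_train <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
training <- iris[in_train, ]
testing <- iris[-in_train, ]

# 5-fold cross-validation is used to pick hyperparameters for each classifier.
ctrl <- trainControl(method = "cv", number = 5)

# Hypothetical pair of classifiers; names are labels for the results table.
methods <- c(tree = "rpart", knn = "knn")
models <- lapply(methods, function(m) {
  train(Species ~ ., data = training, method = m,
        trControl = ctrl, tuneLength = 3)
})

# Accuracy on the held-out set, one value per tuned model.
accuracy <- vapply(models, function(fit) {
  mean(predict(fit, newdata = testing) == testing$Species)
}, numeric(1))

results <- data.frame(model = names(accuracy), accuracy = unname(accuracy))
print(results)
```

Keeping tuning (cross-validation on the training split) separate from evaluation (accuracy on the held-out split) mirrors the split between the model tuning and model evaluation scripts; a table like results could then be written out with write.csv.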

Production

To use the final model for predicting relevance of any query-page combination from users' survey responses, please refer to these instructions.