FOSSIL: Factorized Sequential Prediction with Item Similarity Models

Paper: Fusing Similarity Models with Markov Chains for Sparse Sequential Recommendation (He and McAuley, 2016)

This repository contains my implementation of FOSSIL in R, which predicts a user's next purchase from their past purchase history. The algorithm uses a similarity-based method to reduce sparsity in modeling user preferences, as well as higher-order Markov chains to smooth user preferences across multiple time steps.


Main Ideas

FOSSIL models both long-term user preferences (via matrix factorization) and short-term sequential dynamics (via Markov chains). Previously, there had been several attempts to model sequential dynamics and combine them with matrix factorization:

  • Factorized Markov Chains (FMC), which factorize the item-to-item transition matrix (non-personalized)
  • Tensor Factorization (TF; Rendle et al. 2008), where the dataset is represented as a tensor whose entries capture the likelihood that a user transitions from one item to another
  • Factorized Personalized Markov Chains (FPMC; Rendle et al. 2010), which combine factorization of the user-item matrix with first-order Markov chains

FOSSIL is similar in spirit to FPMC but takes its inspiration from FISM (Factored Item Similarity Models for Top-N Recommender Systems; Kabbur et al. 2013): it factorizes the item-to-item matrix, which reduces sparsity by avoiding explicit user modeling, and it relaxes FPMC's first-order Markov chain assumption to higher orders.
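
As a rough illustration, the sketch below scores a candidate item by combining a FISM-style sum of item factors over the user's full history (long-term preference) with weighted factors of the L most recent items (short-term dynamics). All names (score_fossil, P, Q, beta, eta, eta_u, alpha) are illustrative and do not correspond to the code in fossil.R.

```r
# Minimal sketch of the FOSSIL scoring rule, assuming FISM-style item factors.
# P, Q   : item x K latent factor matrices (item-to-item factorization)
# beta   : item bias vector
# eta    : global weights over the L most recent items
# eta_u  : user-specific weights over the L most recent items
# hist   : indices of all items the user has consumed, excluding candidate j
# recent : indices of the user's L most recent items, most recent first
# alpha  : normalization exponent inherited from FISM
score_fossil <- function(j, hist, recent, P, Q, beta, eta, eta_u, alpha = 0.5) {
  long_term  <- colSums(P[hist, , drop = FALSE]) / (length(hist)^alpha)  # user preference
  short_term <- colSums((eta + eta_u) * P[recent, , drop = FALSE])       # sequential dynamics
  beta[j] + sum((long_term + short_term) * Q[j, ])
}
```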


Since we are interested in top-k recommendation (typically k = 10), the algorithm optimizes the model parameters by minimizing the S-BPR (Sequential Bayesian Personalized Ranking) loss with stochastic gradient descent, one training sample at a time. S-BPR uses a sigmoid function to characterize the probability that the true (observed) item is ranked higher than a false (sampled) item given the user and the model parameters, assuming independence across users and time steps.
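
The per-sample quantity that the SGD loop works with can be sketched as follows; it reuses the illustrative score_fossil() helper above, and the gradient updates and regularization terms of the actual training loop are omitted.

```r
sigmoid <- function(x) 1 / (1 + exp(-x))

# S-BPR log-likelihood for one (user, true item, sampled false item) triple;
# SGD ascends this quantity (plus regularization) one sample at a time.
sbpr_sample_loglik <- function(true_item, false_item, hist, recent,
                               P, Q, beta, eta, eta_u, alpha = 0.5) {
  x_true  <- score_fossil(true_item,  hist, recent, P, Q, beta, eta, eta_u, alpha)
  x_false <- score_fossil(false_item, hist, recent, P, Q, beta, eta, eta_u, alpha)
  log(sigmoid(x_true - x_false))
}
```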

Instructions

Run the preprocess.R script on the movielens_trunc.csv file, then run fossil.R on the resulting training set. I'm going to upload the Python version of the algorithm as well as functions for evaluation metrics soon, so stay tuned.
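
A rough outline of that workflow, assuming the scripts are run from the repository root (see the scripts themselves for the exact inputs and outputs):

```r
# Hypothetical invocation; the actual script interfaces may differ.
source("preprocess.R")  # reads movielens_trunc.csv and writes the training set
source("fossil.R")      # fits FOSSIL on the training set produced above
```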
