Robust Factorization of Real-world Tensor Streams with Patterns, Missing Values, and Outliers (ICDE'21)


This repository contains the source code for the paper Robust Factorization of Real-world Tensor Streams with Patterns, Missing Values, and Outliers, by Dongjin Lee and Kijung Shin, presented at ICDE 2021.

In this work, we propose SOFIA, an online algorithm for factorizing real-world tensors that evolve over time with missing entries and outliers. By smoothly and tightly combining tensor factorization, outlier detection, and temporal-pattern detection, SOFIA achieves the following strengths over state-of-the-art competitors:

  • Robust and accurate: SOFIA yields up to 76% lower imputation error and up to 71% lower forecasting error than its best competitors.
  • Fast: SOFIA performs imputation up to 935X faster than the second-most accurate method.
  • Scalable: SOFIA incrementally processes new entries in a time-evolving tensor, and it scales linearly with the number of new entries per time step.
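
To make the interplay of factorization, imputation, and outlier detection more concrete, here is a minimal Python sketch. It is not the repository's MATLAB code and it does not reproduce SOFIA's actual update rules. It assumes a rank-R CP-style model in which each entry of a time slice is approximated from two factor matrices and a temporal weight vector. The names `cp_entry`, `impute_slice`, and `flag_outliers`, as well as the fixed residual threshold, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Index = Tuple[int, int]


def cp_entry(a_row: List[float], b_row: List[float], w: List[float]) -> float:
    """Model value of one entry: sum over the R components of a_r * b_r * w_r."""
    return sum(a * b * c for a, b, c in zip(a_row, b_row, w))


def impute_slice(observed: Dict[Index, float], A: Matrix, B: Matrix,
                 w: List[float]) -> Matrix:
    """Keep observed values and fill every missing entry with the model value."""
    return [[observed.get((i, j), cp_entry(A[i], B[j], w))
             for j in range(len(B))]
            for i in range(len(A))]


def flag_outliers(observed: Dict[Index, float], A: Matrix, B: Matrix,
                  w: List[float], threshold: float) -> List[Index]:
    """Return observed indices whose residual exceeds the threshold.

    A fixed threshold is an illustrative stand-in (assumption), not the
    outlier-handling rule used in the paper.
    """
    return sorted(idx for idx, x in observed.items()
                  if abs(x - cp_entry(A[idx[0]], B[idx[1]], w)) > threshold)


# Toy 2 x 2 slice with rank R = 2; two entries are observed, two are missing.
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 2.0], [3.0, 4.0]]
w = [2.0, 0.5]                        # temporal weights for this time step
observed = {(0, 0): 2.1, (1, 1): 50.0}

filled = impute_slice(observed, A, B, w)
suspects = flag_outliers(observed, A, B, w, threshold=5.0)
```

In this toy slice the two missing entries are filled from the model, and the observed value 50.0 is flagged because it lies far from its model value. The loop in `flag_outliers` visits only the observed entries, so its cost grows linearly with their number.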

Datasets

| Name | Description | Size | Granularity in Time | Processed Dataset | Original Source |
|---|---|---|---|---|---|
| Intel Lab Sensor | locations x sensor x time | 54 x 4 x 1152 | every 10 minutes | Dataset | Link |
| Network Traffic | sources x destinations x time | 23 x 23 x 2000 | hourly | Dataset | Link |
| Chicago Taxi | sources x destinations x time | 77 x 77 x 2016 | hourly | Dataset | Link |
| NYC Taxi | sources x destinations x time | 265 x 265 x 904 | daily | Dataset | Link |
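
As a quick sanity check on the sizes and time granularities listed above, the following Python sketch computes how many entries arrive per time step and how much wall-clock time each stream covers. It uses only the numbers from the table. The helper name `stream_stats` and the dictionary layout are hypothetical, and the last mode of each tensor is assumed to be time.

```python
from math import prod
from typing import Dict, Tuple

# (tensor shape with time as the last mode, minutes per time step)
DATASETS: Dict[str, Tuple[Tuple[int, int, int], int]] = {
    "Intel Lab Sensor": ((54, 4, 1152), 10),       # every 10 minutes
    "Network Traffic": ((23, 23, 2000), 60),       # hourly
    "Chicago Taxi": ((77, 77, 2016), 60),          # hourly
    "NYC Taxi": ((265, 265, 904), 24 * 60),        # daily
}


def stream_stats(shape: Tuple[int, ...], minutes_per_step: int) -> Tuple[int, float]:
    """Return (entries per time slice, total duration in days)."""
    *slice_shape, steps = shape
    entries_per_step = prod(slice_shape)
    days = steps * minutes_per_step / (24 * 60)
    return entries_per_step, days


for name, (shape, minutes) in DATASETS.items():
    entries, days = stream_stats(shape, minutes)
    print(f"{name}: {entries} entries per step, {days:.1f} days in total")
```

For example, the Intel Lab Sensor stream delivers 54 x 4 = 216 entries per step and spans 8 days, while the Chicago Taxi stream delivers 77 x 77 = 5929 entries per step and spans 84 days.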

Requirements

  1. Tensor Toolbox v3.1 for tensor computation.
    • Download and link the library.
  2. Optimization Toolbox for the nonlinear programming solver (the fmincon function in MATLAB).

Running Examples

We provide two runnable examples, one for online tensor completion and one for tensor forecasting.

  1. Online tensor completion
  2. Tensor forecasting

Supplementary Document

Please see the supplementary document.

Reference

This code is free and open source for academic/research (non-commercial) purposes only. If you use this code as part of any published research, please cite the following paper.

@inproceedings{lee2021robust,
  title={Robust factorization of real-world tensor streams with patterns, missing values, and outliers},
  author={Lee, Dongjin and Shin, Kijung},
  booktitle={2021 IEEE 37th International Conference on Data Engineering (ICDE)},
  pages={840--851},
  year={2021},
  organization={IEEE}
}
