Bayesian Probabilistic Tensor Factorization
Intro
This page provides the Bayesian Probabilistic Tensor
Factorization (BPTF) algorithm described in the
following paper:
Liang Xiong, Xi Chen, Tzu-kuo Huang, Jeff Schneider, and Jaime
Carbonell, Temporal Collaborative Filtering with Bayesian
Probabilistic Tensor Factorization, SIAM Data Mining 2010 (SDM
10). [pdf]
As a bonus, and for comparison, implementations of Probabilistic Matrix Factorization and Bayesian Probabilistic Matrix Factorization are also provided.
Code
Matlab Code
This demo of BPTF is written in Matlab with mex. To run it, you
need Matlab, a C++ compiler that Matlab supports, and a fast
machine. All you have to do is download the code,
uncompress it, and run "demo.m".
If you are interested in fast, parallel processing, GraphLab
contains an awesomely fast implementation of this algorithm.
Check it out.
Misc
Here are some details you might be interested in:
- The data provided is a 4% extract of the Netflix data set: 20%
  of the users and 20% of the movies were randomly selected from
  the whole pool. Each user or movie is just a coordinate in the
  tensor, and no other information is conveyed. (If you find that
  this still violates people's rights, let me know.) The time
  index in the tensor is based on calendar months, except that
  some early months were combined.
- We represent a sparse tensor as a Matlab struct with three
  fields: subs, vals, and size. subs contains the coordinates of
  the entries, one row per entry. vals contains the values of
  these entries. size gives the dimensions of the tensor.
- The initialization method used here is based on alternating
  least squares. Its predictions are not as good as those of the
  gradient-based method used in the paper, but it needs less
  tuning and runs faster.
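
The sparse tensor struct described above can be sketched as follows. This is only a minimal illustration, not code from the release. The toy values, the variable names T, idx, and D, the 1-based coordinates, and the column-vector layout of vals are all assumptions; only the field names subs, vals, and size come from the description.

```matlab
% Hypothetical toy example: a 3 x 4 x 2 (user x movie x time) tensor
% with four observed ratings. All values here are made up.
T = struct();
T.subs = [1 1 1;    % one row per entry: (user, movie, time)
          2 3 1;
          3 2 2;
          1 4 2];
T.vals = [5; 3; 4; 1];   % assumed layout: one value per row of subs
T.size = [3 4 2];        % dimensions of the tensor

% Consistency checks implied by the description of the fields.
assert(size(T.subs, 1) == numel(T.vals));   % one value per coordinate row
assert(size(T.subs, 2) == numel(T.size));   % one column per tensor mode
assert(all(all(T.subs >= 1)));              % assumes 1-based coordinates
assert(all(all(bsxfun(@le, T.subs, T.size))));  % coordinates within bounds

% Expand to a dense array (only sensible for tiny tensors like this one).
idx = sub2ind(T.size, T.subs(:, 1), T.subs(:, 2), T.subs(:, 3));
D = zeros(T.size);
D(idx) = T.vals;
```

Storing only the observed entries keeps memory proportional to the number of ratings rather than to the product of the dimensions, which matters for rating data where most user, movie, and time combinations are unobserved.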
Liang Xiong,
2010-11-15