Overview
CDTF (Coordinate Descent for Tensor Factorization) and SALS (Subset Alternating Least Square) are tensor factorization algorithms for high-order, large-scale tensors. Both are:
- parallelizable in distributed environments
- scalable with the order and size of the data, the number of parameters, and the number of machines
- memory-efficient, requiring several orders of magnitude less memory space than their competitors
CDTF has an advantage in memory usage and flexibility, while SALS has an advantage in convergence speed.
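To give a rough sense of what a coordinate-descent style factorization update looks like, here is a minimal single-machine sketch. It is a generic illustration, not the code released with the papers. It assumes a CP-style model (each observed entry is approximated by a sum of products of one factor per mode), an L2-regularized squared-error objective, and a column-by-column update order. All function names (`init_factors`, `predict`, `objective`, `cd_sweep`) are hypothetical.

```python
import random


def init_factors(shape, rank, seed=0):
    """One factor matrix per mode, filled with random values in [0, 1)."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(rank)] for _ in range(dim)]
            for dim in shape]


def predict(factors, idx):
    """Model value at index tuple idx: sum over r of the product of factors."""
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for mode, i in enumerate(idx):
            term *= factors[mode][i][r]
        total += term
    return total


def objective(entries, factors, lam):
    """Squared error on observed entries plus L2 regularization."""
    err = sum((val - predict(factors, idx)) ** 2 for idx, val in entries)
    reg = sum(p * p for factor in factors for row in factor for p in row)
    return err + lam * reg


def cd_sweep(entries, factors, lam):
    """Update every parameter once, one scalar at a time (requires lam > 0)."""
    rank = len(factors[0][0])
    # Group the observed entries by their index in each mode, once.
    groups = []
    for mode in range(len(factors)):
        by_row = {}
        for idx, val in entries:
            by_row.setdefault(idx[mode], []).append((idx, val))
        groups.append(by_row)
    # Assumed order: column r, then mode, then row.
    for r in range(rank):
        for mode, factor in enumerate(factors):
            for i, row_entries in groups[mode].items():
                num, den = 0.0, lam
                for idx, val in row_entries:
                    # Product of the r-th factors of all other modes.
                    coef = 1.0
                    for m, j in enumerate(idx):
                        if m != mode:
                            coef *= factors[m][j][r]
                    # Residual with this parameter's own contribution removed.
                    resid = val - predict(factors, idx) + factor[i][r] * coef
                    num += resid * coef
                    den += coef * coef
                # Exact minimizer of the objective over this one parameter.
                factor[i][r] = num / den


# Tiny made-up 3-way tensor: (index tuple, observed value).
demo_entries = [((0, 1, 0), 2.0), ((1, 0, 1), 1.0), ((2, 2, 1), 3.0)]
demo_factors = init_factors((3, 3, 2), rank=2)
print("objective before:", objective(demo_entries, demo_factors, 0.1))
for _ in range(10):
    cd_sweep(demo_entries, demo_factors, 0.1)
print("objective after: ", objective(demo_entries, demo_factors, 0.1))
```

Because each step solves a one-variable least-squares problem exactly, the regularized objective cannot increase from one update to the next. The sketch recomputes the residual from scratch for clarity and does nothing in parallel, so it says little about the memory or speed trade-offs of the actual algorithms.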
Papers
CDTF and SALS are described in the following papers:
- Distributed Methods for High-dimensional and Large-scale Tensor Factorization.
  Kijung Shin and U Kang.
  IEEE International Conference on Data Mining (ICDM) 2014, Shenzhen, China.
  [PDF] [BIBTEX]
- Fully Scalable Methods for Distributed Tensor Factorization.
  Kijung Shin, Lee Sael, and U Kang.
  IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 29, no. 1, pp. 100-113, Jan 2017.
  [PDF] [Supplementary Document] [BIBTEX]
Code
The source code used in the papers is available:
[ver.1 (ICDM)]
[ver.2 (TKDE)]
They include:
- CDTF: Hadoop version and single-machine (multi-threaded) version
- SALS: Hadoop version and single-machine (multi-threaded) version