On optimizing distributed non-negative Tucker decomposition


We study the problem of constructing the non-negative Tucker decomposition of sparse tensors on distributed memory systems via the classical non-negative matrix factorization method. We propose three schemes - KronBU, CoreTD, and Hybrid. KronBU is our baseline that uses techniques from the HOOI algorithm for optimization, whreas CoreTD uses new techniques unique to non-negative Tucker. Both techniques use the CSF tree to traverse the data. KronBU traverses the tree bottom up, using Kronecker product to construct the TTM chain; CoreTD traverses the tree top down, using tensor contraction to construct the required data; hybrid uses both techniques to traverse the tree from both the top and the bottom, meeting in the middle to calculate the contribution from each non-zero. Our hybrid algorithm shows up to 4x speedup over KronBU on real-world data sets, taken from the FROSTT repository.

Phoenix, AZ