TY - JOUR
T1 - Accelerated Doubly Stochastic Gradient Descent for Tensor CP Decomposition
AU - Wang, Qingsong
AU - Cui, Chunfeng
AU - Han, Deren
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2023/5
Y1 - 2023/5
AB - In this paper, we focus on accelerating the doubly stochastic gradient descent method for computing the CANDECOMP/PARAFAC (CP) decomposition of tensors. This optimization problem has N blocks, where N is the order of the tensor. Under the doubly stochastic framework, each block subproblem is solved by the vanilla stochastic gradient method. However, the convergence analysis requires the variance to converge to zero, which is hard to verify in practice and may not hold in some implementations. In this paper, we propose accelerating the stochastic gradient method with momentum acceleration and a variance reduction technique; the resulting method is denoted DS-MVR. Theoretically, the convergence of DS-MVR only requires the variance to be bounded. Under mild conditions, we show that DS-MVR converges to a stochastic ε-stationary solution in Õ(N^{3/2}ε^{-3}) iterations with varying stepsizes and in O(N^{3/2}ε^{-3}) iterations with constant stepsizes, respectively. Numerical experiments on four real-world datasets show that our proposed algorithm obtains better results than the baselines.
KW - Doubly stochastic gradient descent
KW - Momentum acceleration
KW - Nonconvex optimization
KW - Tensor CANDECOMP/PARAFAC decomposition
KW - Variance reduction
UR - https://www.scopus.com/pages/publications/85150268804
U2 - 10.1007/s10957-023-02193-5
DO - 10.1007/s10957-023-02193-5
M3 - Article
AN - SCOPUS:85150268804
SN - 0022-3239
VL - 197
SP - 665
EP - 704
JO - Journal of Optimization Theory and Applications
JF - Journal of Optimization Theory and Applications
IS - 2
ER -