TY - JOUR
T1 - AdamRAG: Adaptive Algorithm with Ravine Method for Training Deep Neural Networks
AU - Zhang, Yifan
AU - Zhao, Di
AU - Li, Hongyi
AU - Pan, Chengwei
N1 - Publisher Copyright:
© The Author(s) 2025.
PY - 2025/6
Y1 - 2025/6
N2 - Adaptive optimization algorithms, such as Adam, are widely employed in deep learning. However, because they primarily rely on learning rate adjustments, a trade-off often exists between optimization stability and generalization capability. To address this issue, we propose AdamRAG, a novel optimization algorithm that integrates adaptive methods with Ravine acceleration and momentum techniques, aiming to preserve the stability of adaptive algorithms while enhancing their generalization performance. Within the adaptive framework, AdamRAG introduces extrapolation steps based on Ravine acceleration, which not only accelerate convergence but also prevent the iterative process from becoming trapped in local saddle points, thereby boosting generalization. Simultaneously, the momentum method is employed to regulate the descent step sizes, further improving the algorithm’s stability. Theoretical analysis demonstrates that AdamRAG achieves sublinear convergence in non-convex optimization scenarios. Extensive experiments across tasks such as image classification, natural language processing, and reinforcement learning validate its effectiveness, with results indicating that AdamRAG outperforms established optimizers (e.g., NAG, Adam, Lion) in terms of both convergence speed and generalization performance. Furthermore, sensitivity analysis shows that AdamRAG exhibits greater robustness to variations in learning rate, significantly reducing the need for hyperparameter tuning. These findings suggest that by integrating Ravine acceleration, adaptive methods, and momentum techniques, AdamRAG effectively mitigates the trade-off between stability and generalization, providing an efficient and robust optimization tool for deep learning applications.
AB - Adaptive optimization algorithms, such as Adam, are widely employed in deep learning. However, because they primarily rely on learning rate adjustments, a trade-off often exists between optimization stability and generalization capability. To address this issue, we propose AdamRAG, a novel optimization algorithm that integrates adaptive methods with Ravine acceleration and momentum techniques, aiming to preserve the stability of adaptive algorithms while enhancing their generalization performance. Within the adaptive framework, AdamRAG introduces extrapolation steps based on Ravine acceleration, which not only accelerate convergence but also prevent the iterative process from becoming trapped in local saddle points, thereby boosting generalization. Simultaneously, the momentum method is employed to regulate the descent step sizes, further improving the algorithm’s stability. Theoretical analysis demonstrates that AdamRAG achieves sublinear convergence in non-convex optimization scenarios. Extensive experiments across tasks such as image classification, natural language processing, and reinforcement learning validate its effectiveness, with results indicating that AdamRAG outperforms established optimizers (e.g., NAG, Adam, Lion) in terms of both convergence speed and generalization performance. Furthermore, sensitivity analysis shows that AdamRAG exhibits greater robustness to variations in learning rate, significantly reducing the need for hyperparameter tuning. These findings suggest that by integrating Ravine acceleration, adaptive methods, and momentum techniques, AdamRAG effectively mitigates the trade-off between stability and generalization, providing an efficient and robust optimization tool for deep learning applications.
KW - Adaptive algorithm
KW - Deep learning
KW - Neural networks
KW - Non-convex optimization
KW - Ravine method
UR - https://www.scopus.com/pages/publications/105006761232
U2 - 10.1007/s11063-025-11766-6
DO - 10.1007/s11063-025-11766-6
M3 - Article
AN - SCOPUS:105006761232
SN - 1370-4621
VL - 57
JO - Neural Processing Letters
JF - Neural Processing Letters
IS - 3
M1 - 53
ER -