Overcoming Catastrophic Forgetting in Continual Learning by Exploring Eigenvalues of Hessian Matrix

  • Yajing Kong
  • Liu Liu
  • Huanhuan Chen
  • Janusz Kacprzyk
  • Dacheng Tao*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Neural networks tend to suffer performance deterioration on previous tasks when they are applied to multiple tasks sequentially without access to previous data. This problem, commonly known as catastrophic forgetting, is a significant challenge in continual learning (CL). To overcome catastrophic forgetting, regularization-based CL methods construct a regularization term, which can be considered an approximation of the loss function of previous tasks, to penalize the update of parameters. However, rigorous theoretical analysis of regularization-based methods remains limited. Therefore, we theoretically analyze the forgetting and the convergence properties of regularization-based methods. The theoretical results demonstrate that the upper bound of the forgetting is related to the maximum eigenvalue of the Hessian matrix. Hence, to decrease this upper bound, we propose the eiGenvalues ExplorAtion Regularization-based (GEAR) method, which explores the geometric properties of the approximation loss of prior tasks with respect to the maximum eigenvalue. Extensive experimental results demonstrate that our method mitigates catastrophic forgetting and outperforms existing regularization-based methods.
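The abstract ties the forgetting bound to the maximum eigenvalue of the Hessian of the prior tasks' approximation loss. As an illustrative sketch (not the paper's implementation), that eigenvalue can be estimated by power iteration using only Hessian-vector products, avoiding materializing the full Hessian; the toy quadratic loss and helper below are hypothetical stand-ins for a real task loss.

```python
import numpy as np

def max_hessian_eigenvalue(hvp, dim, iters=100, seed=0):
    """Estimate the largest eigenvalue of a symmetric PSD Hessian by
    power iteration, given only a Hessian-vector product `hvp: v -> H @ v`."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        hv = hvp(v)
        v = hv / np.linalg.norm(hv)
    # Rayleigh quotient of the converged vector approximates the top eigenvalue.
    return float(v @ hvp(v))

# Toy quadratic loss L(w) = 0.5 * w^T H w with a known diagonal Hessian,
# so the true maximum eigenvalue (4.0) is visible by inspection.
H = np.diag([4.0, 1.0, 0.5])
est = max_hessian_eigenvalue(lambda v: H @ v, dim=3)
# A GEAR-style regularizer would penalize curvature like this estimate,
# keeping the approximation loss of prior tasks flat along the update.
```

In a deep-learning framework, the Hessian-vector product would come from a double backward pass rather than an explicit matrix, but the power-iteration loop is unchanged.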

Original language: English
Pages (from-to): 16196-16210
Number of pages: 15
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 35
Issue number: 11
DOIs
State: Published - 2024
Externally published: Yes

Keywords

  • Catastrophic forgetting
  • continual learning (CL)
  • incremental learning
  • lifelong learning
