
Importance-SMOTE: a synthetic minority oversampling method for noisy imbalanced data

  • Jie Liu*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Synthetic minority oversampling methods have proven to be an efficient solution to imbalanced data classification, and different strategies have been proposed for generating synthetic minority samples. However, noisy samples, which may cause the minority and majority classes to overlap, have not yet been treated properly to reduce their influence on the performance of a classification model. This paper proposes a new method, named Importance-SMOTE. In this method, only borderline and edge samples in the minority class are oversampled. Synthetic minority samples are generated in proportion to the importance of each minority sample, which is calculated from the composition and distribution of its nearest neighbors. The positions of the synthetic minority samples are determined by the relative importance of the paired neighbors. The proposed method is expected to obtain a more precise estimate of the true decision surface and to reduce the influence of noisy samples. Experiments on various public imbalanced datasets and a real case study are used to demonstrate the effectiveness of the proposed method.
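
The abstract does not give the importance formula or the interpolation rule, so the following is only a rough sketch of the general idea under stated assumptions, not the paper's algorithm. It assumes that importance is the fraction of majority points among a sample's k nearest neighbors, that samples with all-majority neighborhoods are noise and samples with all-minority neighborhoods are safe (both get zero weight), and that a synthetic point is pulled toward the more important sample of a pair. It does not model edge samples separately. The function name `importance_oversample` and all of its parameters are hypothetical.

```python
import numpy as np


def importance_oversample(X_min, X_maj, n_new, k=5, seed=0):
    """Illustrative importance-weighted oversampling (assumed formulas)."""
    rng = np.random.default_rng(seed)
    X_all = np.vstack([X_min, X_maj])
    n_min = len(X_min)
    is_min = np.arange(len(X_all)) < n_min

    # k nearest neighbours of every minority sample among all samples.
    d = np.linalg.norm(X_min[:, None, :] - X_all[None, :, :], axis=2)
    d[np.arange(n_min), np.arange(n_min)] = np.inf  # exclude the sample itself
    nn = np.argsort(d, axis=1)[:, :k]

    # Assumption: importance = share of majority points in the neighbourhood.
    maj_frac = 1.0 - is_min[nn].mean(axis=1)
    # Assumption: all-majority = noise, all-minority = safe; both are skipped.
    w = np.where((maj_frac > 0) & (maj_frac < 1), maj_frac, 0.0)
    if np.count_nonzero(w) < 2:
        raise ValueError("need at least two borderline minority samples")
    p = w / w.sum()

    # Distances between minority samples; only weighted samples can be partners.
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d_min, np.inf)
    d_min[:, w == 0] = np.inf

    new = np.empty((n_new, X_min.shape[1]))
    for n in range(n_new):
        i = rng.choice(n_min, p=p)  # seeds drawn in proportion to importance
        cand = np.argsort(d_min[i])[:k]
        cand = cand[np.isfinite(d_min[i, cand])]
        j = rng.choice(cand)
        # Assumption: t has mean w[j] / (w[i] + w[j]), so the synthetic point
        # tends to land closer to the more important sample of the pair.
        t = rng.uniform() ** (w[i] / w[j])
        new[n] = X_min[i] + t * (X_min[j] - X_min[i])
    return new
```

Called as `importance_oversample(X_min, X_maj, n_new=20, k=3)` with two float arrays of shape `(n_samples, n_features)`, the sketch returns 20 synthetic minority points, each lying on a segment between two borderline minority samples. The brute-force distance matrices keep the example short but scale quadratically, so a neighbor-search structure would be preferable on larger datasets.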

Original language: English
Pages (from-to): 1141-1163
Number of pages: 23
Journal: Soft Computing
Volume: 26
Issue number: 3
State: Published - Feb 2022

Keywords

  • Imbalanced data
  • Minority oversampling
  • Noisy samples
  • Overlapped distribution
  • Sample importance
