
Adversarial Word Dilution as Text Data Augmentation in Low-Resource Regime

  • Zhongguancun Laboratory
  • Beihang University
  • University of Ottawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Data augmentation is widely used in text classification, especially in the low-resource regime where only a few examples per class are available during training. Despite this success, generating augmentations as hard positive examples, which could increase their effectiveness, remains under-explored. This paper proposes an Adversarial Word Dilution (AWD) method that generates hard positive examples as text data augmentations to train low-resource text classification models efficiently. Our idea is to dilute the embeddings of strong positive words by weighted mixing with the unknown-word embedding, making the augmented inputs hard for the classification model to recognize as positive. We adversarially learn the dilution weights through a constrained min-max optimization process guided by the labels. Empirical studies on three benchmark datasets show that AWD generates more effective data augmentations and outperforms state-of-the-art text data augmentation methods. Additional analysis demonstrates that the augmentations generated by AWD are interpretable and can flexibly extend to new examples without further training.
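The abstract describes the core dilution step as a weighted mix of a word's embedding with the unknown-word embedding. The paper learns the per-word dilution weights adversarially under a constraint; the minimal sketch below only illustrates the mixing operation itself, with a fixed hypothetical weight `w` and toy 4-dimensional embeddings (`e_great`, `e_unk` are made-up values, not from the paper).

```python
import numpy as np

def dilute(word_emb, unk_emb, w):
    """Mix a word embedding with the unknown-word embedding.

    w in [0, 1]: 0 keeps the original word embedding, 1 replaces it
    entirely with the unknown-word embedding. In AWD this weight is
    learned adversarially; here it is a fixed illustrative parameter.
    """
    return (1.0 - w) * word_emb + w * unk_emb

# Toy 4-dimensional embeddings (hypothetical values).
e_great = np.array([0.9, 0.1, 0.4, 0.7])   # a "strong positive" word
e_unk = np.array([0.0, 0.0, 0.0, 0.0])     # unknown-word embedding

# Half-diluted embedding: harder for a classifier to read as positive.
diluted = dilute(e_great, e_unk, w=0.5)
print(diluted)
```

With `w=0.5` the result lies halfway between the word and the unknown-word embedding; sweeping `w` toward 1 progressively weakens the positive signal while keeping the input in embedding space.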

Original language: English
Title of host publication: AAAI-23 Technical Tracks 11
Editors: Brian Williams, Yiling Chen, Jennifer Neville
Publisher: AAAI Press
Pages: 12626-12634
Number of pages: 9
ISBN (electronic): 9781577358800
DOI
Publication status: Published - 27 Jun 2023
Event: 37th AAAI Conference on Artificial Intelligence, AAAI 2023 - Washington, United States
Duration: 7 Feb 2023 → 14 Feb 2023

Publication series

Name: Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Volume: 37

Conference

Conference: 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Country/Territory: United States
City: Washington
Period: 7/02/23 → 14/02/23
