TY - GEN
T1 - DropMix
T2 - 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
AU - Kong, Fanshuang
AU - Zhang, Richong
AU - Guo, Xiaohui
AU - Mensah, Samuel
AU - Mao, Yongyi
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - Overfitting is a common problem when there is insufficient data to train deep neural networks in machine learning tasks. Data augmentation regularization methods such as Dropout, Mixup, and their enhanced variants, are effective and prevalent, and achieve promising performance to overcome overfitting. However, in text learning, most of the existing regularization approaches merely adopt ideas from computer vision without considering the importance of dimensionality in natural language processing. In this paper, we argue that the property is essential to overcome overfitting in text learning. Accordingly, we present a saliency map informed textual data augmentation and regularization framework, which combines Dropout and Mixup, namely DropMix, to mitigate the overfitting problem in text learning. In addition, we design a procedure that drops and patches fine grained shapes of the saliency map under the DropMix framework to enhance regularization. Empirical studies confirm the effectiveness of the proposed approach on 12 text classification tasks.
AB - Overfitting is a common problem when there is insufficient data to train deep neural networks in machine learning tasks. Data augmentation regularization methods such as Dropout, Mixup, and their enhanced variants, are effective and prevalent, and achieve promising performance to overcome overfitting. However, in text learning, most of the existing regularization approaches merely adopt ideas from computer vision without considering the importance of dimensionality in natural language processing. In this paper, we argue that the property is essential to overcome overfitting in text learning. Accordingly, we present a saliency map informed textual data augmentation and regularization framework, which combines Dropout and Mixup, namely DropMix, to mitigate the overfitting problem in text learning. In addition, we design a procedure that drops and patches fine grained shapes of the saliency map under the DropMix framework to enhance regularization. Empirical studies confirm the effectiveness of the proposed approach on 12 text classification tasks.
UR - https://www.scopus.com/pages/publications/85149434353
U2 - 10.18653/v1/2022.emnlp-main.57
DO - 10.18653/v1/2022.emnlp-main.57
M3 - 会议稿件
AN - SCOPUS:85149434353
T3 - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
SP - 890
EP - 899
BT - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
A2 - Goldberg, Yoav
A2 - Kozareva, Zornitsa
A2 - Zhang, Yue
PB - Association for Computational Linguistics (ACL)
Y2 - 7 December 2022 through 11 December 2022
ER -