Fuzz testing based data augmentation to improve robustness of deep neural networks

  • Xiang Gao
  • Ripon K. Saha
  • Mukul R. Prasad
  • Abhik Roychoudhury

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep neural networks (DNNs) have been shown to be notoriously brittle to small perturbations in their input data. This problem is analogous to the over-fitting problem in test-based program synthesis and automatic program repair, which is a consequence of the incomplete specification, i.e., the limited tests or training examples, that the program synthesis or repair algorithm has to learn from. Recently, test generation techniques have been successfully employed to augment existing specifications of intended program behavior, to improve the generalizability of program synthesis and repair. Inspired by these approaches, in this paper we propose a technique that re-purposes software testing methods, specifically mutation-based fuzzing, to augment the training data of DNNs, with the objective of enhancing their robustness. Our technique casts the DNN data augmentation problem as an optimization problem. It uses genetic search to generate the most suitable variant of an input data item to use for training the DNN, while simultaneously identifying opportunities to accelerate training by skipping augmentation in many instances. We instantiate this technique in two tools, Sensei and Sensei-SA, and evaluate them on 15 DNN models spanning 5 popular image datasets. Our evaluation shows that Sensei can improve the robust accuracy of the DNN, compared to the state of the art, on each of the 15 models, by up to 11.9%, and by 5.5% on average. Further, Sensei-SA can reduce the average DNN training time by 25%, while still improving robust accuracy.
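
The search described in the abstract can be illustrated with a small, hypothetical sketch. It is not the Sensei implementation. It assumes, beyond what the abstract states, that the "most suitable" variant is the one with the highest model loss, that a variant is described by a dictionary of transformation parameters where 0.0 means "no change", and that augmentation is skipped when the loss on the unmodified input is already below a threshold. The names `search_variant`, `loss_fn`, `bounds`, and `skip_threshold` are illustrative only.

```python
import random


def mutate(params, bounds, rng, step=0.2):
    """Perturb each transformation parameter slightly, clamped to its bounds."""
    child = {}
    for name, value in params.items():
        lo, hi = bounds[name]
        delta = rng.uniform(-step, step) * (hi - lo)
        child[name] = min(hi, max(lo, value + delta))
    return child


def crossover(a, b, rng):
    """Build a child by picking each parameter from one of two parents."""
    return {name: (a[name] if rng.random() < 0.5 else b[name]) for name in a}


def search_variant(loss_fn, bounds, rng, pop_size=8, generations=5,
                   skip_threshold=None):
    """Return (best_params, skipped) for one training input.

    loss_fn maps a parameter dict to the model's loss on the transformed
    input (assumed fitness: higher loss = more useful for training).
    """
    # Assumption: 0.0 for every parameter is the identity transformation.
    identity = {name: 0.0 for name in bounds}
    # Assumed skipping rule: leave inputs the model already handles well.
    if skip_threshold is not None and loss_fn(identity) < skip_threshold:
        return identity, True

    population = [identity] + [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        # Keep the highest-loss half, refill with mutated offspring.
        population.sort(key=loss_fn, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   bounds, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=loss_fn), False
```

In a training loop, `loss_fn` would apply the transformation to the input and evaluate the current model; here it is left abstract so that the sketch stays independent of any particular framework.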

Original language: English
Title of host publication: Proceedings - 2020 ACM/IEEE 42nd International Conference on Software Engineering, ICSE 2020
Publisher: IEEE Computer Society
Pages: 1147-1158
Number of pages: 12
ISBN (Electronic): 9781450371216
State: Published - 27 Jun 2020
Externally published: Yes
Event: 42nd ACM/IEEE International Conference on Software Engineering, ICSE 2020 - Virtual, Online, Korea, Republic of
Duration: 27 Jun 2020 → 19 Jul 2020

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (Print): 0270-5257

Conference

Conference: 42nd ACM/IEEE International Conference on Software Engineering, ICSE 2020
Country/Territory: Korea, Republic of
City: Virtual, Online
Period: 27/06/20 → 19/07/20

Keywords

  • Data augmentation
  • DNN
  • Genetic algorithm
  • Robustness
