TY - GEN
T1 - Efficient Sparse Attacks on Videos using Reinforcement Learning
AU - Yan, Huanqian
AU - Wei, Xingxing
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/10/17
Y1 - 2021/10/17
AB - More and more deep neural network models have been deployed in real-time video systems. However, deep models have been shown to be susceptible to crafted adversarial examples, which are imperceptible yet cause normal deep models to misclassify them. Although a few works address adversarial examples for video recognition in the black-box attack setting, most of them need large perturbations or hundreds of thousands of queries. There is still a lack of effective methods that produce adversarial videos with small perturbations and a limited number of queries at the same time. In this paper, an efficient and powerful method is proposed for adversarial video attacks in the black-box attack setting. Like previous work, the proposed method is based on Reinforcement Learning (RL), i.e., it uses an RL agent to adaptively find the sparse key frames to perturb. The key difference is that we design new reward functions based on the loss reduction and the perturbation increment, and thus propose an efficient update mechanism that guides the agent to finish the attacks with smaller perturbations and fewer queries. The proposed algorithm has a new working mechanism. It is simple, efficient, and effective. Extensive experiments show that our method achieves a good trade-off between the perturbation amplitude and the number of queries. Compared with state-of-the-art algorithms, it reduces the number of queries by 65.75% without image quality loss in un-targeted attacks, and simultaneously reduces perturbations by 22.47% and the number of queries by 54.77% in targeted attacks.
KW - action recognition
KW - adversarial examples
KW - black-box video attack
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85119384689
U2 - 10.1145/3474085.3475395
DO - 10.1145/3474085.3475395
M3 - Conference contribution
AN - SCOPUS:85119384689
T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
SP - 2326
EP - 2334
BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Multimedia, MM 2021
Y2 - 20 October 2021 through 24 October 2021
ER -