Adversarial Example Generation Method Based on Sensitive Features

Zerui WEN; Zhidong SHEN; Hui SUN; Baiwen QI

doi:10.1051/wujns/2023281035


Volume 28 / No 1 (February 2023)

Wuhan Univ. J. Nat. Sci., 28 1 (2023) 35-44


Open Access

Issue		Wuhan Univ. J. Nat. Sci. Volume 28, Number 1, February 2023


Page(s)		35 - 44
DOI		https://doi.org/10.1051/wujns/2023281035
Published online		17 March 2023

Wuhan University Journal of Natural Sciences, 2023, Vol.28 No.1, 35-44

Computer Science

CLC number: TP 391.4

Adversarial Example Generation Method Based on Sensitive Features

Zerui WEN¹, Zhidong SHEN¹,³,†, Hui SUN² and Baiwen QI²

¹ Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430079, Hubei, China
² Zhongnan Hospital, Wuhan University, Wuhan 430072, Hubei, China
³ Engineering Research Center of Cyberspace, Yunnan University, Kunming 650504, Yunnan, China

† To whom correspondence should be addressed. E-mail: shenzd@whu.edu.cn

Received: 6 September 2022

Abstract

As deep learning models have made remarkable strides in numerous fields, a variety of adversarial attack methods have emerged to interfere with them. An adversarial example is produced by applying a minute perturbation to the original image; the perturbation is imperceptible to humans but causes a large error in the deep learning model. Existing attack methods achieve good results when the network structure is known. However, when the network structure is unknown, the effectiveness of the attacks still needs to be improved. Transfer-based attacks are therefore popular because of their convenience and practicality: adversarial samples generated on known models can be used to attack unknown models. In this paper, we extract sensitive features with Grad-CAM and propose two single-step attack methods and a multi-step attack method that corrupt these sensitive features. Of the two single-step attacks, one corrupts the features extracted from a single model and the other corrupts the features extracted from multiple models. The multi-step attack improves on an existing attack method, enhancing the transferability of adversarial samples to achieve better results on unknown models. Our method is also validated on CIFAR-10 and MNIST, where it achieves a 1%-3% improvement in transferability.
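The general idea of a single-step attack restricted to sensitive features can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy logistic-regression classifier in place of a deep network, a signed-gradient step as the single-step update, and a hypothetical `sensitivity_mask` helper that ranks features by |weight × input| as a crude stand-in for a Grad-CAM heat map. All function names, weights and inputs below are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model (assumed)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sensitivity_mask(w, x, k):
    """Hypothetical stand-in for a Grad-CAM heat map: mark the k features with
    the largest |weight * input| contribution. This is NOT Grad-CAM itself."""
    scores = [abs(wi * xi) for wi, xi in zip(w, x)]
    top = sorted(range(len(x)), key=lambda i: scores[i], reverse=True)[:k]
    return [1.0 if i in top else 0.0 for i in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def masked_single_step_attack(w, b, x, y, eps, mask):
    """One signed-gradient step applied only where mask == 1, clipped to [0, 1]."""
    grad = loss_gradient(w, b, x, y)
    return [min(1.0, max(0.0, xi + eps * mi * sign(gi)))
            for xi, gi, mi in zip(x, grad, mask)]

# Illustrative values only (not taken from the paper).
w = [2.0, -1.5, 0.1, 0.05]
b = -0.2
x = [0.6, 0.2, 0.5, 0.5]
y = 1
eps = 0.3

mask = sensitivity_mask(w, x, k=2)
x_adv = masked_single_step_attack(w, b, x, y, eps, mask)
print("clean probability of class 1:      ", predict(w, b, x))
print("adversarial probability of class 1:", predict(w, b, x_adv))
```

In this toy setting only the two masked features are changed, each by at most `eps`, yet the predicted probability of the true class drops below 0.5. Transferability, the focus of the paper, would additionally require evaluating `x_adv` on a second model that was not used to compute the gradient.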

Key words: deep learning model / adversarial example / transferability / sensitive characteristics / AI security

Biography: WEN Zerui, male, Master candidate, research direction: AI security and adversarial attack. E-mail: zeruiwen2018@163.com

Foundation item: Supported by the Key R&D Projects in Hubei Province (2022BAA041 and 2021BCA124) and the Open Foundation of Engineering Research Center of Cyberspace (KJAQ202112002)

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
