TY - GEN
T1 - Exploring the Target Distribution for Surrogate-Based Black-Box Attacks
AU - Moraffah, Raha
AU - Sheth, Paras
AU - Liu, Huan
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Deep Neural Networks have been shown to be vulnerable to adversarial attacks. In the black-box setting, where no information about the target is available, surrogate-based black-box attacks train a surrogate on samples queried from the target to imitate the black-box's behavior. The trained surrogate is then attacked to generate adversarial examples. Existing surrogate-based attacks suffer from low success rates because they fail to accurately capture the target's behavior, i.e., their surrogates only mimic the target's outputs for a given set of inputs. Moreover, their attack strategy relies on noisy estimations of high-dimensional gradients w.r.t. the inputs (i.e., the surrogate's gradients) to generate adversarial examples. Ideally, a successful surrogate-based attack should possess two properties: (1) Train and employ a surrogate that accurately imitates the target's behavior for every pair of input and output, i.e., the joint distribution of the target over its inputs and outputs; and (2) Generate adversarial examples by directly manipulating the class-dependent factors of the input, i.e., factors that affect the target's output, rather than relying on noisy estimations of gradients. We propose a novel surrogate-based attack framework with a surrogate architecture that learns the target distribution over its inputs and outputs while disentangling the class-dependent factors from class-irrelevant ones. The framework is equipped with a novel attack strategy that fully utilizes the target distribution captured by the surrogate while generating adversarial examples by directly manipulating the class-dependent factors. Extensive experiments demonstrate the efficacy of our attack in generating highly successful adversarial examples compared to state-of-the-art methods.
AB - Deep Neural Networks have been shown to be vulnerable to adversarial attacks. In the black-box setting, where no information about the target is available, surrogate-based black-box attacks train a surrogate on samples queried from the target to imitate the black-box's behavior. The trained surrogate is then attacked to generate adversarial examples. Existing surrogate-based attacks suffer from low success rates because they fail to accurately capture the target's behavior, i.e., their surrogates only mimic the target's outputs for a given set of inputs. Moreover, their attack strategy relies on noisy estimations of high-dimensional gradients w.r.t. the inputs (i.e., the surrogate's gradients) to generate adversarial examples. Ideally, a successful surrogate-based attack should possess two properties: (1) Train and employ a surrogate that accurately imitates the target's behavior for every pair of input and output, i.e., the joint distribution of the target over its inputs and outputs; and (2) Generate adversarial examples by directly manipulating the class-dependent factors of the input, i.e., factors that affect the target's output, rather than relying on noisy estimations of gradients. We propose a novel surrogate-based attack framework with a surrogate architecture that learns the target distribution over its inputs and outputs while disentangling the class-dependent factors from class-irrelevant ones. The framework is equipped with a novel attack strategy that fully utilizes the target distribution captured by the surrogate while generating adversarial examples by directly manipulating the class-dependent factors. Extensive experiments demonstrate the efficacy of our attack in generating highly successful adversarial examples compared to state-of-the-art methods.
KW - Black-Box Adversarial Attack
KW - Disentanglement
KW - Model-Stealing
KW - Surrogate-based Attacks
KW - VAE
UR - http://www.scopus.com/inward/record.url?scp=85147905728&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147905728&partnerID=8YFLogxK
U2 - 10.1109/BigData55660.2022.10021089
DO - 10.1109/BigData55660.2022.10021089
M3 - Conference contribution
AN - SCOPUS:85147905728
T3 - Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
SP - 1310
EP - 1315
BT - Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
A2 - Tsumoto, Shusaku
A2 - Ohsawa, Yukio
A2 - Chen, Lei
A2 - Van den Poel, Dirk
A2 - Hu, Xiaohua
A2 - Motomura, Yoichi
A2 - Takagi, Takuya
A2 - Wu, Lingfei
A2 - Xie, Ying
A2 - Abe, Akihiro
A2 - Raghavan, Vijay
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Big Data, Big Data 2022
Y2 - 17 December 2022 through 20 December 2022
ER -