poisoning-based-backdoor-attack-survey
Edited: 2024-07-07
A backdoor is also commonly called a neural trojan or simply a trojan. The survey Backdoor Learning: A Survey by Yiming Li uses "backdoor" instead of the other terms since it is the most frequently used.
Backdoor Attack Types
Taxonomy of poisoning-based backdoor attacks with different categorization criteria. In this figure, the red boxes represent categorization criteria, while the blue boxes indicate attack subcategories.
Referenced from the paper Backdoor Learning: A Survey by Yiming Li:
Part 1
Part 2
Table
Examples of poisoned samples generated by different types of backdoor attacks:
First Backdoor Attack
Gu et al. (BadNets: Evaluating Backdooring Attacks on Deep Neural Networks) introduced the first backdoor attack in deep learning by poisoning some training samples. This method is called BadNets.
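As a rough illustration (the function name, trigger size, and poisoning rate below are my own assumptions, not the paper's exact settings), a BadNets-style poisoner stamps a small patch trigger onto a random fraction of training images and relabels them to an attacker-chosen target class:

```python
import numpy as np

def poison_badnets(images, labels, target_label, poison_rate=0.1,
                   patch_value=1.0, patch_size=3):
    """Stamp a white square trigger into the bottom-right corner of a random
    subset of images and relabel them to the attacker-chosen target class.

    images: float array of shape (N, H, W, C) with values in [0, 1]
    labels: int array of shape (N,)
    """
    images = images.copy()
    labels = labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = np.random.choice(len(images), n_poison, replace=False)
    # Place the patch trigger in the bottom-right corner of each selected image.
    images[idx, -patch_size:, -patch_size:, :] = patch_value
    # Assign the attacker-chosen target label to every poisoned sample.
    labels[idx] = target_label
    return images, labels

# Usage (illustrative): poison 10% of a CIFAR-10-like training set toward class 0.
# x_poisoned, y_poisoned = poison_badnets(x_train, y_train, target_label=0)
```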
First Invisible Backdoor Attacks
Chen et al. (Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning) first discussed the invisibility requirement of poisoning-based backdoor attacks. They suggested that the poisoned image should be indistinguishable from its benign version to evade human inspection. To fulfill this requirement, they proposed a blended strategy, in which the trigger is mixed into the benign image at a low blend ratio rather than stamped on top of it.
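A minimal sketch of the blended strategy under the usual formulation, where the poisoned image is a convex combination of the benign image and a trigger pattern (the blend ratio below is an illustrative assumption):

```python
import numpy as np

def blend_trigger(image, trigger, alpha=0.1):
    """Blend a trigger pattern (e.g., a faint logo image) into a benign image.
    A small alpha keeps the poisoned image visually close to the original,
    which is what makes the blended strategy hard to spot by human inspection.

    image, trigger: arrays of the same shape with values in [0, 1]
    """
    return (1.0 - alpha) * image + alpha * trigger

# Usage (illustrative): blend a random noise pattern into a 32x32 RGB image.
# poisoned = blend_trigger(benign_image, np.random.rand(32, 32, 3), alpha=0.1)
```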
Optimized Backdoor Attacks
Triggers are the core of poisoning-based attacks. As such, analyzing how to design a better trigger, instead of simply using a given non-optimized patch, is of great significance and has attracted some attention. Liu et al. (Trojaning Attack on Neural Networks) first explored this problem; they proposed to optimize the trigger so that selected important neurons achieve their maximum activation values.
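A PyTorch sketch of this idea, assuming white-box access to the model; the layer handle, neuron indices, and trigger mask are placeholders I introduce for illustration, not the paper's implementation:

```python
import torch

def optimize_trigger(model, layer, neuron_idx, mask, steps=200, lr=0.1):
    """Optimize the pixels inside `mask` so that selected neurons of an
    internal `layer` reach high activation values (trigger generation by
    neuron-activation maximization).

    mask: tensor of shape (1, C, H, W) with 1s marking the trigger region.
    neuron_idx: indices of the target neurons in the chosen layer's flattened output.
    """
    activation = {}
    handle = layer.register_forward_hook(
        lambda m, inp, out: activation.update(value=out))

    trigger = torch.rand(1, *mask.shape[1:], requires_grad=True)
    optimizer = torch.optim.Adam([trigger], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        model(trigger * mask)                 # forward pass with only the trigger region
        neurons = activation["value"].flatten(1)[:, neuron_idx]
        loss = -neurons.sum()                 # maximize the selected activations
        loss.backward()
        optimizer.step()
        trigger.data.clamp_(0.0, 1.0)         # keep the trigger a valid image patch
    handle.remove()
    return (trigger * mask).detach()
```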
First Semantic Backdoor Attacks
Bagdasaryan et al. first explored this problem and proposed a novel type of backdoor attack (Blind Backdoors in Deep Learning Models; How to Backdoor Federated Learning), i.e., the semantic backdoor attack. Specifically, they demonstrated that assigning an attacker-chosen label to all training images with certain features, e.g., green cars or cars with racing stripes, can create semantic backdoors in the infected DNNs. Accordingly, the infected model will automatically misclassify testing images containing the predefined semantic information without any image modification.
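Conceptually, no pixels need to be touched; a minimal sketch of the relabeling step (the `has_feature` predicate is a hypothetical helper standing in for metadata or manual selection):

```python
def poison_semantic(samples, target_label, has_feature):
    """Relabel every training sample that carries the chosen semantic feature
    (e.g., "green car") to the attacker-chosen target label. No pixels are
    modified; the feature itself acts as the trigger.

    samples: list of (image, label) pairs
    has_feature: predicate deciding whether an image shows the chosen feature
    """
    return [(img, target_label if has_feature(img) else lbl)
            for img, lbl in samples]
```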
First Sample-Specific Backdoor Attack
Nguyen and Tran (Input-aware dynamic backdoor attack) proposed the first sample-specific backdoor attack, where different poisoned samples contain different trigger patterns.
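A toy sketch of the sample-specific idea, in which a generator network maps each input to its own trigger perturbation (the architecture and training objective of the actual input-aware attack differ; this only illustrates that triggers vary per sample):

```python
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    """Map each input image to its own trigger pattern, so every poisoned
    sample carries a different trigger."""

    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, channels, kernel_size=3, padding=1),
            nn.Tanh(),  # trigger residual in [-1, 1]
        )

    def forward(self, x, strength=0.1):
        # Add a small, input-dependent perturbation as the trigger.
        return torch.clamp(x + strength * self.net(x), 0.0, 1.0)
```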
First Physical Backdoor Attack
Chen et al. (Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning) first explored the landscape of this attack: they adopted a pair of glasses as the physical trigger to mislead an infected face recognition system deployed behind a camera. Further exploration of attacking face recognition in the physical world was discussed by Wenger et al. (Backdoor Attacks Against Deep Learning Systems in the Physical World). A similar idea was also discussed in (BadNets: Evaluating Backdooring Attacks on Deep Neural Networks), where a post-it note was adopted as the trigger to attack traffic sign recognition deployed behind a camera. Recently, Li et al. (Backdoor Attack in the Physical World) demonstrated that existing digital attacks fail in the physical world, since the involved transformations (e.g., rotation and shrinkage) change the location and appearance of triggers in the attacked samples.
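To see why, one can simulate camera-like distortions on digitally poisoned inputs and measure how the attack success rate changes; the transforms and parameters below are illustrative assumptions, not the paper's evaluation protocol:

```python
import torchvision.transforms as T

# Simulate camera-induced distortions (rotation and shrinking) on digitally
# poisoned images; attack success typically drops once the trigger's location
# and appearance change.
physical_simulation = T.Compose([
    T.RandomRotation(degrees=15),
    T.RandomResizedCrop(size=32, scale=(0.7, 1.0)),
])

# Usage (illustrative): evaluate the infected model on transformed poisoned inputs.
# preds = model(physical_simulation(x_poisoned)).argmax(dim=1)
# attack_success_rate = (preds == target_label).float().mean()
```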
Black-Box Backdoor Attacks
Different from previous white-box attacks, which require access to the training samples, black-box attacks adopt the setting in which the training set is inaccessible. In practice, the training dataset is usually not shared due to privacy or copyright concerns; therefore, black-box attacks are more realistic than white-box ones. In general, black-box backdoor attackers first generate some substitute training samples. For example, in (Trojaning Attack on Neural Networks), attackers generated representative images of each class by optimizing images initialized from another dataset until the prediction confidence of the selected class reached its maximum. With the substitute training samples, white-box attacks can then be adopted for backdoor injection. Black-box backdoor attacks are significantly more difficult than white-box ones, and there have been only a few works in this area.
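A sketch of the substitute-sample generation step under this setting (the training set is inaccessible, but the model itself, and hence its gradients, is available); the function name and hyperparameters are my own assumptions:

```python
import torch
import torch.nn.functional as F

def generate_substitute(model, target_class, init_image, steps=300, lr=0.05):
    """Optimize an image (initialized from another dataset) until the victim
    model's confidence for `target_class` is maximized, yielding a substitute
    training sample for that class.

    init_image: tensor of shape (1, C, H, W) with values in [0, 1]
    """
    x = init_image.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x)
        # Minimizing cross-entropy to the selected class maximizes its confidence.
        loss = F.cross_entropy(logits, torch.tensor([target_class]))
        loss.backward()
        optimizer.step()
        x.data.clamp_(0.0, 1.0)  # keep the substitute sample a valid image
    return x.detach()
```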
Dataset
Summary of benchmark datasets used in image recognition.