On adversarial examples and their generation methods in deep learning
2022-07-02 07:58:00 【MezereonXP】
Deep learning models are widely used in many fields, such as image classification, natural language processing, and autonomous driving. Deep networks such as ResNet and VGG have achieved good results in these fields, in some cases even exceeding human-level performance. However, the 2014 work of Szegedy et al. (Intriguing properties of neural networks) revealed the vulnerability of deep networks: a small perturbation of the input can make a trained model output a wrong result. Take the classic panda picture as an example:
In that picture, the panda image on the left is classified as a panda by the model with 57.7% confidence. After a small amount of noise is added, the image still looks like a panda to the naked eye, but the model classifies it as a gibbon with 99.3% confidence.
Such a noisy sample is called an adversarial example, and a method for generating adversarial examples is a kind of attack.
Such attacks can generally be divided into two categories:
- Black-box attack
- White-box attack
A black-box attack generally assumes that the attacker cannot interfere with the training process and does not know the specific parameters of the model. Only the final output is available, namely the probability vector after the softmax layer.
A white-box attack generally assumes that the attacker can obtain the specific parameters of the model, including the weights of each convolution kernel.
Beyond the black-box and white-box distinction, attacks can be further divided into:
- Targeted attack
- Untargeted attack
In a targeted attack, the attacker has a specific class in mind and wants the generated adversarial example to be classified into that class.
An untargeted attack is simpler: it only needs to make the model's classification wrong, and it does not matter which class the sample ends up in.
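The difference between the two goals can be written as a success check. This is a minimal sketch; the function name `attack_succeeded` and the example probabilities are made up for illustration.

```python
import numpy as np

def attack_succeeded(probs, true_label, target_label=None):
    """Check whether an adversarial prediction meets the attack goal.

    probs        : model output (softmax probabilities) for the adversarial input
    true_label   : index of the correct class
    target_label : index of the class the attacker wants, or None if untargeted
    """
    predicted = int(np.argmax(probs))
    if target_label is None:
        # Untargeted: any wrong class counts as success.
        return predicted != true_label
    # Targeted: only the chosen class counts as success.
    return predicted == target_label

# Hypothetical output for an adversarial input whose true class is 0.
probs = np.array([0.10, 0.70, 0.20])
print(attack_succeeded(probs, true_label=0))                  # untargeted goal
print(attack_succeeded(probs, true_label=0, target_label=2))  # targeted at class 2
```

The same adversarial input can succeed as an untargeted attack while failing a targeted one, which is why targeted attacks are usually harder.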
Main content
This article reviews some well-known attacks from recent years, including FGSM, JSMA, C&W, PGD, MIM, BIM, and EAD. It discusses the idea behind each attack and some of its details, then takes a recent defense paper as an example to look at how effective these attacks are.
FGSM stands for Fast Gradient Sign Method. It was proposed by Goodfellow et al. in the ICLR 2015 paper Explaining and Harnessing Adversarial Examples and generates adversarial examples from the sign of the gradient. The core formula is:
$$x^{adv} = x + \varepsilon\cdot \mathrm{sign}(\nabla_x J(x,y))$$
Here $x, y$ are the clean sample and its corresponding label, where the label is a one-hot vector.
The function $J(\cdot,\cdot)$ is the cross-entropy function, and $x^{adv}$ is the adversarial example corresponding to $x$.
The function $\mathrm{sign}(\cdot)$ is the sign function: it returns 1 for a positive number, -1 for a negative number, and 0 for 0.
Cross-entropy is the loss function usually minimized during training. Here its gradient is used directly: following the idea of backpropagation, the gradient is taken with respect to the input sample rather than the weights, and the sample is moved in the direction that makes the loss larger. Note that only the sign of the gradient is kept, so it represents a direction of change. The parameter $\varepsilon$ controls the size of the noise. If it is too large, the change becomes visible to the human eye and the result no longer counts as an adversarial example. A typical setting is 8/255.
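The FGSM step can be sketched in a few lines of NumPy. To stay self-contained, the sketch assumes a toy linear softmax classifier (logits `W @ x + b`), for which the input gradient has the closed form `W.T @ (p - y)`. The weights, the input, and the final clipping to [0, 1] are illustrative assumptions, not part of the original formula.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()

def cross_entropy(W, b, x, y_onehot):
    """J(x, y) for a toy linear classifier with logits W @ x + b."""
    return -float(np.sum(y_onehot * np.log(softmax(W @ x + b))))

def grad_wrt_input(W, b, x, y_onehot):
    """Gradient of J(x, y) with respect to the input x (not the weights)."""
    return W.T @ (softmax(W @ x + b) - y_onehot)

def fgsm(W, b, x, y_onehot, eps=8 / 255):
    """x_adv = x + eps * sign(grad_x J(x, y)), clipped to the pixel range [0, 1]."""
    x_adv = x + eps * np.sign(grad_wrt_input(W, b, x, y_onehot))
    return np.clip(x_adv, 0.0, 1.0)  # clipping is an added assumption

# Hypothetical toy problem: 3 classes, 2 input "pixels".
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([0.6, 0.5])
y = np.array([1.0, 0.0, 0.0])  # one-hot label for class 0

x_adv = fgsm(W, b, x, y)
print("max change:", np.abs(x_adv - x).max())
print("loss before/after:", cross_entropy(W, b, x, y), cross_entropy(W, b, x_adv, y))
```

Every pixel moves by exactly $\varepsilon$ (or stays put where the gradient is 0), so the perturbation is bounded in the $L_\infty$ sense by construction.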
JSMA stands for Jacobian-based Saliency Map Attack. It was proposed by Papernot et al. in the 2016 paper "The Limitations of Deep Learning in Adversarial Settings". The main idea is to use a heat map, the saliency map in the method's name, to guide the generation of adversarial examples. Its core formula is:
$$S(x,y)[i]=\begin{cases} 0, & \text{if } \frac{\partial F(x)_y}{\partial x_i} < 0 \text{ or } \sum_{j\neq y} \frac{\partial F(x)_j}{\partial x_i} > 0 \\ \left(\frac{\partial F(x)_y}{\partial x_i}\right)\left|\sum_{j\neq y} \frac{\partial F(x)_j}{\partial x_i}\right|, & \text{otherwise} \end{cases}$$
This is how the saliency map is generated. $F(x)$ is the output of the model, and $F(x)_j$ is its $j$-th component, which usually is the confidence of class $j$. $x_i$ is the value of the $i$-th pixel of the input. In the original paper, the class $y$ in this formula is the class the attacker wants the model to output.
The meaning of the saliency map is as follows. If increasing a pixel would lower the confidence of class $y$ (the first partial derivative is less than 0) or raise the total confidence of the other classes (the sum of partial derivatives is greater than 0), the pixel is left alone and its saliency is 0. Otherwise, the product of the two quantities is used as the value and represents how strongly the pixel influences the result.
In JSMA, the saliency map is computed first, then the pixel with the largest saliency value is selected and modified. This is repeated until the attack succeeds or the number of modified pixels reaches a threshold.
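The saliency map can be computed directly from the Jacobian of the model output with respect to the input. The sketch below assumes the Jacobian is already available as an array `jacobian[j, i]` holding the partial derivative of $F(x)_j$ with respect to $x_i$; the numbers are made up for illustration.

```python
import numpy as np

def saliency_map(jacobian, y):
    """Saliency of each pixel for pushing the model toward class y.

    jacobian : array of shape (num_classes, num_pixels),
               jacobian[j, i] = dF(x)_j / dx_i
    y        : index of the class whose confidence should grow
    """
    d_y = jacobian[y]                       # effect of each pixel on class y
    d_others = jacobian.sum(axis=0) - d_y   # summed effect on all other classes
    s = d_y * np.abs(d_others)
    # Zero out pixels that would hurt class y or help the other classes.
    s[(d_y < 0) | (d_others > 0)] = 0.0
    return s

# Hypothetical Jacobian for 3 classes and 3 pixels.
jacobian = np.array([[ 0.5, -0.2,  0.3],
                     [-0.4,  0.1,  0.2],
                     [-0.3,  0.3, -0.1]])
s = saliency_map(jacobian, y=0)
print(s, "-> modify pixel", int(np.argmax(s)))
```

Only the first pixel survives here: the second lowers the confidence of class 0, and the third raises the other classes more than it lowers them.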
C&W is named after the initials of its two authors: it was proposed by Carlini and Wagner in the 2017 paper "Towards Evaluating the Robustness of Neural Networks".
Earlier attacks can be expressed in the form:
$$\begin{aligned} \text{minimize}\quad & D(x, x+\delta) \\ \text{s.t.}\quad & C(x+\delta) = t \\ & x + \delta \in [0,1]^n \end{aligned}$$
where $D(x,x+\delta)$ is the distance between $x$ and $x+\delta$.
The meaning of this formula is: find the smallest noise $\delta$ that makes the classification result equal to the target class $t$.
However, the constraint $C(x+\delta) = t$ is highly non-linear, which can cause problems in optimization.
So it is replaced by a function $f$ chosen such that $C(x+\delta)=t$ holds exactly when $f(x+\delta)\leq 0$:
$$\begin{aligned} \text{minimize}\quad & D(x, x+\delta) \\ \text{s.t.}\quad & f(x+\delta) \leq 0 \\ & x + \delta \in [0,1]^n \end{aligned}$$
This is further rewritten as
$$\begin{aligned} \text{minimize}\quad & D(x, x+\delta) + c\cdot f(x+\delta) \\ \text{s.t.}\quad & x + \delta \in [0,1]^n \end{aligned}$$
where $c>0$.
At the same time, a change of variables is applied to $x+\delta$: let $x+\delta=\frac{1}{2}(\tanh(\omega)+1)$.
Because $-1\leq \tanh(\omega)\leq 1$, the constraint $0\leq x+\delta\leq 1$ is then satisfied automatically.
Finally, the core formula of the attack is:
$$f(x')=\max(\max\{Z(x')_i : i\neq t\}-Z(x')_t,\ -k)$$
where $Z(x')$ is the vector before the softmax layer (the logits). $f$ is used as the objective function for optimization, and the overall optimization goal is
$$\text{minimize}\quad \left\|\tfrac{1}{2}(\tanh(\omega)+1)-x\right\|+c\cdot f\left(\tfrac{1}{2}(\tanh(\omega)+1)\right)$$
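The pieces of the C&W objective can be sketched as follows. This only evaluates the objective for a given $\omega$ and leaves the optimizer out. It assumes a squared $L_2$ distance, and the toy logits function and all numeric values are made up for illustration.

```python
import numpy as np

def to_image(w):
    """Change of variables: any real w maps into the valid range [0, 1]."""
    return 0.5 * (np.tanh(w) + 1.0)

def cw_f(logits, target, k=0.0):
    """f(x') = max(max{Z_i : i != t} - Z_t, -k), computed on the logits Z."""
    best_other = np.max(np.delete(logits, target))
    return max(best_other - logits[target], -k)

def cw_objective(w, x, logits_fn, target, c=1.0, k=0.0):
    """Distance term plus c * f, evaluated at x' = to_image(w)."""
    x_adv = to_image(w)
    dist = np.sum((x_adv - x) ** 2)  # squared L2 distance (an assumption)
    return dist + c * cw_f(logits_fn(x_adv), target, k)

# Hypothetical toy model: logits are a linear function of the input.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
logits_fn = lambda v: W @ v

x = np.array([0.2, 0.6])
w0 = np.arctanh(2.0 * x - 1.0)  # start at the clean input: to_image(w0) == x
print(cw_objective(w0, x, logits_fn, target=2))
```

Because `to_image` can never leave [0, 1], an unconstrained optimizer such as gradient descent or Adam can then be run on `w` directly.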
PGD stands for Projected Gradient Descent. It was proposed by Madry et al. in the ICLR 2018 paper "Towards Deep Learning Models Resistant to Adversarial Attacks". The idea is similar to running FGSM for multiple iterations, in the following form:
$$x^{(t+1)}=\Pi_{x+\mathcal{S}}\left(x^{(t)}+\alpha\cdot \mathrm{sign}(\nabla_x J(x^{(t)},y))\right)$$
The key is the projection operation $\Pi_{x+\mathcal{S}}(\cdot)$, which maps the updated value back into the allowed neighborhood $x+\mathcal{S}$ of the clean sample $x$.
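A minimal sketch of the iteration, assuming the allowed set $\mathcal{S}$ is an $L_\infty$ ball of radius `eps`, so that the projection reduces to element-wise clipping. The toy linear softmax classifier and all numeric values are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(W, b, x, y_onehot):
    return -float(np.sum(y_onehot * np.log(softmax(W @ x + b))))

def grad_wrt_input(W, b, x, y_onehot):
    """Input gradient of the cross-entropy for logits W @ x + b."""
    return W.T @ (softmax(W @ x + b) - y_onehot)

def project(x_adv, x, eps):
    """Projection onto the L-inf ball around x, intersected with [0, 1]."""
    return np.clip(np.clip(x_adv, x - eps, x + eps), 0.0, 1.0)

def pgd(W, b, x, y_onehot, eps=0.1, alpha=0.025, steps=10):
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_wrt_input(W, b, x_adv, y_onehot)  # gradient at the current iterate
        x_adv = project(x_adv + alpha * np.sign(g), x, eps)
    return x_adv

# Hypothetical toy problem: 3 classes, 2 input "pixels", true class 0.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([0.6, 0.5])
y = np.array([1.0, 0.0, 0.0])

x_adv = pgd(W, b, x, y)
print(x_adv, cross_entropy(W, b, x, y), cross_entropy(W, b, x_adv, y))
```

The step size `alpha` is smaller than `eps`, so several steps are needed to reach the boundary of the ball, and the projection keeps later steps from leaving it.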
BIM stands for Basic Iterative Method. It was proposed by Kurakin et al. in the 2017 paper "Adversarial examples in the physical world". The basic method simply repeats the FGSM step with a small step size. The variant described here, presented in the same paper, first finds the class with the lowest classification confidence, then follows the gradient toward that class to obtain the adversarial example.
It defines what the paper calls the least-likely class, used by the iterative least-likely class method:
$$y_{LL}=\arg\min_y\{p(y|x)\}$$
Its core formula is similar to an iterative form of FGSM:
$$x_{n+1}^{adv}=\mathrm{clip}_{x,\varepsilon}\left(x_n^{adv}-\alpha\cdot \mathrm{sign}(\nabla_x J(x_n^{adv},y_{LL}))\right)$$
The sign in front of $\alpha$ is negative because the loss with respect to $y_{LL}$ is being decreased, which pushes the prediction toward that class.
The function $\mathrm{clip}_{x,\varepsilon}(\cdot)$ truncates the result so that the overall noise does not exceed the threshold $\varepsilon$.
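A sketch of the least-likely-class iteration on a toy linear softmax classifier. Each step decreases the loss with respect to $y_{LL}$, which moves the sample toward that class, and the clip keeps the total noise within `eps`. The model, the input, and the step sizes are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def grad_wrt_input(W, b, x, y_onehot):
    """Input gradient of the cross-entropy for logits W @ x + b."""
    return W.T @ (softmax(W @ x + b) - y_onehot)

def clip_eps(x_adv, x, eps):
    """Keep x_adv within eps of x (element-wise) and inside [0, 1]."""
    return np.clip(np.clip(x_adv, x - eps, x + eps), 0.0, 1.0)

def iterative_least_likely(W, b, x, eps=0.1, alpha=0.02, steps=10):
    y_ll = int(np.argmin(softmax(W @ x + b)))  # least-likely class on the clean input
    y_onehot = np.eye(W.shape[0])[y_ll]
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_wrt_input(W, b, x_adv, y_onehot)
        # Descend the loss for y_ll, i.e. move toward the least-likely class.
        x_adv = clip_eps(x_adv - alpha * np.sign(g), x, eps)
    return x_adv, y_ll

# Hypothetical toy problem: 3 classes, 2 input "pixels".
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([0.6, 0.5])

x_adv, y_ll = iterative_least_likely(W, b, x)
p_before = softmax(W @ x + b)[y_ll]
p_after = softmax(W @ x_adv + b)[y_ll]
print(y_ll, p_before, p_after)
```

With such a small `eps` the prediction does not flip in this toy case, but the confidence of the least-likely class rises, which is the direction the update is meant to push.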
MIM stands for Momentum Iterative Method. It was proposed by Dong et al. in the 2018 paper "Boosting Adversarial Attacks with Momentum". It adds iteration and a momentum term on top of FGSM, in the following form:
$$\begin{aligned} g_{t+1} &= \mu\cdot g_{t} + \frac{\nabla_x J(x_t,y)}{\|\nabla_x J(x_t, y)\|_1} \\ x_{t+1} &= x_t+\alpha\cdot \mathrm{sign}(g_{t+1}) \end{aligned}$$
Here $g_t$ is the accumulated gradient, $\mu$ is the decay factor of the momentum term, and each new gradient is normalized by its $L_1$ norm before being added.
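The momentum update can be sketched as below, again on a toy linear softmax classifier. The step size is set to `eps / steps` so that the total perturbation stays within `eps` without an explicit projection; that choice, the small constant guarding against division by zero, and all numeric values are assumptions made for the example.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(W, b, x, y_onehot):
    return -float(np.sum(y_onehot * np.log(softmax(W @ x + b))))

def grad_wrt_input(W, b, x, y_onehot):
    """Input gradient of the cross-entropy for logits W @ x + b."""
    return W.T @ (softmax(W @ x + b) - y_onehot)

def mim(W, b, x, y_onehot, eps=0.1, steps=10, mu=1.0):
    alpha = eps / steps            # total movement per pixel is at most eps
    g = np.zeros_like(x)           # accumulated (momentum) gradient
    x_adv = x.copy()
    for _ in range(steps):
        grad = grad_wrt_input(W, b, x_adv, y_onehot)
        # Normalize by the L1 norm, then accumulate with decay factor mu.
        g = mu * g + grad / (np.sum(np.abs(grad)) + 1e-12)
        x_adv = np.clip(x_adv + alpha * np.sign(g), 0.0, 1.0)
    return x_adv

# Hypothetical toy problem: 3 classes, 2 input "pixels", true class 0.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([0.6, 0.5])
y = np.array([1.0, 0.0, 0.0])

x_adv = mim(W, b, x, y)
print(x_adv, cross_entropy(W, b, x, y), cross_entropy(W, b, x_adv, y))
```

The update direction comes from `sign(g)` rather than from the sign of the current gradient, so a single noisy gradient cannot flip the direction once momentum has built up.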
EAD stands for Elastic-Net Attacks to DNNs. It was proposed by Chen et al. in the 2018 paper "EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples".
It is also iterative, and somewhat resembles a combination of MIM and C&W.
Its form is as follows:
$$\begin{aligned} x^{(k+1)} &= S_{\beta}\left(y^{(k)}-\alpha_k\nabla g(y^{(k)})\right) \\ y^{(k+1)} &= x^{(k+1)}+\frac{k}{k+3}\left(x^{(k+1)}-x^{(k)}\right) \end{aligned}$$
where $g(x) = c\cdot f(x) + \|x-x_0\|_2^2$
and $f(x)=\max(\max\{Z(x)_i : i\neq t\}-Z(x)_t,\ -\kappa)$, with $Z(x)$ the vector before the softmax layer. The margin is written $\kappa$ here to avoid a clash with the iteration index $k$.
Also,
$$[S_\beta(z)]_i=\begin{cases} \min\{z_i-\beta,\ 1\}, & \text{if } z_i-x_{0i} > \beta \\ x_{0i}, & \text{if } |z_i-x_{0i}| \leq \beta \\ \max\{z_i+\beta,\ 0\}, & \text{if } z_i-x_{0i} < -\beta \end{cases}$$
The function $S_\beta(\cdot)$ essentially compares the constructed adversarial example with the clean input $x_0$ and shrinks it: differences no larger than $\beta$ are reset to the clean value, larger ones are reduced by $\beta$, and the result is kept between 0 and 1.
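The shrinkage operator and one iteration of the update can be sketched as follows. The gradient of $g$ is passed in as a function. For a check that can be done by hand, the example uses only the squared $L_2$ part of $g$ (that is, $c = 0$), which is a simplification made for illustration, as are the names and the numeric values.

```python
import numpy as np

def shrink(z, x0, beta):
    """Element-wise S_beta: shrink z toward the clean input x0."""
    diff = z - x0
    upper = np.minimum(z - beta, 1.0)   # case z_i - x0_i >  beta
    lower = np.maximum(z + beta, 0.0)   # case z_i - x0_i < -beta
    # Anything within beta of x0 is reset to the clean value.
    return np.where(diff > beta, upper, np.where(diff < -beta, lower, x0))

def ead_step(x_k, y_k, k, x0, grad_g, alpha_k, beta):
    """One iteration: gradient step on g, shrinkage, then the momentum-like y update."""
    x_next = shrink(y_k - alpha_k * grad_g(y_k), x0, beta)
    y_next = x_next + k / (k + 3) * (x_next - x_k)
    return x_next, y_next

x0 = np.array([0.5, 0.5])
print(shrink(np.array([0.9, 0.52]), x0, beta=0.1))

# Simplified g(x) = ||x - x0||_2^2 (c = 0), so grad g(x) = 2 * (x - x0).
grad_g = lambda v: 2.0 * (v - x0)
x_k = y_k = np.array([0.9, 0.52])
x_next, y_next = ead_step(x_k, y_k, k=0, x0=x0, grad_g=grad_g, alpha_k=0.25, beta=0.1)
print(x_next, y_next)
```

The reset-to-clean case is what makes the resulting perturbation sparse: small changes are removed entirely instead of merely being made smaller.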
Performance comparison of attacks under existing defenses
Take the ICML 2019 paper from Jun Zhu's group, Improving Adversarial Robustness via Promoting Ensemble Diversity, as an example. It is a defense that achieves adversarial robustness by promoting diversity among the models in an ensemble.
The values in its results table are classification accuracies. On the MNIST dataset, PGD appears to be the most effective attack, with BIM second. On the CIFAR-10 dataset, JSMA, BIM, and PGD all perform quite well.
For the simple dataset MNIST, the defense works quite well when the noise is bounded.
For the more complex dataset CIFAR-10, its performance is still unsatisfactory.