Research on threat analysis and defense methods of deep learning data theft attack in data sandbox mode
2022-07-28 06:16:00 【zzuls】
Catalog
One 、 Research background
Two 、 The problems studied in this paper
Three 、 Innovation of the paper
Four 、 Mind mapping
Five 、 The main technical points
(1) Data sandbox
(2) Data enhancement operations
(3) Optimization function
(4) Regular term function
(5) Threat model
(6) Attack during model training
(7) Attacks in the data processing phase
(8) Characteristics analysis of deep learning data theft attacks (detection method)
(9) Parameter analysis and detection of attacks in the model training stage
(10) Model pruning defense against attacks in the data processing stage
Six 、 Summary and Prospect
One 、 Research background
In the age of artificial intelligence, open data sharing has become a trend, but data security problems severely restrict the value of big data. The data sandbox mode, also called a data trust, is an effective way to resolve the contradiction between privacy protection and data mining. A data sandbox is divided into a debugging environment and a running environment: the data owner hosts the original data in the running environment, which automatically generates sample data that contains no private information.
Two 、 The problems studied in this paper
The paper analyzes the data sandbox mode in detail, builds a threat model for deep learning data theft attacks, and quantitatively evaluates the damage and the identifying characteristics of attacks in the data processing stage and the model training stage. Against attacks in the data processing stage, it proposes a data leakage prevention method based on model pruning, which reduces data leakage while preserving the availability of the original model; against attacks in the model training stage, it proposes an attack detection method based on model parameter analysis, which intercepts malicious models and prevents data leakage.
Three 、 Innovation of the paper
The paper proposes a data leakage prevention method based on model pruning and an attack detection method based on model parameter analysis.
Four 、 Mind mapping
(The mind map is an image in the original post and is not reproduced here.)
Five 、 The main technical points
(1) Data sandbox
A data sandbox is divided into a debugging environment and a running environment. The data owner hosts the original data in the running environment, which automatically generates sample data that contains no private information. The data analyst writes AI model training code against the sample data in the debugging environment and sends it to the running environment, where the code runs on the full original data and finally yields a highly available AI model, which is returned to the data analyst. In this process the data analyst never directly touches the original data, yet full training of the AI model on the full data is still achieved.
(2) Data enhancement operations
Sometimes, to expand the dataset and improve the training effect, the original data is processed: original samples are fed into a data enhancement function, which outputs new samples. This operation is data enhancement (augmentation); for image data, typical examples are cropping and rotation.
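As a minimal sketch, a data enhancement function can be as simple as the following (hypothetical NumPy code, not from the paper):

```python
# Minimal data-augmentation sketch for image arrays, using only NumPy.
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return a new sample derived from `image` (H x W x C) by a random
    crop followed by a random 90-degree rotation."""
    h, w, _ = image.shape
    ch, cw = int(h * 0.9), int(w * 0.9)          # crop to 90% of each side
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    cropped = image[top:top + ch, left:left + cw]
    k = int(rng.integers(0, 4))                  # 0-3 quarter turns
    return np.rot90(cropped, k)

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
new_sample = augment(original, rng)              # one augmented sample
```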
(3) Optimization function
The training objective is the usual empirical loss plus a regular term (the paper's formula is given as an image; the standard form is):

min over θ of (1/n) Σᵢ L(f_θ(xᵢ), yᵢ) + λ·Φ(θ)

where f_θ is the model with parameters θ, L is the per-sample loss on training pairs (xᵢ, yᵢ), and Φ(θ) is the regular term weighted by λ.
(4) Regular term function
Φ(θ) is typically used to prevent the AI model from overfitting.
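To fix ideas, here is a generic sketch of where the regular term enters one training step (hypothetical PyTorch code; the attacks below simply swap in a malicious Φ):

```python
import torch

def training_step(model, x, y, optimizer, loss_fn, phi, lam=1e-4):
    """One training step with the regular term phi(theta) added to the task loss."""
    optimizer.zero_grad()
    theta = torch.cat([p.reshape(-1) for p in model.parameters()])
    loss = loss_fn(model(x), y) + lam * phi(theta)   # task loss + λ·Φ(θ)
    loss.backward()
    optimizer.step()
    return loss.item()
```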
(5) Threat model
In the data sandbox mode, a normal data analyst can only write code against the sample data and cannot touch the full data, so data cannot be copied directly and key information cannot be stolen outright. However, in the data processing stage and the model training stage of the running environment, the analyst's code acts directly on the full data, so the possibility of data leakage remains.
There are two threat models for data theft attacks:
The first is to add a malicious regular term to the model training code, i.e., a carefully constructed regular-term function. Training then produces a malicious AI model whose parameters carry the original data and therefore expose the original training data.

The second is to generate malicious data inside the data processing code. The attacker modifies the data enhancement function to construct carefully crafted samples that make the model's label values carry original-data information. The attacker later constructs the same malicious inputs, obtains the outputs, and recovers the original data information contained in them.

(6) Attack during model training:
Attack based on the regularization function
The maliciously constructed regularization term has a (Pearson) correlation form:

Φ(θ) = −λ_c · |ρ(θ, s)|, with ρ(θ, s) = Σᵢ (θᵢ − θ̄)(sᵢ − s̄) / √(Σᵢ (θᵢ − θ̄)² · Σᵢ (sᵢ − s̄)²)

Here θ is the model parameter vector with mean θ̄, and s is a string of real numbers extracted from the feature values of the training data, with mean s̄; the number of elements in s equals the number of parameters in θ. (The paper's formula is shown as an image; this is the standard correlation form matching its description.)
Because Φ rewards agreement between θ and s, the loss function decreases when θ and s are highly correlated and increases when they are not. Exploiting this property, the attacker recovers the original data information from the values of the trained parameters.
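A minimal sketch of such a correlation-based malicious regular term (hypothetical PyTorch code, assuming `secret` holds the real-valued string s):

```python
import torch

def correlation_penalty(theta: torch.Tensor, secret: torch.Tensor,
                        lam: float = 1.0) -> torch.Tensor:
    """Malicious regular term: the negated absolute Pearson correlation
    between the flattened parameters and the secret vector, so that
    minimizing the loss drives the parameters to correlate with the secret."""
    p = theta.reshape(-1) - theta.mean()
    s = secret.reshape(-1) - secret.mean()
    rho = (p * s).sum() / (p.norm() * s.norm() + 1e-12)
    return -lam * rho.abs()
```

Minimizing the task loss plus this term pushes the parameter vector toward (anti-)correlation with s, which is what later lets the attacker regress the secret values from the published parameters.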
The second attack on the model training stage uses the signs of the parameters to encode the feature values of the training data. For example, if the training data are pictures (containing pixels in [0, 255]), this attack encodes each pixel as an 8-bit string over {−1, +1}; the total length of string that can be encoded equals the number of elements k of the parameter vector θ. Under this encoding, the attacker modifies the regular term of the training process to a sign penalty of the form (reconstructed; the paper's formula is shown as an image):

Φ(θ) = (λ_s / k) · Σᵢ max(0, −θᵢ·sᵢ)

Whenever θᵢ and sᵢ have different signs, the term is positive and the loss function increases, which drives the signs of θ and s in the same direction. After training, the attacker extracts the signs of the parameters, decodes one pixel value from every 8 elements, and restores the picture.
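A minimal sketch of the sign-encoding attack (hypothetical PyTorch code; the toy loop below trains a free-standing parameter vector with only the malicious penalty, whereas a real attack adds it to the normal task loss):

```python
import torch

WEIGHTS = 2 ** torch.arange(7, -1, -1)            # bit weights, MSB first

def sign_penalty(theta: torch.Tensor, secret: torch.Tensor,
                 lam: float = 1.0) -> torch.Tensor:
    """Malicious regular term: positive whenever a parameter's sign
    disagrees with the corresponding secret bit in {-1, +1}, so training
    drives the parameter signs toward the secret."""
    return lam * torch.clamp(-theta * secret, min=0).sum()

def decode_pixels(theta: torch.Tensor) -> torch.Tensor:
    """Attacker side: read the parameter signs as bits, 8 per pixel."""
    bits = (theta.detach() > 0).long().reshape(-1, 8)
    return (bits * WEIGHTS).sum(dim=1)            # values in [0, 255]

# Toy demonstration: 4 pixels encoded into the signs of 32 parameters.
pixels = torch.randint(0, 256, (4,))
bits = (pixels.unsqueeze(1) // WEIGHTS) % 2       # 4 x 8 bit matrix
secret = (bits * 2 - 1).reshape(-1).float()       # bits as {-1, +1}
theta = torch.randn(secret.numel(), requires_grad=True)
opt = torch.optim.SGD([theta], lr=0.1)
for _ in range(300):
    opt.zero_grad()
    sign_penalty(theta, secret).backward()
    opt.step()
assert torch.equal(decode_pixels(theta), pixels)  # pixels recovered from signs
```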
(7) Attacks in the data processing phase:
The attacker extracts the pixel values from the picture data, compresses them, splits the pixel values, and then generates malicious data from the pieces.

The specific algorithm (given in the paper as pseudocode) defines the number of the encoded image as u and the encoded pixels as Pij; the height of a single picture is H, the width is W, the number of channels is C, and the number of pixels in a single channel is N = H × W. The malicious images x1 and x2 are initialized as C × N all-zero matrices.
After the malicious data DM is generated, it is fused with the original training data to train the AI model fθ. The attacker then obtains fθ from the data sandbox, runs the same malicious-data-generation algorithm locally to regenerate DM, feeds it into fθ, obtains the label values that encode the original training data, and finally recovers the original training data.
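Since the paper gives the exact construction of x1 and x2 only as pseudocode, the following is a heavily simplified sketch of the underlying label-encoding idea; the helper names and the assumption of a 256-class task are mine, not the paper's:

```python
# Heavily simplified sketch: malicious inputs whose assigned labels
# encode secret pixel values, which a well-fitted model memorizes.
import numpy as np

NUM_CLASSES = 256   # assume enough classes to encode one pixel per label

def malicious_samples(seed: int, count: int, shape=(3, 32, 32)) -> np.ndarray:
    """Deterministic pseudo-random inputs: the attacker can regenerate
    exactly the same samples later from the same seed."""
    rng = np.random.default_rng(seed)
    return rng.random((count, *shape), dtype=np.float32)

def poison_dataset(train_x, train_y, secret_pixels, seed=1234):
    """Pair each malicious input with a label equal to one secret pixel
    value, then fuse the pairs into the training set; a model that fits
    the training data memorizes the input-to-pixel mapping."""
    secret_pixels = np.asarray(secret_pixels)
    assert secret_pixels.max() < NUM_CLASSES
    xm = malicious_samples(seed, len(secret_pixels))
    return np.concatenate([train_x, xm]), np.concatenate([train_y, secret_pixels])

def extract(model_predict, num_pixels, seed=1234):
    """Attacker side: regenerate the same inputs and read the pixels
    back out of the model's predicted labels."""
    return model_predict(malicious_samples(seed, num_pixels))
```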
(8) Characteristics analysis of deep learning data theft attacks (detection method)
The paper analyzes in detail the model parameters trained under data theft attacks. In normal training, the regular term of a deep learning loss function is usually the L1 norm or the L2 norm.

Under the constraint of an L1 or L2 norm, the distribution of the model parameters is usually approximately normal. A malicious training process introduces regular terms related to the data or to a data encoding, so the distribution of the model parameters may change.

It can be seen that this analysis has no obvious effect on attacks in the data processing stage, but a clear effect on attacks in the model training stage.
(9) Parameter analysis and detection of attacks in the model training stage
Given the difference in parameter distribution between maliciously trained and normally trained models, the paper proposes extracting key statistical features of the parameters to automatically distinguish malicious models from normal ones. For any input model, it first statistically analyzes the distribution of the parameters of one convolution layer, mainly the minimum, maximum, count, mean, and variance. To evaluate whether the model parameters follow a normal distribution, it introduces two statistical concepts, skewness S and kurtosis K, which in their standard form are

S = E[((X − μ)/σ)³], K = E[((X − μ)/σ)⁴]

for a parameter sample X with mean μ and standard deviation σ. For a normal distribution S = 0 and K = 3, so large deviations from these values flag a parameter distribution that is unlikely to come from normal training. (The paper's formulas are given as images; these are the standard definitions.)
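A minimal sketch of the resulting detection rule (hypothetical Python code using SciPy; the thresholds are illustrative, not the paper's calibrated values):

```python
# Flag a parameter tensor whose distribution deviates too far from normality.
import numpy as np
from scipy.stats import skew, kurtosis

def looks_malicious(params: np.ndarray,
                    skew_tol: float = 0.5,
                    kurt_tol: float = 1.0) -> bool:
    """Return True if skewness or excess kurtosis of the flattened
    parameters is far from the normal-distribution values (0 and 0;
    scipy's `kurtosis` subtracts 3 by default)."""
    w = params.ravel()
    return abs(skew(w)) > skew_tol or abs(kurtosis(w)) > kurt_tol

normal_w = np.random.default_rng(0).normal(size=10_000)
encoded_w = np.sign(np.random.default_rng(1).normal(size=10_000)) * 0.1
print(looks_malicious(normal_w))   # False: near-normal distribution
print(looks_malicious(encoded_w))  # True: sign-encoded weights pile up at +/-0.1
```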
(10) Model pruning defense against attacks in the data processing stage
Pruning the neurons that carry malicious data while retaining the other neurons can defend against data theft attacks without losing model accuracy. The paper feeds normal data through the model and measures the activation values of the hidden-layer neurons, then prunes the neurons whose activation values are small during prediction on normal test data: such pruning has little impact on the deep learning model's original task, and those neurons are the most likely to carry training data. (The paper's step-by-step procedure is given as a figure, omitted here.)
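A minimal sketch of activation-based pruning in the spirit of this defense (hypothetical PyTorch code, not the paper's algorithm; the pruning fraction is illustrative):

```python
import torch
import torch.nn as nn

def prune_low_activation(model, linear: nn.Linear, act: nn.ReLU,
                         normal_data: torch.Tensor, frac: float = 0.2):
    """Measure each hidden neuron's mean activation on normal data, then
    zero out the rows of `linear` (i.e., the neurons) with the smallest
    activations: they matter least for the normal task and are the most
    likely carriers of memorized training data."""
    acts = []
    hook = act.register_forward_hook(lambda m, i, o: acts.append(o.detach()))
    with torch.no_grad():
        model(normal_data)
    hook.remove()
    mean_act = torch.cat(acts).mean(dim=0)        # one score per neuron
    k = int(frac * mean_act.numel())
    to_prune = mean_act.argsort()[:k]             # least-activated neurons
    with torch.no_grad():
        linear.weight[to_prune] = 0.0
        linear.bias[to_prune] = 0.0

hidden, relu = nn.Linear(32, 64), nn.ReLU()
model = nn.Sequential(hidden, relu, nn.Linear(64, 10))
prune_low_activation(model, hidden, relu, torch.randn(256, 32))
```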
Six 、 Summary and Prospect
This article introduces the data leakage problem under the sandbox mode, data theft attacks against the sandbox mode, and defenses against those attacks. There are two kinds of attack. The first works in the model training stage by constructing malicious training models, chiefly by building malicious regularization functions or malicious encoding schemes. The second works in the data processing stage, mainly by modifying the data enhancement function to construct malicious data that carries information about the training set. As for defenses, the first analyzes the characteristics of data theft, checking whether the parameter distribution is normal, and designs an automatic detection function over the parameters; the second is a defense method based on model pruning, usable against the data-processing-stage attack.
The article also has some shortcomings. The attack detection based on model parameter analysis applies machine learning to the model parameters, so detection is slow and the resource overhead is high; further performance improvements are needed. The data leakage prevention based on model pruning defends complex models on complex tasks insufficiently, and new algorithms or mechanisms are needed to improve the defense effect.
My personal feeling is that the application scenario of this article is images: it mainly targets image datasets, perhaps because image information is easier to restore by inversion. The scenario could be changed, for example to text or other input data, by designing regularization functions and data enhancement functions for those scenarios to mount attacks, or by proposing defenses for them.