OCR-GAN [anomaly detection: reconstruction-based]
2022-07-28 22:43:00 【It's too simple】
Preface
This post covers a 2022.6 CVPR paper. According to the Papers with Code website, the paper's code ranks 6th on the MVTec dataset (as of the time of posting).
Background
Idea: a reconstruction-based approach. The information lost and gained during reconstruction is extracted by subtracting images of different frequencies. A U-Net then extracts further feature information, and the attention mechanism in the middle handles information interaction between the images and adaptive channel selection. Finally, an ordinary convolutional network is used to build the loss functions that update the network and to set the evaluation score. In short, the model turns the image into feature maps of different dimensions through reconstruction, and uses the U-Net and the attention mechanism to fully mix the information in those feature maps.
The "frequency domain" and "frequency bands" mentioned in the paper are simply the same picture seen from different angles after processing by the FD module. The deeper the processing inside the module, that is, the further it moves from high frequency to low frequency, the more texture information is lost. The lost texture information is preserved by image subtraction, which produces the pictures of different frequencies mentioned in the paper. The higher the frequency, the more texture information the image contains; the lower the frequency, the more spatial semantic information it contains. The FD module is introduced below.
Model principle ( Source code )

The source code only implements the case where FD splits the picture into 2 pictures. The figure above shows the extended idea, where FD produces n pictures, so take care to tell the two apart. (PS: when reading the code I could only find the 2 pictures produced by FD.)
Let's walk through it. The headings below correspond to the three blocks of the model (the three groups of capital letters in the figure), and the source-code path for each block is listed as well.
FD
train_all.py-->def train()-->data-->train_ds-->def FD(img)
Resize the image fed to the model to (256, 256). Apply a Gaussian filter and discard the even rows and even columns to downsample. Then insert a zero-valued row and a zero-valued column next to each pixel and apply a Gaussian filter again, which brings the image back to (256, 256) and completes the upsampling. The result is a blurred, low-frequency image. Subtracting it from the resized image gives the high-frequency image, which holds the texture lost in the round trip. The low-frequency and high-frequency images are then fed together into the next module.
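The downsample, upsample, and subtract steps described here can be sketched in plain Python. This is only an illustrative sketch, not the repository's FD function: the 5-tap binomial kernel, the mirrored border handling, and every function name below are assumptions chosen to keep the example dependency-free.

```python
# Dependency-free sketch of a two-band frequency decomposition.
# Assumptions (not from the source): the 5-tap binomial kernel, the
# mirrored border rule, and all function names are illustrative.

KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def _reflect(i, n):
    """Map an out-of-range index back into [0, n) by mirroring at the border."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def _blur_rows(img, gain=1.0):
    """Blur every row with KERNEL; `gain` compensates for inserted zeros."""
    n = len(img[0])
    out = []
    for row in img:
        out.append([
            gain * sum(k * row[_reflect(x + d, n)]
                       for d, k in zip(range(-2, 3), KERNEL))
            for x in range(n)
        ])
    return out


def _transpose(img):
    return [list(col) for col in zip(*img)]


def _blur(img, gain=1.0):
    """Separable 2-D blur: rows first, then columns."""
    return _transpose(_blur_rows(_transpose(_blur_rows(img, gain)), gain))


def frequency_decompose(img):
    """Split `img` (2-D list, even height and width >= 4) into (high, low)."""
    h, w = len(img), len(img[0])
    # Downsample: blur, then keep every second row and column.
    small = [row[::2] for row in _blur(img)[::2]]
    # Upsample: place each pixel on a zero grid, then blur with gain 2 per axis.
    grid = [[0.0] * w for _ in range(h)]
    for y, row in enumerate(small):
        for x, value in enumerate(row):
            grid[2 * y][2 * x] = value
    low = _blur(grid, gain=2.0)
    # High band: what the blur/downsample round trip removed.
    high = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(img, low)]
    return high, low
```

By construction the two bands add back up to the input, and a perfectly flat image has an all-zero high band, which matches the idea that the high-frequency picture carries texture only.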
CS
train_all.py-->def train()-->model-->ocr_gan_aug-->self.netg-->UnetGenerator_CS-->unet_block
Construction of the U-Net (too amazing!)
Because the network only imitates the U-Net structure, not every convolution block follows the original U-Net style.

Main program (UnetGenerator_CS): see the U-Net structure diagram above. The first row is U-Net block 1, and there are five U-Net blocks in total (1~5). The program builds the U-Net blocks from the bottom of the figure upward and finally wraps them in self.model.
Subroutine (unet_block): see forward. Each unet_block receives the previously built block through its submodule parameter, so at run time the blocks are executed in the reverse of the order in which the main program built them, which restores the order shown in the structure diagram. This happens in two steps, because every time execution reaches the submodule it jumps into the next U-Net block, so each block first runs only part of its code; once the last block is reached, execution jumps back block by block until it returns to the first one. Step 1: on the way down, the blocks apply two convolutions each and generate five feature blocks. Step 2: the fourth jump reaches block 5, the lowest block in the diagram. After one convolution its feature block returns to block 4 and is fused with block 4's feature block; after one more convolution the result is returned to block 3, and so on up to block 1, which yields the final feature map.
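The build order and run order described above can be illustrated with a toy recursion. No convolutions are performed here; the class name, the trace list, and the use of addition as a stand-in for skip-connection fusion are all assumptions made for illustration, not the repository's code.

```python
# Toy model of the nested construction; no real convolutions are run.
# Class and attribute names are illustrative, not the repository's.

class ToyUnetBlock:
    def __init__(self, name, submodule=None):
        self.name = name
        self.submodule = submodule  # next-deeper block, None for the innermost

    def forward(self, x, trace):
        trace.append(f"down {self.name}")  # first half of this block
        if self.submodule is None:
            return x  # innermost block: nothing deeper to call
        inner = self.submodule.forward(x, trace)  # jump into the next block
        trace.append(f"up {self.name}")  # second half, after the jump returns
        return x + inner  # stand-in for fusing the skip connection


def build_toy_unet(depth=5):
    """Build from the innermost block outward, as the main program does."""
    block = None
    for level in range(depth, 0, -1):
        block = ToyUnetBlock(f"block{level}", submodule=block)
    return block  # the outermost block (block1)


trace = []
result = build_toy_unet().forward(1, trace)
```

After the call, trace lists the five "down" steps for blocks 1 to 5 first and then the "up" steps for blocks 4 back to 1, which is the two-step behavior described in the text.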
CS block
After global average pooling over the whole picture, each feature layer (channel) of the image becomes a single value. One fully connected layer then reduces the number of channels and another restores it. A softmax() function turns the result into probabilities, which are multiplied with the features to complete the module.
The CS block is responsible for the interaction between the frequency maps (in this part, the pixel values of one feature layer are multiplied by the probability values of another feature layer). It also adds an attention mechanism that adaptively selects among the channels.
Summary: after CS processing, the two feature maps are added together to form a single fused feature map.
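A rough sketch of the pooling, softmax, multiply, and add steps follows. It makes several simplifying assumptions that are not confirmed by the source: the two fully connected layers are omitted, the pooled values are used directly as softmax inputs, the softmax is taken across the two branches for each channel, and each branch is scaled by its own weight. The function names are hypothetical.

```python
import math


def global_avg_pool(feat):
    """feat is a [C][H][W] nested list; return one mean value per channel."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]


def channel_select(feat_a, feat_b):
    """Fuse two [C][H][W] feature maps with per-channel softmax weights.

    Assumption: the pooled descriptors stand in for the output of the
    fully connected layers, which this sketch leaves out.
    """
    desc_a, desc_b = global_avg_pool(feat_a), global_avg_pool(feat_b)
    fused, weights = [], []
    for ch_a, ch_b, da, db in zip(feat_a, feat_b, desc_a, desc_b):
        # Softmax over the two branches for this channel (shifted for stability).
        m = max(da, db)
        ea, eb = math.exp(da - m), math.exp(db - m)
        wa, wb = ea / (ea + eb), eb / (ea + eb)
        weights.append((wa, wb))
        # Scale each branch by its weight, then add the two feature maps.
        fused.append([[wa * a + wb * b for a, b in zip(ra, rb)]
                      for ra, rb in zip(ch_a, ch_b)])
    return fused, weights
```

The two weights for a channel always sum to 1, so a channel whose pooled response is stronger in one frequency branch draws more of its fused value from that branch.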
D
The source-code path of this part is very close to that of the previous part. Following the structure in the figure, it is just a few convolution blocks.
It acts as a discriminator and supplies loss terms that help the model better distinguish abnormal images, because the GAN alone cannot guarantee that abnormal images will fail to be reconstructed.
Loss function
Content Loss is nn.L1Loss and Adversarial Loss is nn.BCELoss. See: "pytorch Loss function (nn.L1Loss, nn.SmoothL1Loss, nn.MSELoss, nn.CrossEntropyLoss, nn.NLLLoss)", silentkunden's CSDN blog.
Latent Loss is an L2 loss.
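For reference, the three loss types named above reduce to the following arithmetic on flat lists of numbers. This is a plain-Python sketch, not the repository's training code: mean reduction is assumed (the default of the PyTorch losses), the clamp in the cross-entropy is an illustrative safeguard, and the function names are hypothetical. How the three terms are weighted against each other is not stated in this post and is not shown.

```python
import math


def l1_loss(pred, target):
    """Mean absolute error, the arithmetic behind the content loss."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred, target):
    """Mean squared error, assumed here for the latent loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def bce_loss(prob, label, eps=1e-12):
    """Binary cross-entropy on probabilities, as in the adversarial loss.

    `eps` keeps log() away from zero; the exact clamp is an assumption.
    """
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(prob)
```

For example, a discriminator output of 0.5 for a sample labeled 1 gives a binary cross-entropy of ln 2, the value for a completely undecided prediction.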
Experiments
Comparative experiments: the results of different models are evaluated on the MVTec, DAGM, and KolektorSDD datasets.
Ablation experiments: the number of pictures produced by FD, the FD block, the CS block, and finally the different types of data augmentation are each varied as conditions.
Addendum
The paper mentions CutPaste (self-supervised / classification-based).
References
Basic code snippets used in the source code: "Common library code snippets pytorch_based [tips]", Too simple's CSDN blog
U-Net: "Unet [basic network]", Too simple's CSDN blog