2022-07-07 05:33:00 xiongxyowo

[Paper] [Code] [MICCAI 21]


Semi-supervised learning has attracted great attention in machine learning, especially for medical image segmentation, because it alleviates the heavy burden of collecting large amounts of densely annotated training data. However, most existing methods underestimate the importance of challenging regions (such as small branches or blurred edges) during training. We believe that these unlabeled regions may contain more critical information for minimizing the uncertainty of the model's predictions, and should be emphasized during training. Therefore, in this paper we propose a novel mutual consistency network (MC-Net) for semi-supervised left atrium segmentation from 3D MR images. Specifically, MC-Net consists of one encoder and two slightly different decoders, and the prediction discrepancy between the two decoders is transformed into an unsupervised loss by our cycled pseudo-label scheme to encourage mutual consistency. This mutual consistency encourages the two decoders to produce consistent, low-entropy predictions and enables the model to gradually capture generalized features from these unlabeled challenging regions. We evaluate MC-Net on the public Left Atrium (LA) database, where it achieves impressive performance gains by effectively leveraging unlabeled data. MC-Net outperforms six recent semi-supervised methods for left atrium segmentation and sets a new state of the art on the LA database.


The general idea of this paper is to design better pseudo labels to improve semi-supervised performance. The pipeline is as follows:
[Figure: MC-Net pipeline]
The first question is how to measure uncertainty. The paper argues that popular methods such as MC-Dropout require multiple forward passes during training, which adds time overhead. It therefore trades space for time by adding an auxiliary decoder $D_B$. This decoder is structurally very simple: it obtains the final result directly through several upsampling interpolation steps. The original decoder $D_A$ is kept consistent with V-Net.

In this way, without introducing many extra network parameters (the auxiliary decoder is very simple), the model obtains two different results from a single forward pass. The auxiliary decoder's result is clearly worse (this can also be seen in the figure). To estimate uncertainty, we then only need to compare the two results.

Although this approach looks very simple, it is clever on reflection: one decoder is strong and the other is weak. If a sample is simple, even the weak head produces a good result, so the two results differ little and the uncertainty is low. For harder samples that carry more information, the weak head produces a poor result, so the two results differ substantially and the uncertainty is higher.
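The strong-versus-weak comparison above can be sketched in a few lines. This is only an illustration under assumptions: the function names, the flattened probability lists, and the use of the mean absolute difference as the discrepancy measure are all hypothetical and not taken from the paper.

```python
def discrepancy_map(prob_a, prob_b):
    """Per-voxel absolute difference between two foreground-probability lists.

    prob_a / prob_b stand in for the flattened outputs of D_A and D_B
    (hypothetical inputs; a real model would produce 3D volumes).
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both predictions must cover the same voxels")
    return [abs(a - b) for a, b in zip(prob_a, prob_b)]


def mean_uncertainty(prob_a, prob_b):
    """Average discrepancy, used here as a simple scalar uncertainty score."""
    diffs = discrepancy_map(prob_a, prob_b)
    return sum(diffs) / len(diffs)


# Made-up predictions: on the "easy" sample both decoders agree,
# on the "hard" sample the weak decoder drifts away from the strong one.
easy_a, easy_b = [0.95, 0.05, 0.90], [0.90, 0.10, 0.85]
hard_a, hard_b = [0.90, 0.20, 0.80], [0.40, 0.60, 0.30]

print("easy sample:", mean_uncertainty(easy_a, easy_b))
print("hard sample:", mean_uncertainty(hard_a, hard_b))
```

The hard sample yields the larger score, which matches the intuition that disagreement between the two decoders marks the regions worth emphasizing.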

The two results are first processed with a sharpening function to remove potential noise from the predictions. The sharpening function is defined as:

$$sPL = \frac{P^{1/T}}{P^{1/T} + (1-P)^{1/T}}$$

where $P$ is the predicted probability and $T$ is a temperature constant that controls the degree of sharpening. For pseudo-label supervision, B's result supervises A and A's result supervises B. In this way, the strong decoder $D_A$ can learn the invariant features of the weak decoder $D_B$ to reduce overfitting, and the weak decoder $D_B$ can learn the advanced features of the strong decoder $D_A$.
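The sharpening step and the cross supervision described above might look roughly like the following. This is a minimal sketch rather than the authors' implementation: the default temperature of 0.1, the mean squared error as the distance, and all function names are assumptions made for illustration.

```python
def sharpen(p, temperature=0.1):
    """sPL = P^(1/T) / (P^(1/T) + (1 - P)^(1/T)) for one probability p in [0, 1].

    temperature=0.1 is an assumed default, not a value given in the text.
    """
    inv_t = 1.0 / temperature
    num = p ** inv_t
    return num / (num + (1.0 - p) ** inv_t)


def mse(xs, ys):
    """Mean squared error between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mutual_consistency_loss(prob_a, prob_b, temperature=0.1):
    """Cross supervision: B's sharpened output supervises A, and vice versa.

    MSE is an assumed choice of distance for this sketch.
    """
    pseudo_a = [sharpen(p, temperature) for p in prob_a]
    pseudo_b = [sharpen(p, temperature) for p in prob_b]
    return mse(prob_a, pseudo_b) + mse(prob_b, pseudo_a)


# Sharpening pushes probabilities away from 0.5 toward 0 or 1.
for p in (0.3, 0.5, 0.7):
    print(p, "->", sharpen(p))

# Made-up decoder outputs for three voxels.
print(mutual_consistency_loss([0.9, 0.2, 0.8], [0.6, 0.4, 0.7]))
```

A smaller temperature gives harder pseudo labels, so the loss pulls each decoder toward confident, low-entropy predictions.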

