Semantic segmentation | Learning record (1): Preface to semantic segmentation

2022-07-08 02:09:00 coder_ sure

Note: this material comes from a video by the uploader thunderbolt Wz; these are only my study notes. Original video


Preface

This preface covers the following topics:

  • What semantic segmentation is
  • A tentative study plan
  • Common dataset formats for semantic segmentation tasks
  • The specific form of semantic segmentation results
  • Common evaluation metrics for semantic segmentation
  • Semantic segmentation annotation tools

One. What is semantic segmentation?

Semantic segmentation is one of three common segmentation tasks, listed here with a representative network for each:

  • Semantic segmentation, e.g. FCN
  • Instance segmentation, e.g. Mask R-CNN
  • Panoptic segmentation, e.g. Panoptic FPN

 [Figure: semantic segmentation]
 [Figure: instance segmentation]
Panoptic segmentation not only separates the foreground from the background, it also classifies and segments the background regions.
The three tasks above are listed in order of increasing difficulty.

Two. Study plan

The plan is to work through several semantic segmentation algorithms together with their source code.
 [Figure: study plan]

Two. Common dataset formats for semantic segmentation tasks

1. PASCAL VOC

 [Figure: PASCAL VOC dataset format]
For semantic segmentation, PASCAL VOC provides each annotation as a PNG image that records the category of every pixel. The PNG is stored in palette mode (the underlying image is a single-channel image whose values are category indices), and each pixel value is mapped to a color through the palette. For example:

  • Pixel value 0 maps to (0, 0, 0), black
  • Pixel value 1 maps to (128, 0, 0), dark red
  • Pixel value 255 maps to (224, 224, 192)
    The value 255 needs some explanation: pixels with this value are ignored when the loss is computed. It marks object edges, where it is hard to say which category a pixel strictly belongs to, and it is also used to fill objects that are hard to segment. For example, the quadrilateral in the figure above is actually the tail of an airplane; it is very difficult to segment, so it is simply ignored.
     [Figure: pixel values for the foreground, background, and edges]
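
The rule for value 255 can be sketched in plain Python. This is only an illustration, assuming per-pixel class probabilities are already available; the function name `masked_cross_entropy` and the toy data are hypothetical. Deep learning frameworks usually offer the same behavior directly, for example through the `ignore_index` argument of PyTorch's `torch.nn.CrossEntropyLoss`.

```python
import math

IGNORE_INDEX = 255  # VOC value for edges and hard-to-segment regions


def masked_cross_entropy(probs, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over pixels whose label is not ignore_index.

    probs:  list of per-pixel class-probability lists
    labels: list of per-pixel category indices (255 = ignore)
    """
    total, count = 0.0, 0
    for pixel_probs, label in zip(probs, labels):
        if label == ignore_index:
            continue  # ignored pixels contribute nothing to the loss
        total += -math.log(pixel_probs[label])
        count += 1
    return total / count if count else 0.0


# Hypothetical toy data: three pixels, two classes; the last pixel is an edge pixel.
probs = [[0.5, 0.5], [0.25, 0.75], [0.9, 0.1]]
labels = [0, 1, 255]
loss = masked_cross_entropy(probs, labels)
print(round(loss, 4))  # 0.4904
```

Whatever the model predicts for the third pixel, the loss stays the same, which is exactly the effect of filling uncertain regions with 255.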

2. MS COCO

 [Figure: MS COCO dataset format]
MS COCO annotates each object with a polygon and records the coordinates of every vertex of that polygon.
See also: MS COCO dataset introduction and basic pycocotools usage
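
To make the polygon format concrete, here is a minimal sketch that turns one polygon into a binary mask. It assumes the polygon is given as a flat `[x1, y1, x2, y2, ...]` list, as in the COCO `segmentation` field, and uses a simple ray-casting test at pixel centers. The helper names and the square polygon are hypothetical; in practice pycocotools does this conversion for you (for example with `COCO.annToMask`), and its rasterization rules may differ at the boundary.

```python
def point_in_polygon(px, py, vertices):
    """Ray-casting test: count edge crossings of a ray going right from (px, py)."""
    inside = False
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        if (y1 > py) != (y2 > py):  # the edge spans the ray's height
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(flat_xy, height, width):
    """Rasterize one polygon given as a flat [x1, y1, x2, y2, ...] list."""
    vertices = list(zip(flat_xy[0::2], flat_xy[1::2]))
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, vertices) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Hypothetical square object with corners (1,1), (4,1), (4,4), (1,4).
polygon = [1, 1, 4, 1, 4, 4, 1, 4]
mask = polygon_to_mask(polygon, height=6, width=6)
for row in mask:
    print("".join(str(v) for v in row))
```

The printed grid contains a 3 x 3 block of ones where the square lies and zeros everywhere else.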

Three. The specific form of semantic segmentation results

 [Figure: the specific form of semantic segmentation results]
Why convert the result to color instead of displaying the grayscale image directly?
For example, an airplane has pixel value 1 and a person has pixel value 15. On a 0 to 255 grayscale these two values are very close, so in a grayscale image we could hardly tell them apart.
So the pixel values are mapped to colors through the palette, while each pixel value itself is still the category index.
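
The index-to-color mapping can be sketched as follows. The sketch assumes the bit-shifting construction commonly used to generate the PASCAL VOC palette; the function name `voc_colormap` and the tiny label map are hypothetical.

```python
def voc_colormap(num_entries=256):
    """Build a palette by spreading the bits of each index over R, G and B."""
    palette = []
    for index in range(num_entries):
        r = g = b = 0
        c = index
        for j in range(8):
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        palette.append((r, g, b))
    return palette


palette = voc_colormap()

# Hypothetical label map: background (0), airplane (1), person (15), ignore (255).
label_map = [[0, 1, 1], [15, 15, 255]]
color_image = [[palette[index] for index in row] for row in label_map]

print(palette[1], palette[15], palette[255])
# (128, 0, 0) (192, 128, 128) (224, 224, 192)
```

Indices 1 and 15 would be two almost identical dark grays, but after the lookup they become clearly different colors, while the stored values remain the category indices.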

Four. Evaluation metrics for semantic segmentation

Pixel Accuracy (Global Acc): the number of correctly predicted pixels divided by the total number of pixels
$$\text{Pixel Accuracy} = \frac{\sum_{i} n_{ii}}{\sum_{i} t_{i}}$$
mean Accuracy: the pixel accuracy of each category, averaged over all categories
$$\text{mean Accuracy} = \frac{1}{n_{cls}} \sum_{i} \frac{n_{ii}}{t_{i}}$$
mean IoU: the IoU of each category, averaged over all categories
$$\text{mean IoU} = \frac{1}{n_{cls}} \sum_{i} \frac{n_{ii}}{t_{i} + \sum_{j} n_{ji} - n_{ii}}$$
In the illustration, the IoU is the area of the overlapping colored region divided by the total area:
$$\text{IoU} = \frac{\text{area of the overlapping colored region}}{\text{total area}}$$

where:

  • $n_{ij}$: the number of pixels of category $i$ predicted as category $j$
  • $n_{cls}$: the number of target categories (including the background)
  • $t_{i} = \sum_{j} n_{ij}$: the total number of pixels of target category $i$ (ground-truth labels)
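
The three formulas can be checked with a short sketch that takes a confusion matrix as input. It assumes `confusion[i][j]` holds $n_{ij}$ and that every category has at least one ground-truth pixel (otherwise a division by zero occurs). The function name and the 3 x 3 matrix are hypothetical.

```python
def segmentation_metrics(confusion):
    """Return (pixel accuracy, mean accuracy, mean IoU).

    confusion[i][j] = n_ij, the number of pixels of category i predicted as j.
    """
    n_cls = len(confusion)
    correct = [confusion[i][i] for i in range(n_cls)]            # n_ii
    t = [sum(row) for row in confusion]                          # t_i (row sums)
    predicted = [sum(confusion[j][i] for j in range(n_cls))      # sum_j n_ji
                 for i in range(n_cls)]                          # (column sums)

    pixel_acc = sum(correct) / sum(t)
    mean_acc = sum(correct[i] / t[i] for i in range(n_cls)) / n_cls
    mean_iou = sum(
        correct[i] / (t[i] + predicted[i] - correct[i]) for i in range(n_cls)
    ) / n_cls
    return pixel_acc, mean_acc, mean_iou


# Hypothetical confusion matrix for three categories (rows: ground truth).
confusion = [
    [5, 1, 0],
    [1, 3, 0],
    [0, 1, 4],
]
pixel_acc, mean_acc, mean_iou = segmentation_metrics(confusion)
print(round(pixel_acc, 4), round(mean_acc, 4), round(mean_iou, 4))
# 0.8 0.7944 0.6714
```

The denominator of each IoU term is the row sum plus the column sum minus the diagonal entry, that is, the union of the ground-truth and predicted pixels of that category.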

A worked example helps build a deeper understanding of these metrics:

 [Figures: step-by-step worked example of the three metrics; the images are not reproduced here]
The example has five categories, so the per-category values are averaged over 5:
$$\text{mean acc} = \frac{1}{5} \sum_{i} \text{acc}(class_{i})$$

$$\text{mean IoU} = \frac{1}{5} \sum_{i} \text{IoU}(cls_{i})$$
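
The per-category values that get averaged come from counting pixels. As a rough sketch, the counts $n_{ij}$ can be built from a ground-truth label list and a prediction list, after which the accuracy and IoU of each category follow directly. The data below is hypothetical and uses three categories rather than five, and skipping ground-truth pixels labeled 255 during evaluation is an assumption carried over from the loss rule.

```python
def build_confusion(truth, prediction, n_cls, ignore_index=255):
    """Count n_ij: pixels whose true category is i and predicted category is j."""
    confusion = [[0] * n_cls for _ in range(n_cls)]
    for true_label, pred_label in zip(truth, prediction):
        if true_label == ignore_index:
            continue  # assumption: ignored pixels are skipped in evaluation too
        confusion[true_label][pred_label] += 1
    return confusion


# Hypothetical flattened label maps (six pixels, three categories).
truth = [0, 0, 1, 1, 2, 255]
prediction = [0, 1, 1, 1, 2, 2]
n_cls = 3

confusion = build_confusion(truth, prediction, n_cls)

# Per-category accuracy: n_ii / t_i
class_acc = [confusion[i][i] / sum(confusion[i]) for i in range(n_cls)]

# Per-category IoU: n_ii / (t_i + sum_j n_ji - n_ii)
class_iou = []
for i in range(n_cls):
    column_sum = sum(confusion[j][i] for j in range(n_cls))
    union = sum(confusion[i]) + column_sum - confusion[i][i]
    class_iou.append(confusion[i][i] / union)

mean_acc = sum(class_acc) / n_cls
mean_iou = sum(class_iou) / n_cls
print(confusion)
print(class_acc, class_iou)
```

Averaging `class_acc` and `class_iou` over the number of categories gives the mean accuracy and the mean IoU, in the same way the example averages over its five categories.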

Five. Semantic segmentation annotation tools

Traditional annotation tools, such as Labelme
Labelme

A semi-automatic annotation tool: Baidu's EISeg
EISeg

Six. References

PASCAL VOC2012 dataset introduction
Using the EISeg segmentation and annotation software

Copyright notice
This article was written by [coder_ sure]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202130541040378.html