Semantic segmentation | Learning record (1): Preface to semantic segmentation
2022-07-08 02:09:00 【coder_ sure】
Note: this material comes from the uploader thunderbolt Wz; these are only my study notes. See the original video.
Table of contents
- Semantic segmentation | Learning record (1): Preface to semantic segmentation
- Preface
- One. What is semantic segmentation?
- Two. Learning plan
- Two. Common dataset formats for semantic segmentation tasks
- Three. The specific form of semantic segmentation results
- Four. Evaluation metrics for semantic segmentation
- Five. Semantic segmentation annotation tools
- Six. References
Preface
This preface outlines the topics covered in this post:
- What is semantic segmentation
- Tentative learning objectives
- Common dataset formats for semantic segmentation tasks
- The specific form of the result obtained by semantic segmentation
- Common evaluation indicators for semantic segmentation
- Semantic segmentation annotation tool
One. What is semantic segmentation?
Semantic segmentation is one of the common segmentation tasks. There are three common segmentation tasks, listed here with a representative network for each:
- Semantic segmentation, e.g. FCN
- Instance segmentation, e.g. Mask R-CNN
- Panoptic segmentation, e.g. Panoptic FPN
Panoptic segmentation not only separates the foreground from the background; it also classifies and segments the background itself into categories.
The difficulty of these three tasks increases in the order listed.
Two. Learning plan
Walk through several semantic segmentation algorithms and their source code.
Two. Common dataset formats for semantic segmentation tasks
1. PASCAL VOC
For semantic segmentation, PASCAL VOC provides each annotation as a PNG image that records the category of every pixel. The PNG is stored in palette mode (the underlying image is a single-channel image), and each pixel value is mapped to a color through the palette. For example:
- Pixel value 0 maps to (0, 0, 0), black
- Pixel value 1 maps to (128, 0, 0), dark red
- Pixel value 255 maps to (224, 224, 192)
The value 255 needs some explanation: pixels with value 255 are ignored when the loss is computed. It is hard to say which category the pixels along an object's edge strictly belong to, so they are filled with 255, and so are objects that are difficult to segment. For example, the quadrilateral region in the example figure is actually the tail of an airplane; it is very hard to segment, so it is simply ignored.
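The effect of ignoring label 255 in the loss can be sketched in plain Python. This is a minimal illustration, not the code from the original video: the function name `masked_cross_entropy` and the toy probabilities are assumptions made up for the example.

```python
import math

IGNORE_INDEX = 255  # label value reserved for "do not score this pixel"

def masked_cross_entropy(probs, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over the pixels whose label is not ignore_index.

    probs:  one list of class probabilities per pixel (each list sums to 1)
    labels: one integer class index per pixel
    """
    total, counted = 0.0, 0
    for p, y in zip(probs, labels):
        if y == ignore_index:
            continue  # edge or hard-to-label pixel: contributes nothing
        total += -math.log(p[y])
        counted += 1
    return total / counted if counted else 0.0

# Three pixels, two classes; the last pixel is labeled 255 and is skipped.
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 1, 255]
loss = masked_cross_entropy(probs, labels)  # averages over the first two pixels only
```

Deep learning frameworks usually expose the same behavior as an option; PyTorch's `torch.nn.CrossEntropyLoss`, for instance, accepts an `ignore_index` argument.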
2. MS COCO
In MS COCO, each target is annotated with a polygon, and the coordinates of every vertex of the polygon are recorded.
See also: Introduction to the MS COCO dataset and basic usage of pycocotools
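To show how a polygon annotation relates to a per-pixel mask, here is a minimal sketch that rasterizes one polygon given as a flat coordinate list `[x1, y1, x2, y2, ...]`, which is how COCO stores polygon vertices. The function `polygon_to_mask` and the rectangle are hypothetical, and the even-odd point-in-polygon test is a simple stand-in, not the algorithm pycocotools uses.

```python
def polygon_to_mask(flat_xy, width, height):
    """Rasterize one polygon [x1, y1, x2, y2, ...] into a 0/1 mask.

    A pixel is set when its center lies inside the polygon (even-odd rule).
    """
    pts = list(zip(flat_xy[0::2], flat_xy[1::2]))
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            px, py = col + 0.5, row + 0.5  # pixel center
            inside = False
            j = len(pts) - 1
            for i in range(len(pts)):
                xi, yi = pts[i]
                xj, yj = pts[j]
                # Does edge (j -> i) cross the horizontal ray going right from (px, py)?
                if (yi > py) != (yj > py):
                    x_cross = xi + (py - yi) * (xj - xi) / (yj - yi)
                    if px < x_cross:
                        inside = not inside
                j = i
            mask[row][col] = 1 if inside else 0
    return mask

# Hypothetical rectangle with corners (1,1), (4,1), (4,3), (1,3) on a 6x4 image.
mask = polygon_to_mask([1, 1, 4, 1, 4, 3, 1, 3], width=6, height=4)
# Rows 1 and 2 become [0, 1, 1, 1, 0, 0]; rows 0 and 3 stay all zero.
```

In practice this conversion is normally left to pycocotools (for example `COCO.annToMask`) rather than written by hand.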
Three. The specific form of semantic segmentation results
Why not display the result directly as a grayscale image instead of converting it to color?
For example, the airplane category corresponds to pixel value 1 and the person category to pixel value 15. On a 0-255 gray scale both values are nearly black, so in grayscale form it is hard to see the difference between them.
So the pixel values are mapped to colors, while each pixel value itself still corresponds to the category index.
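The index-to-color mapping can be sketched as follows. The function reproduces the bit-shuffling colormap commonly used with PASCAL VOC, written in plain Python; the name `voc_colormap` and the sample label row are illustrative assumptions.

```python
def voc_colormap(n=256):
    """Return n (R, G, B) tuples following the PASCAL VOC colormap scheme."""
    palette = []
    for index in range(n):
        r = g = b = 0
        c = index
        # Spread the bits of the index over the high bits of R, G and B.
        for shift in range(7, -1, -1):
            r |= (c & 1) << shift
            g |= ((c >> 1) & 1) << shift
            b |= ((c >> 2) & 1) << shift
            c >>= 3
        palette.append((r, g, b))
    return palette

palette = voc_colormap()

# Hypothetical row of label values: background, airplane, person, ignored.
label_row = [0, 1, 15, 255]
color_row = [palette[v] for v in label_row]
# color_row == [(0, 0, 0), (128, 0, 0), (192, 128, 128), (224, 224, 192)]
```

Values 1 and 15 are almost indistinguishable as gray levels, but their palette colors, (128, 0, 0) and (192, 128, 128), are easy to tell apart.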
Four. Evaluation metrics for semantic segmentation
Pixel Accuracy (Global Acc):
$$\frac{\text{number of correctly predicted pixels}}{\text{total number of pixels}} = \frac{\Sigma_{i} n_{ii}}{\Sigma_{i} t_{i}}$$
mean Accuracy: average the pixel accuracy of each category
$$\frac{1}{n_{cls}} \Sigma_{i} \frac{n_{ii}}{t_{i}}$$
mean IoU: average the IoU of each category
$$\frac{1}{n_{cls}} \Sigma_{i} \frac{n_{ii}}{t_{i} + \Sigma_{j} n_{ji} - n_{ii}}$$
For a single category, the IoU is
$$\frac{\text{area of the overlapping region}}{\text{total area}}$$
where the total area is the union of the predicted region and the ground-truth region.
where:
- $n_{ij}$: the number of pixels of category $i$ that are predicted as category $j$
- $n_{cls}$: the number of target categories (including background)
- $t_{i} = \Sigma_{j} n_{ij}$: the total number of pixels of category $i$ (ground-truth label)
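The three metrics can be computed directly from a confusion matrix whose entry `conf[i][j]` plays the role of $n_{ij}$. The sketch below is a minimal illustration: the function name and the 2-class matrix are made up for the example, and it assumes every category has at least one ground-truth pixel so that no denominator is zero.

```python
def segmentation_metrics(conf):
    """Return (global accuracy, mean accuracy, mean IoU).

    conf[i][j] is the number of pixels of true category i predicted as j.
    Assumes every category has at least one ground-truth pixel.
    """
    n_cls = len(conf)
    t = [sum(conf[i]) for i in range(n_cls)]                              # t_i
    pred = [sum(conf[j][i] for j in range(n_cls)) for i in range(n_cls)]  # sum_j n_ji
    correct = [conf[i][i] for i in range(n_cls)]                          # n_ii

    global_acc = sum(correct) / sum(t)
    mean_acc = sum(correct[i] / t[i] for i in range(n_cls)) / n_cls
    mean_iou = sum(
        correct[i] / (t[i] + pred[i] - correct[i]) for i in range(n_cls)
    ) / n_cls
    return global_acc, mean_acc, mean_iou

# Hypothetical 2-class confusion matrix (rows: ground truth, columns: prediction).
conf = [[3, 1],
        [1, 5]]
global_acc, mean_acc, mean_iou = segmentation_metrics(conf)
# global_acc = 8/10, mean_acc = (3/4 + 5/6)/2, mean_iou = (3/5 + 5/7)/2
```

Accumulating one confusion matrix over the whole validation set and computing the metrics once at the end keeps the result independent of how the images are batched.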
A closer look at these metrics: in the worked example, which has 5 categories, the two averages are
$$\text{mean acc} = \frac{1}{5}\sum(\text{class}_i\ \text{acc})$$
$$\text{mean IoU} = \frac{1}{5}\sum(\text{cls}_i\ \text{IoU})$$
Five. Semantic segmentation annotation tools
A traditional annotation tool: Labelme
A semi-automatic annotation tool: Baidu EISeg
Six. References
- Introduction to the PASCAL VOC2012 dataset
- Using the EISeg segmentation annotation software