15+ urban road element segmentation applications: this one segmentation model is enough!
2022-06-21 11:11:00 【Paddlepaddle】

Image semantic segmentation is a classic and challenging task in computer vision. It aims to provide fine-grained, pixel-level image classification, which is equivalent to assigning a semantic label to every pixel. The technology is now widely used in fields such as urban security and road condition assessment. Map navigation applications, for example, use segmentation to recognize road elements such as buildings, walls, and pavement conditions, and so capture key road information more accurately.
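To make "assigning a semantic label to every pixel" concrete, here is a minimal sketch. It assumes the model outputs one score map per class with shape (C, H, W); the three class names and the toy scores are invented for illustration (the project itself uses 19 classes).

```python
import numpy as np

# Hypothetical label subset for illustration only.
CLASS_NAMES = ["road", "sidewalk", "building"]

def logits_to_label_map(logits: np.ndarray) -> np.ndarray:
    """Turn per-class scores of shape (C, H, W) into a label map of shape (H, W)."""
    # Each pixel receives the index of the class with the highest score.
    return np.argmax(logits, axis=0)

# Toy scores for a 2x2 image and 3 classes.
logits = np.array([
    [[2.0, 0.1], [0.3, 0.2]],  # scores for "road"
    [[0.5, 1.5], [0.1, 0.4]],  # scores for "sidewalk"
    [[0.1, 0.2], [0.9, 0.8]],  # scores for "building"
])

label_map = logits_to_label_map(logits)
print(label_map)  # [[0 1]
                  #  [2 2]]
```

The label map has the same height and width as the input image, which is what distinguishes segmentation from whole-image classification.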
To help everyone get started more quickly, Baidu's Visual Technology Department has built a complete industrial practice example of urban street-view road element segmentation on top of PaddleSeg, the PaddlePaddle image segmentation development kit. It provides an end-to-end workflow covering data preparation, model training, and optimization, lowering the barrier to industrial deployment. In this project we need to segment 19 key object classes, so what is our specific approach?

Figure 1. Segmentation example
Project link:
https://aistudio.baidu.com/aistudio/projectdetail/4038141?contributionType=1
All source code and tutorials are open source, and you are welcome to use them.
Project difficulties
Complex targets
Complex roads: straight sections, turns, intersections with traffic lights, and so on;
Complex environments: the model must cope with daytime, night, fog, rain, and so on;
Complex scenes: urban roads, rural roads, expressways, and other scenes differ considerably.
Imbalanced samples
Many classes: road, sidewalk, building, wall, fence, pole, traffic light, traffic sign, vegetation, terrain, sky, person, rider, car, truck, bus, train, motorcycle, and bicycle;
Imbalance: one image may contain as many as 15 cars and 30 pedestrians, while another contains only 2 cars and no pedestrians, and objects are occluded and truncated to varying degrees.
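One simple way to see this imbalance in a dataset is to measure the share of pixels that each class occupies. The sketch below does this for a single annotated label map; the function name and the toy annotation are assumptions for illustration, not part of the project code.

```python
import numpy as np

def class_pixel_share(label_map: np.ndarray, num_classes: int) -> np.ndarray:
    """Return the fraction of pixels belonging to each class id."""
    counts = np.bincount(label_map.ravel(), minlength=num_classes)
    return counts / counts.sum()

# Toy 3x4 annotation: class 0 dominates, class 2 is rare, class 3 is absent.
labels = np.array([
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 2],
])

share = class_pixel_share(labels, num_classes=4)
print(share)  # class 0 covers 75% of the pixels, class 2 only about 8%
```

Summing these counts over all training images gives a dataset-level picture of which classes are under-represented.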
Model selection
The mainstream semantic segmentation approaches fall into the following families:
FCN (Fully Convolutional Network): as the pioneer of deep learning for image segmentation, its symbolic significance outweighs its practical value.
U-Net series: before U-Net, mainstream segmentation networks were mostly straight-through and used only the top-level or later-layer features for upsampling and reconstruction. U-Net instead passes the features of earlier convolutional layers directly to the upsampling path through skip connections.
DeepLab series: DeepLab is another family in image segmentation and already has several versions. Compared with the U-Net series above, it differs mainly in how the input image is processed and in the network structure. DeepLab mainly relies on image pyramids, atrous (dilated) convolution, spatial pyramid pooling (SPP), and separable convolution to improve segmentation quality.
HRNet series: HRNet is a neural network proposed by Microsoft Research Asia in 2019. Unlike earlier convolutional neural networks, it keeps high-resolution representations even in its deep layers, so its predictions are more accurate both semantically and spatially.
Transformer series: since the Transformer was introduced into computer vision, it has given rise to a large body of research and applications. In image segmentation, Transformer-based semantic segmentation models such as SETR, TransUNet, SegFormer, and MaskFormer have emerged. They remove the limits that convolutional structures place on access to global image information.
Because the segmentation targets are complex, we chose MscaleOCRNet, a higher-accuracy model in the HRNet series, for the follow-up experiments; its reported state-of-the-art mIoU reaches 87%. Compared with the plain HRNet structure, OCRNet computes, on top of the segmentation result, a relation weight between each pixel and the other pixels of the image, and combines it with the original features. Hierarchical multi-scale training on top of OCRNet then yields the final MscaleOCRNet. The multi-scale training and inference method is shown in the figure below.
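For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how it can be computed for a single prediction. It is a simplification: benchmark figures are normally accumulated over the whole validation set, and the function name and toy arrays here are assumptions for illustration.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean intersection-over-union across the classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class appears in neither map, so it is skipped
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

target = np.array([[0, 0], [1, 1]])  # ground truth
pred = np.array([[0, 1], [1, 1]])    # one pixel of class 0 predicted as class 1

miou = mean_iou(pred, target, num_classes=2)
print(round(miou, 4))  # 0.5833  (IoU 1/2 for class 0, 2/3 for class 1)
```

Because every class contributes equally to the mean, small or rare classes influence mIoU as much as large ones, which is why sample imbalance matters for this metric.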

Figure 2. The MscaleOCRNet scheme
Algorithm optimization
To further improve accuracy and address the sample imbalance, we offer the following optimization ideas:
Change the pre-trained model: replace the Mapillary pre-trained model with a Cityscapes pre-trained one; transferring it to training on the KITTI-STEP dataset effectively improves segmentation quality;
Add multi-scale training: increase the scales from two, [0.5, 1.0], to three, [0.5, 1.0, 2.0];
Change the input size: change the input from 1024x512 to the original image size, 1248x384.
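The idea of combining the three scales [0.5, 1.0, 2.0] can be sketched as follows. Note the assumptions: MscaleOCRNet fuses scales with a learned attention mechanism, whereas this sketch simply averages the per-scale scores; `toy_model` is a hypothetical stand-in for a real network, and resizing uses plain nearest-neighbor sampling to stay dependency-free.

```python
import numpy as np

def resize_nearest(x: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbor resize of an array shaped (C, H, W)."""
    _, h, w = x.shape
    rows = np.minimum((np.arange(out_h) * h / out_h).astype(int), h - 1)
    cols = np.minimum((np.arange(out_w) * w / out_w).astype(int), w - 1)
    return x[:, rows[:, None], cols[None, :]]

def multi_scale_predict(model, image: np.ndarray, scales=(0.5, 1.0, 2.0)) -> np.ndarray:
    """Run the model at several scales and average the scores at the input size."""
    _, h, w = image.shape
    total = None
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        scores = model(resize_nearest(image, sh, sw))  # (num_classes, sh, sw)
        scores = resize_nearest(scores, h, w)          # back to the input size
        total = scores if total is None else total + scores
    return total / len(scales)

def toy_model(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a network: two class scores from brightness."""
    gray = image.mean(axis=0)
    return np.stack([gray, 1.0 - gray])

image = np.random.default_rng(0).random((3, 8, 16))  # small (C, H, W) test image
scores = multi_scale_predict(toy_model, image)
print(scores.shape)  # (2, 8, 16)
```

Smaller scales give each pixel more surrounding context, while larger scales preserve thin structures such as poles, which is the usual motivation for combining them.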

Tools used
This project was developed entirely with PaddleSeg. PaddleSeg is an end-to-end image segmentation development kit built on PaddlePaddle. It covers a large number of high-quality segmentation models in different directions, such as high accuracy and lightweight. Through its modular design it offers two usage modes, configuration-driven and API calls, helping developers complete the whole image segmentation workflow from training to deployment more easily. It provides four segmentation capabilities: semantic segmentation, interactive segmentation, panoptic segmentation, and matting.
Model deployment
Server-side model deployment uses Paddle Inference, PaddlePaddle's native inference library. It generally takes three steps:
1. Create a PaddlePredictor and set the path of the exported model;
2. Create an input PaddleTensor and pass it to the PaddlePredictor;
3. Get the output PaddleTensor and read out the results.
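The three steps above can be sketched with the Paddle Inference Python API, where the predictor is created with `create_predictor` and tensors are accessed through input and output handles. This is a sketch under assumptions, not the project's deployment script: the file names in the usage comment are placeholders, the 0.5 mean/std normalization is assumed and should match the training configuration, and the model is assumed to have a single input and a single output.

```python
import numpy as np

def preprocess(image_hwc: np.ndarray) -> np.ndarray:
    """Convert a uint8 HWC image into a normalized float32 NCHW batch."""
    x = image_hwc.astype("float32") / 255.0
    x = (x - 0.5) / 0.5  # assumed normalization; match your training config
    return np.ascontiguousarray(x.transpose(2, 0, 1)[np.newaxis, ...])

def run_inference(model_file: str, params_file: str, batch: np.ndarray) -> np.ndarray:
    # Imported here so that preprocess() can be used without Paddle installed.
    from paddle.inference import Config, create_predictor

    # Step 1: create the predictor from the exported model files.
    predictor = create_predictor(Config(model_file, params_file))

    # Step 2: copy the input batch into the predictor's input tensor.
    input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
    input_handle.reshape(batch.shape)
    input_handle.copy_from_cpu(batch)

    # Step 3: run the model and read the output tensor back.
    predictor.run()
    output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
    return output_handle.copy_to_cpu()

# A blank image at the 1248x384 input size used in this project.
batch = preprocess(np.zeros((384, 1248, 3), dtype=np.uint8))
print(batch.shape)  # (1, 3, 384, 1248)

# Usage with placeholder paths to an exported model:
# result = run_inference("model.pdmodel", "model.pdiparams", batch)
```

Depending on how the model was exported, the output is either a label map or per-class scores, so check its shape before post-processing.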

If you want to know more details, join our live course, which walks through the whole process.
Course preview
To make the example tutorial easier for everyone to use, a senior Baidu R&D engineer will give an in-depth walkthrough at 20:00 on Thursday, June 23, covering the whole development process from data preparation and scheme design to model optimization and deployment, with hands-on coding.

References: Figure 2 is taken from "Hierarchical Multi-Scale Attention for Semantic Segmentation".
