Analysis and application of portrait segmentation technology
2022-08-04 13:33:00 【ZEGO Instant Technology】
I. Introduction to matting
Matting addresses the problem of accurately estimating the foreground and background regions of an image, which is of great practical importance for image editing and for film and television production. An accurate and efficient matting algorithm can greatly streamline production workflows and save considerable labor and resources.
As a fundamental computer vision problem, matting is used in many fields.
Traditional matting methods usually estimate the foreground, the background, and alpha from a trimap (a three-valued map). When the foreground and background colors are similar, or when the texture is complex, these algorithms struggle to produce good results. Their main limitation is that they rely on low-level color, texture, and structure features and lack high-level semantic information.
With the rapid development of deep learning in recent years, its powerful high-level feature extraction has overcome these shortcomings of traditional techniques. ZEGO (即构科技) uses deep learning to solve image and video matting tasks and applies it widely across multiple business scenarios. This article introduces how portrait segmentation, one direction of matting, is implemented and where it is applied.
II. How matting works
Matting is essentially fine-grained segmentation, so the key question is how to obtain a high-quality transparency mask (the alpha map).
Matting can be summarized by the following formula:
$$R_i = A_i F_i + (1 - A_i) B_i$$
where $R_i$ is the final result, $A_i$ is the transparency mask that matting must produce, $F_i$ is the foreground color at pixel $i$, and $B_i$ is the new background to substitute. In $A_i$, foreground positions have values greater than 0 and background positions have a value of 0.
As the formula shows, the difficulty is that the matting algorithm knows only the RGB value of each pixel, yet it must estimate 7 unknowns: the foreground RGB, the background RGB, and the transparency alpha.
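The compositing step that this formula describes can be sketched in a few lines. This is a minimal illustration, assuming standard alpha blending, images stored as nested lists of RGB tuples in [0, 255], and an alpha map with values in [0, 1]. The function names are hypothetical and are not part of ZEGO's implementation.

```python
def composite_pixel(alpha, foreground, background):
    """Blend one RGB pixel: alpha * foreground + (1 - alpha) * background."""
    return tuple(
        round(alpha * f + (1.0 - alpha) * b)
        for f, b in zip(foreground, background)
    )


def composite(alpha_map, foreground_img, background_img):
    """Apply the blend to every pixel of three equally sized images."""
    return [
        [composite_pixel(a, f, b) for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha_map, foreground_img, background_img)
    ]


# A 1x3 example: pure foreground, a half-transparent edge, pure background.
alpha_map = [[1.0, 0.5, 0.0]]
portrait = [[(200, 100, 0), (200, 100, 0), (200, 100, 0)]]
new_background = [[(0, 0, 250), (0, 0, 250), (0, 0, 250)]]
result = composite(alpha_map, portrait, new_background)
```

Compositing is the easy direction: alpha and both colors are given. Matting has to run it backwards from the observed pixel alone, which is why the problem is under-determined.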
Most traditional algorithms generate the alpha map from a (hand-drawn) trimap. A trimap contains 3 kinds of pixel values: 0 marks definite background, 1 marks definite foreground, and 0.5 marks unknown pixels that may be either foreground or background. The matting algorithm then solves for the foreground and background in the unknown region using methods such as random walk, KNN, or closed-form matting.
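The trimap convention above can be made concrete with a small sketch on a grayscale image. It shows how the three values partition the pixels and how the known regions fix alpha. The color-distance estimate for the unknown region is a deliberately naive stand-in, not random walk, KNN, or closed-form matting, and all names are hypothetical.

```python
def estimate_alpha(image, trimap):
    """Toy alpha estimation on a grayscale image guided by a trimap.

    trimap values: 0 = definite background, 1 = definite foreground,
    0.5 = unknown. Known pixels keep their trimap value; unknown pixels
    are placed linearly between the mean background and mean foreground
    intensity. Assumes both known regions are non-empty.
    """
    fg = [p for row, t_row in zip(image, trimap) for p, t in zip(row, t_row) if t == 1]
    bg = [p for row, t_row in zip(image, trimap) for p, t in zip(row, t_row) if t == 0]
    fg_mean = sum(fg) / len(fg)
    bg_mean = sum(bg) / len(bg)

    alpha = []
    for row, t_row in zip(image, trimap):
        out_row = []
        for p, t in zip(row, t_row):
            if t != 0.5:
                out_row.append(float(t))  # known region: alpha is given
            elif fg_mean == bg_mean:
                out_row.append(0.5)  # colors indistinguishable: no information
            else:
                a = (p - bg_mean) / (fg_mean - bg_mean)
                out_row.append(min(1.0, max(0.0, a)))  # clamp to [0, 1]
        alpha.append(out_row)
    return alpha


image = [[10, 10, 60, 110, 110]]
trimap = [[0, 0, 0.5, 1, 1]]
alpha = estimate_alpha(image, trimap)
```

When the foreground and background intensities coincide, the estimate carries no information, which mirrors the limitation of low-level color features in traditional methods.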
Generating an alpha map from a trimap
Drawing a trimap requires some experience from the user, so the approach is not suitable for general use. Because a trimap depends on human-computer interaction, it also cannot run in real time.
To solve these problems, ZEGO developed a matting algorithm based on deep learning. The algorithm uses an overall encoder-decoder structure: given only the image to be matted, it produces the final alpha map.
Encoder-decoder structure
In this structure, the encoder compresses the input image and extracts deep features, and the decoder then decodes them to fit the ground-truth alpha samples. Our encoder uses the lightweight mobilenetV3_small architecture so that it can run in real time on edge devices.
Data is crucial for deep learning. Using public datasets and a large number of images collected from the web, we built a matting dataset of 400,000 samples, whose alpha maps were all produced manually with professional software such as Photoshop. The dataset covers half-length and full-body portraits in a wide range of everyday settings, with single or multiple subjects and a variety of poses.
To address flicker and errors in video matting, we add temporal information to the network structure: when processing a frame, the network references the result of the previous frame and corrects it.
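The article builds temporal information into the network itself and does not describe those internals. As a much simpler illustration of the same idea (letting the previous frame's result damp sudden changes), the sketch below applies an exponential moving average to per-frame alpha maps as a post-processing step. The `momentum` parameter and the function name are assumptions for illustration only.

```python
def smooth_alpha(prev_alpha, curr_alpha, momentum=0.5):
    """Blend the current alpha map with the previous frame's result.

    momentum = 0 ignores history; values closer to 1 trust the previous
    frame more, which suppresses flicker but adds lag on fast motion.
    """
    if prev_alpha is None:  # first frame: nothing to reference yet
        return [row[:] for row in curr_alpha]
    return [
        [momentum * p + (1.0 - momentum) * c for p, c in zip(p_row, c_row)]
        for p_row, c_row in zip(prev_alpha, curr_alpha)
    ]


# One pixel over four frames; the third frame contains a spurious dropout.
raw_frames = [[[1.0]], [[1.0]], [[0.0]], [[1.0]]]
state = None
smoothed = []
for frame in raw_frames:
    state = smooth_alpha(state, frame, momentum=0.5)
    smoothed.append(state[0][0])
```

The dropout in the third frame is halved rather than passed through, at the cost of a slower recovery in the fourth frame. A learned temporal module can avoid that trade-off, which is one reason to put the temporal reference inside the network.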
Time t / time t+1 / time t+2
Error correction using temporal information
To run inference in real time on end devices while keeping good quality, we split the network into two branches. One branch contains only a small number of convolution operators and extracts features from the high-resolution input. The other branch downsamples the original input and uses the encoder-decoder structure to compress and extract low-resolution information. The outputs of the two branches are then fused to produce the final result.
Overall algorithm pipeline
To avoid overfitting and improve generalization, we train on segmentation data and matting data together. On each odd iteration, we train on public segmentation datasets such as COCO, YouTubeVIS 2021, and Supervisely Person Dataset; on the remaining iterations, we train on our own large-scale dataset. Updates follow cosine annealing with restarts, which keeps the network from getting stuck in a local optimum and speeds up training.
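The schedule described above can be sketched as two small functions: one that picks the data source for each iteration and one that computes a cosine-annealed learning rate with restarts. The cycle length and learning-rate bounds are illustrative assumptions, since the article does not give the actual values, and iterations are assumed to be counted from 1.

```python
import math


def pick_dataset(iteration):
    """Odd iterations use public segmentation data, the rest matting data."""
    return "segmentation" if iteration % 2 == 1 else "matting"


def cosine_restart_lr(step, base_lr=1e-3, min_lr=1e-5, cycle_len=1000):
    """Cosine annealing that restarts at base_lr every cycle_len steps."""
    t = step % cycle_len  # position inside the current cycle
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / cycle_len))
    return min_lr + (base_lr - min_lr) * cosine


# (iteration, data source, learning rate) for the first few iterations.
schedule = [(i, pick_dataset(i), cosine_restart_lr(i)) for i in range(1, 5)]
```

This sketch uses fixed-length cycles. In PyTorch, `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts` provides the same kind of schedule and can also lengthen each successive cycle.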
We design models of different sizes for different platforms and matting tasks. For offline processing or on the server, we use a larger model to obtain finer results; on mobile devices, we use a small model to balance inference speed and accuracy.
Detail results of ZEGO's small-model matting algorithm
As the figure above shows, compared with other matting solutions, ZEGO's algorithm preserves more detail in scenarios such as fine hair.
III. Application scenarios of portrait segmentation
1. ID photo background replacement
In daily life we often need ID photos of various kinds, for example with a red, blue, or white background. Going to a photo studio several times to meet different document requirements costs both money and time. Professional image-editing software can do the job, but for most people it still has a learning curve.
ZEGO's AI ID photo matting algorithm requires no knowledge of professional software. It automatically crops the original image to a head-and-shoulders portrait based on facial keypoint detection, then uses the matting algorithm to replace the background with any color. The algorithm is lightweight: the entire model file is only 6MB, and matting one ID photo takes only 100ms in a CPU environment.
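The cropping step can be sketched as follows. The sketch assumes a face detector has already returned a face bounding box; the expansion factors that turn it into a head-and-shoulders crop are hypothetical values chosen for illustration, not ZEGO's parameters. Replacing the background with a solid color is then ordinary alpha compositing against a constant color.

```python
def head_shoulder_crop(face_box, img_w, img_h,
                       side_scale=1.0, top_scale=0.6, bottom_scale=1.5):
    """Expand a face box (x, y, w, h) to a head-and-shoulders crop.

    Each scale is a multiple of the face width or height added on that
    side. The result is clamped to the image bounds and returned as
    (left, top, right, bottom).
    """
    x, y, w, h = face_box
    left = max(0, round(x - side_scale * w))
    right = min(img_w, round(x + w + side_scale * w))
    top = max(0, round(y - top_scale * h))
    bottom = min(img_h, round(y + h + bottom_scale * h))
    return left, top, right, bottom


# A 200x250 face detected at (400, 300) in a 1000x1200 photo.
crop = head_shoulder_crop((400, 300, 200, 250), img_w=1000, img_h=1200)
```

Clamping matters in practice because a face near the image edge would otherwise produce a crop box that extends outside the photo.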
2. Background blur for online art exams
In online exams for dance, vocal music, broadcasting, and other arts, the background often contains banners, advertisements, or other information unrelated to the exam. This matters especially in online dance and vocal exams, where the picture should stay complete and keep its sense of depth so that the judges can focus on the exam content itself. This calls for a portrait matting algorithm. However, a general-purpose algorithm very easily erases items such as dance props and dance costumes (mainly certain garments), so directly replacing the background does not work.
ZEGO proposes a background blur solution: the algorithm extracts the portrait and, at the same time, blurs the background. It references temporal information between consecutive video frames, which suppresses flicker in the picture very well.
The model file is 3MB, making it extremely lightweight. On a MacBook Pro with an M1 chip, processing one frame takes 20ms.
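The blur-and-blend idea can be sketched on a grayscale image: blur the whole frame, then use the alpha map to keep the original pixels where the portrait is. A box blur is used here purely for simplicity. The article does not say which blur ZEGO applies, and the function names are hypothetical.

```python
def box_blur(img, radius=1):
    """Average each pixel with its neighbors inside a (2r+1)^2 window."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def blur_background(img, alpha, radius=1):
    """Keep portrait pixels (alpha near 1) sharp and blur the rest."""
    blurred = box_blur(img, radius)
    return [
        [a * p + (1.0 - a) * b for p, a, b in zip(p_row, a_row, b_row)]
        for p_row, a_row, b_row in zip(img, alpha, blurred)
    ]


# The center pixel is the "portrait"; everything else is background.
frame = [[0, 0, 90], [0, 90, 0], [0, 0, 0]]
alpha = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
out = blur_background(frame, alpha)
```

Because a wrongly low alpha only softens a region instead of removing it, blurring is more forgiving than background replacement when props or costumes are segmented imperfectly.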
3. Background replacement for game streamers
Sharing the game screen while turning on the camera to interact with the audience has become the mainstream form of game live streaming. The game content matters, but positive interaction between the streamer and the audience also reduces viewer churn. However, most game streamers broadcast from home and therefore have privacy requirements.
ZEGO's matting algorithm can quickly extract the portrait and use any picture as the background, which protects user privacy well.
Replacing the background protects user privacy
IV. Conclusion
This article has covered the principles and common application scenarios of portrait segmentation, one direction of matting. Portrait segmentation can better protect user privacy and highlight the key content of a scene.
ZEGO uses deep learning to solve image and video matting tasks and applies it widely across multiple business scenarios. It also performs very well in scenarios that demand fine detail, such as human hair.