当前位置:网站首页>Review of the paper: unmixing based soft color segmentation for image manipulation
Review of the paper: unmixing based soft color segmentation for image manipulation
2022-06-26 03:12:00 【Researcher-Du】
Unmixing-Based Soft Color Segmentation for Image Manipulation Is an image processing paper based on soft segmentation , Published in SIGGRAPH 2017. The full text 19 page , In general , It's hard to understand , The article contains a lot of seemingly unrelated content , such as Matting, Iterative optimization, etc . I had a clear grasp of the article after referring to the realization of the paper .
Reference code :https://github.com/liuguoyou/color-unmixing
B standing video:https://www.bilibili.com/video/BV1bU4y1x7MY/?vd_source=6a91312d89cec082cf6d5a92fee7279a
By convention , Take the last one teaser. For input images (a), The algorithm automatically extracts multiple layer(b), Every layer Where region Hardly intersect ( In different region The boundaries will intersect , The so-called soft segmentation ), And each layer The color of follows a positive distribution , After extracting layers , You can adjust the color of the image , Such as (c-d) Shown .
A brief review of the classic palette based layer extraction algorithms . This kind of algorithm (Chang et al, clustering algorithm ; Tan et al, Geometric convex hull ):
1) Firstly, several representative colors of the input image are extracted as the color palette of the image : C = { C 1 , C 2 . . . C k } C = \{C_1,C_2...C_k\} C={ C1,C2...Ck}.
2) Next, the pixels in the image are interpolated , Represent each pixel as a convex combination of palettes : c p = ∑ i w i p C i c_p = \sum_iw^p_iC_i cp=∑iwipCi, And meet ∑ i w i p = 1 \sum_iw^p_i=1 ∑iwip=1.
3) Each layer of nature can be represented as : L i p = w i p C i L^p_i = w^p_iC_i Lip=wipCi, among , L i p L^p_i Lip Represent layers L i L_i Li in p p p The color of the point .
Recoloring: according to 2), We only need to calculate the interpolation weight once , Then modify the palette color to achieve recolor : c p ′ = ∑ i w i p C i ′ c'_p = \sum_iw^p_iC'_i cp′=∑iwipCi′.
The main disadvantage of palette based algorithm is that it is difficult to maintain the sparsity of interpolation . informally , such as 2) in , Probably p p p Interpolation weights for all palette colors w 1 p w^p_1 w1p All are greater than 0, In this way , When you change the color of a palette, you may change the overall color of the image , The locality of recolor operation is not good enough . Therefore, interpolation sparsity is an important problem , Sparse interpolation can achieve good local recolor . I think today's paper has achieved good sparsity to some extent , Because most pixels are associated with only one layer, not multiple , Only in region Boundary pixels are associated to multiple layers ( Be similar to matting The effect of the algorithm ).
Back to the point , Let's briefly talk about the algorithm of this paper .
One 、 Color means
The algorithm will extract multiple layers , The color of each layer conforms to the positive distribution . Similar to palette based layer decomposition algorithm , Any layer L i L_i Li contain Color and Opacity information . Pixels in the layer p p p The color and opacity of are expressed as : u i p u^p_i uip and α i p \alpha^p_i αip. therefore , The color of any pixel in the image can be mixed by multiple layers :
c p = ∑ i α i p u i p (1) c^p = \sum_i\alpha^p_iu^p_i\tag1 cp=i∑αipuip(1)
Conditions to be met 1: Guaranteed convex combination , The sum of the weights is 1: ∑ i α i p = 1 (2) \sum_i\alpha^p_i=1\tag2 i∑αip=1(2).
Conditions to be met 2: The value range of color and opacity is located in 0,1 Between : α i p , u i p ∈ [ 0 , 1 ] (3) \alpha_{i}^{p}, {u}_{i}^{p} \in[0,1]\tag3 αip,uip∈[0,1](3).
Next up , First, it introduces how to get the layer and its parameters .
Two 、 Layer decomposition
As mentioned earlier, each of the layer The color of follows a positive distribution . therefore , First of all, make sure there are several layer And estimate each layer Is a normal distribution . For the initial layer It is estimated that , This paper adopts an iterative method to automatically decide by voting layer And further estimate layer Parameters of .
1) First, put the input image in RGB The space is divided into 10 × 10 × 10 = 1000 10\times10\times10=1000 10×10×10=1000 individual bins.
2) Then calculate the gradient of the image , Can be called directly Opencv Library function cv2.Laplacian Get the gradient of all pixels .
3) Traverse all pixels , For each pixel p p p, Navigate to the bin Suppose its coordinates are b = { b r , b g , b b } b=\{b_r,b_g,b_b\} b={ br,bg,bb}, Calculate the voting weight of the pixel :
v p = e − ∥ ∇ c p ∥ ( 1 − e − r p ) (4) v^{p}=e^{-\left\|\nabla c^{p}\right\|}\left(1-e^{-r^{p}}\right)\tag4 vp=e−∥∇cp∥(1−e−rp)(4)
among , ∇ c p \nabla c^{p} ∇cp Express p p p Gradient of , r p r^{p} rp It is called in the paper representation score, Think of it as p p p The reconstruction error of , How to calculate the reconstruction error will be described later in the second part . It can be seen from the formula that :
a) Reconstruction error r p r^{p} rp The bigger it is , e − r p e^{-r^{p}} e−rp The closer the 0, thus ( 1 − e − r p ) \left(1-e^{-r^{p}}\right) (1−e−rp) The bigger it is , Lead to v p v^{p} vp The bigger it is .
b) gradient ∇ c p \nabla c^{p} ∇cp The smaller is the difference between the color of the point and the surrounding color , e − ∥ ∇ c p ∥ e^{-\left\|\nabla c^{p}\right\|} e−∥∇cp∥ The bigger it is .
Next , Accumulate the voting weight of the point to the bin: b i n s [ b r ] [ b g ] [ b b ] + = v p bins[b_r][b_g][b_b] += v^p bins[br][bg][bb]+=vp.
4) Choose the one with the greatest voting weight bin: b i n m a x = m a x ( b i n [ 0 ] [ 0 ] [ 0 ] , b i n [ 0 ] [ 0 ] [ [ 1 ] , . . . b i n [ 9 ] [ 9 ] [ 9 ] ) bin_{max} = max(bin[0][0][0],bin[0][0][[1],...bin[9][9][9]) binmax=max(bin[0][0][0],bin[0][0][[1],...bin[9][9][9])
5) stay b i n m a x bin_{max} binmax Select the seed point . Traverse b i n m a x bin_{max} binmax All pixels in , For any pixel p p p, Count it 20 × 20 20 \times 20 20×20 How many pixels in the neighborhood of also fall on b i n m a x bin_{max} binmax in , Write it down as S p S^p Sp. Last , Calculate again p p p Point score :
s c o r e p = S p e − ∥ ∇ c p ∥ (5) score_p =S^pe^{-\left\|\nabla c^{p}\right\|} \tag5 scorep=Spe−∥∇cp∥(5)
elect b i n m a x bin_{max} binmax The pixel with the highest score is used as the seed point :
s i = arg max p ∈ bin S p s c o r e p (6) s_{i}=\underset{p \in \text { bin }}{\arg \max } \mathcal{S}^{p} score_p\tag6 si=p∈ bin argmaxSpscorep(6)
6) Point to the seed s i s_i si Where 20 × 20 20 \times 20 20×20 The neighborhood performs operations similar to Gaussian filtering , Calculate the filter weight of each pixel in the neighborhood , And let the power return to one .
7) Point to the seed s i s_i si Where 20 × 20 20 \times 20 20×20 Neighborhood , Estimate the parameters of the positive distribution : Mean and covariance matrices . So we got the first one layer Corresponding normal distribution .
8) repeat 1)~ 7), Cycle stop condition : If the vast majority (99.5%) The reconstruction error of pixels has been very small : r p < τ 2 r^p < \tau^2 rp<τ2, ( τ = 5 \tau=5 τ=5), Then the algorithm stops .
The process diagram of iteratively selecting seed points in the previous paper , The person in the picture is the author of the paper !
3、 ... and 、Representation Score
Input : n n n individual layer The corresponding positive distribution .
Output : Calculate the of each pixel representation score.
For each pixel p p p, Its representation score By minimizing the following energy function :
F S = ∑ i α i p D i ( u i p ) + σ ( ∑ i α i p ∑ i ( α i p ) 2 − 1 ) (7) \mathcal{F}_{\mathcal{S}}=\sum_{i} \alpha^p_{i} \mathcal{D}_{i}\left(u^p_i\right)+\sigma\left(\frac{\sum_{i} \alpha^p_{i}}{\sum_{i} {(\alpha^p_{i}})^{2}}-1\right)\tag7 FS=i∑αipDi(uip)+σ(∑i(αip)2∑iαip−1)(7)
The energy function consists of two terms :
1) The first one is : ∑ i α i p D i ( u i p ) \sum_{i} \alpha^p_{i} \mathcal{D}_{i}\left(u^p_i\right) ∑iαipDi(uip), among D i {D}_{i} Di Represent layers L i L_i Li The corresponding positive distribution , D i ( u i p ) \mathcal{D}_{i}\left(u^p_i\right) Di(uip) Indicates the desired color u i p u^p_i uip To The distribution of the Mahalanobis distance( Markov distance ), α i p \alpha^p_i αip Represents the desired opacity . The main purpose of this item is to find the layer in p p p As far as possible, the color of the point follows the estimated positive distribution , Multiply by a factor α i p \alpha^p_i αip Is to weight these distances , α i p \alpha^p_i αip The larger those layers are more important , Try to obey the positive distribution of these layers , because p p p The color of the point , Mainly determined by these layers with large opacity .
2) The second item : σ ( ∑ i α i p ∑ i ( α i p ) 2 − 1 ) \sigma\left(\frac{\sum_{i} \alpha^p_{i}}{\sum_{i} {(\alpha^p_{i}})^{2}}-1\right) σ(∑i(αip)2∑iαip−1) Is a sparsity constraint , σ \sigma σ It's a coefficient , Generally set as 10. From this equation we can see that , When p p p Point opacity set about multiple layers α p = { α 1 p , α 2 p , α 3 p , . . . α k p } \alpha^p = \{\alpha^p_{1},\alpha^p_{2},\alpha^p_{3},...\alpha^p_{k}\} αp={ α1p,α2p,α3p,...αkp}, If only one opacity is 1, The rest are all 0 when , The minimum value will be obtained 1, So that the formula (7) Of the 2 The value of the item 0. This makes the optimized opacity set sparse ( A small number of values >0), As said at the beginning of the article , This ensures good locality of image editing .
initialization : Investigate p p p The color of the point in the input image , Calculate the best matching layer ( Mahalanobis distance is the smallest ), Assuming that L j L_j Lj, Then initialize the color and opacity respectively :
u p = { 0 , 0 , 0 , . . . u j p = 1 , . . . 0 } , α p = { 0 , 0 , 0 , . . . α j p = 1 , . . . 0 } (8) u^p = \{0,0,0,...u^p_j=1,...0\}, \alpha^p = \{0,0,0,...\alpha^p_j=1,...0\}\tag8 up={ 0,0,0,...ujp=1,...0},αp={ 0,0,0,...αjp=1,...0}(8)
Pay attention to energy minimization , At the same time, we need to consider the formula (1) Reconstruction error shown , The formula (2) The convex combination constraint shown , The formula (3) Value range constraints shown .
Four 、 Results comparison and summary
Finally, compare the decomposition results of the previous layer . You can see , The layer extracted in this paper has good sparsity , For example, the orange in the image ( On the left 1 Xing di 4 Column ) Very well extracted , and Tan The method of extracting oranges is located in ; Two layers ( Right side 1 Xing di 2 Column , Right side 2 Xing di 1 Column ).
Simple summary :
1) The method is novel , Different from the general palette decomposition algorithm , Assume that each layer follows a positive distribution ;
2) The computational complexity of the algorithm is high , It takes several iterations , It is necessary to optimize the pixel representation sore;
3) Recolor is more complex than the general palette based layer decomposition algorithm , Because this layer is not monochrome , Need to export to PS Further recolor , More trouble .
4) Personally think that , The paper is not well written , It's hard to read , Hard to understand .
边栏推荐
- 如何在 ggplot2 中向绘图中添加表格
- Win10 computer power management turns on excellence mode
- Dreamcamera2 record key prompt sound into video during video recording
- 解读Oracle
- R 语言降维的 PCA 与自动编码器
- 【论文笔记】Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours
- Is it safe to open an account in flush online? How to open a brokerage account online
- MySQL增删查改(进阶)
- Analysis and optimization of ue5 global illumination system lumen
- Oracle exercise
猜你喜欢

How to add a regression equation to a plot in R

组件与路由

Utonmos adheres to the principle of "collection and copyright" to help the high-quality development of traditional culture

Additional: brief description of hikaricp connection pool; (I didn't go deep into it, but I had a basic understanding of hikaricp connection pool)

ORB-SLAM3论文概述

Clion项目中运行多个main函数

Fresh graduates talk about their graduation stories

Oracle连接问题以及解决方案

【解决】CMake was unable to find a build program corresponding to “Unix Makefiles“.

【论文笔记】Learning to Grasp with Primitive Shaped Object Policies
随机推荐
少儿编程对国内传统学科的推进作用
【解决】联想拯救者vmware启动虚拟机蓝屏重启问题
Stm32cubemx: watchdog ------ independent watchdog and window watchdog
Dreamcamera2 video recording, playing without sound, recording function is normal, using a third-party application for video recording, playing with sound
我在中山,到哪里开户比较好?网上开户是否安全么?
Arduino string to hexadecimal number for large color serial port screen.
Modifying table names, deleting tables, obtaining table information, and deleting table records on specified dates for common MySQL statements
js array数组json去重
Une citation classique de la nature humaine que vous ne pouvez pas ignorer
限制输入字符长度length英文1个字符中文2个字符
Wealth freedom skills: commercialize yourself
Fresh graduates talk about their graduation stories
在哪里开基金帐户安全?
学习太极创客 — MQTT(五)发布、订阅和取消订阅
Scratch returns 400
Camtasia 2022 new ultra clear recording computer video
2021-08-04
Redis configuration and optimization of NoSQL
golang--channal与select
[machinetranslation] - Calculation of Bleu value