当前位置:网站首页>[code practice] [stereo matching series] Classic ad census: (5) scan line optimization
[code practice] [stereo matching series] Classic ad census: (5) scan line optimization
2022-07-05 08:53:00 【Li Yingsong~】
Happy National Day holiday, students ! Natural and unrestrained 7 God ( Bring baby 7 God ), It's hard to sit down and update your blog .
Download the complete source code , Click to enter : https://github.com/ethan-li-coding/AD-Census
Welcome to Github Discuss in the project !
Next chapter Cross domain cost aggregation , The content of this article is AD-Census Scan line optimization steps , actually , The idea of this step and SGM The code aggregation of is basically the same , But in P1/P2 Some modifications have been made to the parameter settings . exactly ,SGM Of P1、P2 Setting the policy is too simple , The advantage is high robustness , We can get a good parallax result for most data , But the obvious drawback is that it is difficult to find a particularly good combination of parameters , Make the data of specific application scenarios reach a relatively perfect state ,P1/P2 The setting of is critical to the overall parallax effect, especially the parallax at the edge , therefore AD-Census The improvement direction of is of practical significance .
We might as well take a look directly first AD-Census The result of scan line optimization :
![]() | ![]() | ![]() | ![]() |
obviously , The parallax map after scan line optimization is more complete than that after cost aggregation , Less error value . Of course, that doesn 't mean AD-Census The parameter improvement of is effective , It only shows that the scan line optimization steps are effective .
Let's take a look at the coding introduction !
List of articles
Algorithm
alike , Please see the blog for the principle of Algorithm :
classic AD-Census: (3) Scan line optimization (Scanline Optimization)
Here I won't talk about the principle of optimization , and SGM(SemiGlobalMatching) The cost aggregation strategy is exactly the same , Just read the previous blogs of bloggers ,AD-Census use 4 Scan line optimization of direction , That is, up and down, around 4 A direction .
AD-Census The changes made are P 1 P_1 P1 and P 2 P_2 P2 Setting method of value , stay SGM in , P 1 P_1 P1、 P 2 ′ P_2' P2′ Is a preset fixed value , Actually used P 2 P_2 P2 It is adjusted in real time according to the brightness difference between two adjacent pixels in the left view , The adjustment formula is P 2 = P 2 ′ / ( I p − I q ) P_2=P_2'/(I_p-I_q) P2=P2′/(Ip−Iq).
And in the Ad-Census in , P 1 P_1 P1、 P 2 P_2 P2 Not just the color difference from the adjacent pixels in the left view D 1 = D c ( p , p − r ) D_1=D_c(p,p-r) D1=Dc(p,p−r) of , The color of adjacent pixels corresponds to the color difference of adjacent pixels with the same name D 2 = D c ( p d , p d − r ) D_2=D_c(pd,pd-r) D2=Dc(pd,pd−r) of .
( notes 1:AD-Census Algorithm default input color map , So it's a color difference , If it is an input grayscale image , Is the brightness difference , The definition of color difference is D c ( p l , p ) = m a x i = R , G , B ∣ I i ( p l ) − I i ( p ) ∣ D_c(p_l,p)=max_{i=R,G,B}|I_i(p_l)-I_i(p)| Dc(pl,p)=maxi=R,G,B∣Ii(pl)−Ii(p)∣, That is, the maximum value of the difference between the three color components )
( notes 2: p d pd pd It's actually pixels p p p Through parallax d d d Found a point with the same name on the right view of q = p − d q=p-d q=p−d)
( notes 3: p − r p-r p−r Represents the last pixel in the aggregation direction , For example, aggregate from left to right , be p − r p-r p−r Namely p − 1 p-1 p−1; From right to left , be p − r p-r p−r Namely p + 1 p+1 p+1)
The specific setting rules are as follows :
- P 1 = Π 1 , P 2 = Π 2 , i f D 1 < τ S O , D 2 < τ S O P_1=Π_1,P_2=Π_2, if D_1<τ_{SO},D_2<τ_{SO} P1=Π1,P2=Π2,ifD1<τSO,D2<τSO
- P 1 = Π 1 / 4 , P 2 = Π 2 / 4 , i f D 1 < τ S O , D 2 > τ S O P_1=Π_1/4,P_2=Π_2/4, if D_1<τ_{SO},D_2>τ_{SO} P1=Π1/4,P2=Π2/4,ifD1<τSO,D2>τSO
- P 1 = Π 1 / 4 , P 2 = Π 2 / 4 , i f D 1 > τ S O , D 2 < τ S O P_1=Π_1/4,P_2=Π_2/4, if D_1>τ_{SO},D_2<τ_{SO} P1=Π1/4,P2=Π2/4,ifD1>τSO,D2<τSO
- P 1 = Π 1 / 10 , P 2 = Π 2 / 10 , i f D 1 > τ S O , D 2 > τ S O P_1=Π_1/10,P_2=Π_2/10, if D_1>τ_{SO},D_2>τ_{SO} P1=Π1/10,P2=Π2/10,ifD1>τSO,D2>τSO
Π 1 , Π 2 Π_1,Π_2 Π1,Π2 Is the set fixed threshold , τ S O τ_{SO} τSO Is the set color difference threshold .
Code implementation
Class design
Member functions
Again , We use a scanline optimizer class ScanlineOptimizer To achieve this function . Put it in the file scanline_optimizer.h/scanline_optimizer.cpp in .
/** * \brief Scan line optimizer */
class ScanlineOptimizer {
public:
ScanlineOptimizer();
~ScanlineOptimizer();
}
In the design of public member functions , The first type of interface is essential Set up the data SetData as well as Set parameters SetParam , Complete the input of the algorithm . The second is to optimize the function interface Optimize .
And the specific optimization sub steps , We put it in the private member function list , Including horizontal aggregation CostAggregateLeftRight And vertical convergence CostAggregateUpDown.
meanwhile , Algorithm needs a small function color distance calculation function ColorDist, Also in private functions .
The declaration code of all member functions is as follows :
public:
ScanlineOptimizer();
~ScanlineOptimizer();
/** * \brief Set up the data * \param img_left // Left image data , Three channels * \param img_right // Right image data , Three channels * \param cost_init // Initial cost array * \param cost_aggr // Aggregate cost array */
void SetData(const uint8* img_left, const uint8* img_right, float32* cost_init, float32* cost_aggr);
/** * \brief * \param width // The image is wide * \param height // Image height * \param min_disparity // Minimum parallax * \param max_disparity // Maximum parallax * \param p1 // p1 * \param p2 // p2 * \param tso // tso */
void SetParam(const sint32& width,const sint32& height, const sint32& min_disparity, const sint32& max_disparity, const float32& p1, const float32& p2, const sint32& tso);
/** * \brief Optimize */
void Optimize();
private:
/** * \brief Left and right path aggregation → ← * \param cost_so_src Input ,SO Previous cost data * \param cost_so_dst Output ,SO Offspring price data * \param is_forward Input , Is it positive ( The positive direction is from left to right , The reverse direction is from right to left ) */
void CostAggregateLeftRight(const float32* cost_so_src, float32* cost_so_dst, bool is_forward = true);
/** * \brief Up and down path aggregation ↓ ↑ * \param cost_so_src Input ,SO Previous cost data * \param cost_so_dst Output ,SO Offspring price data * \param is_forward Input , Is it positive ( The positive direction is from top to bottom , The opposite direction is from bottom to top ) */
void CostAggregateUpDown(const float32* cost_so_src, float32* cost_so_dst, bool is_forward = true);
/** \brief Calculate the color distance */
inline sint32 ColorDist(const ADColor& c1, const ADColor& c2) {
return std::max(abs(c1.r - c2.r), std::max(abs(c1.g - c2.g), abs(c1.b - c2.b)));
}
Write clear comments for each function , Easy to understand quickly . In addition, the function to calculate the color distance is an inline function , Declaration also defines and implements it .
Member variables
All member variables are designed to be private , Only used inside the algorithm , They are image sizes 、 Image data 、 Cost data ( initial / polymerization )、 Algorithm parameters, etc .
private:
/** \brief Image size */
sint32 width_;
sint32 height_;
/** \brief Image data */
const uint8* img_left_;
const uint8* img_right_;
/** \brief Initial cost array */
float32* cost_init_;
/** \brief Aggregate cost array */
float32* cost_aggr_;
/** \brief Minimum parallax value */
sint32 min_disparity_;
/** \brief Maximum parallax value */
sint32 max_disparity_;
/** \brief Initial p1 value */
float32 so_p1_;
/** \brief Initial p2 value */
float32 so_p2_;
/** \brief tso threshold */
sint32 so_tso_;
Class implementation
because SetData and SetParam Relatively simple , There is also very little code , So I won't introduce it , You can understand it by reading the code . Here are two sub steps of scan line optimization CostAggregateLeftRight and CostAggregateUpDown.
actually , I'm going straight to SGM The cost of aggregation code moved over , modify P 1 P_1 P1 and P 2 P_2 P2 Just calculate the value . as follows :
void ScanlineOptimizer::CostAggregateLeftRight(const float32* cost_so_src, float32* cost_so_dst, bool is_forward)
{
const auto width = width_;
const auto height = height_;
const auto min_disparity = min_disparity_;
const auto max_disparity = max_disparity_;
const auto p1 = so_p1_;
const auto p2 = so_p2_;
const auto tso = so_tso_;
assert(width > 0 && height > 0 && max_disparity > min_disparity);
// Parallax range
const sint32 disp_range = max_disparity - min_disparity;
// positive ( Left -> Right ) :is_forward = true ; direction = 1
// reverse ( Right -> Left ) :is_forward = false; direction = -1;
const sint32 direction = is_forward ? 1 : -1;
// polymerization
for (sint32 y = 0u; y < height; y++) {
// The path header is the first of each line ( tail ,dir=-1) Column pixel
auto cost_init_row = (is_forward) ? (cost_so_src + y * width * disp_range) : (cost_so_src + y * width * disp_range + (width - 1) * disp_range);
auto cost_aggr_row = (is_forward) ? (cost_so_dst + y * width * disp_range) : (cost_so_dst + y * width * disp_range + (width - 1) * disp_range);
auto img_row = (is_forward) ? (img_left_ + y * width * 3) : (img_left_ + y * width * 3 + 3 * (width - 1));
const auto img_row_r = img_right_ + y * width * 3;
sint32 x = (is_forward) ? 0 : width - 1;
// The current color value and the previous color value on the path
ADColor color(img_row[0], img_row[1], img_row[2]);
ADColor color_last = color;
// The cost array of the last pixel on the path , Two more elements to avoid boundary overflow ( One more at the beginning and one more at the end )
std::vector<float32> cost_last_path(disp_range + 2, Large_Float);
// initialization : The aggregate generation value of the first pixel is equal to the initial generation value
memcpy(cost_aggr_row, cost_init_row, disp_range * sizeof(float32));
memcpy(&cost_last_path[1], cost_aggr_row, disp_range * sizeof(float32));
cost_init_row += direction * disp_range;
cost_aggr_row += direction * disp_range;
img_row += direction * 3;
x += direction;
// The minimum generation value of the last pixel on the path
float32 mincost_last_path = Large_Float;
for (auto cost : cost_last_path) {
mincost_last_path = std::min(mincost_last_path, cost);
}
// From... In direction 2 Pixels start to aggregate in order
for (sint32 j = 0; j < width - 1; j++) {
color = ADColor(img_row[0], img_row[1], img_row[2]);
const uint8 d1 = ColorDist(color, color_last);
uint8 d2 = d1;
float32 min_cost = Large_Float;
for (sint32 d = 0; d < disp_range; d++) {
const sint32 xr = x - d;
if (xr > 0 && xr < width - 1) {
const ADColor color_r = ADColor(img_row_r[3 * xr], img_row_r[3 * xr + 1], img_row_r[3 * xr + 2]);
const ADColor color_last_r = ADColor(img_row_r[3 * (xr - direction)],
img_row_r[3 * (xr - direction) + 1],
img_row_r[3 * (xr - direction) + 2]);
d2 = ColorDist(color_r, color_last_r);
}
// Calculation P1 and P2
float32 P1(0.0f), P2(0.0f);
if (d1 < tso && d2 < tso) {
P1 = p1; P2 = p2;
}
else if (d1 < tso && d2 >= tso) {
P1 = p1 / 4; P2 = p2 / 4;
}
else if (d1 >= tso && d2 < tso) {
P1 = p1 / 4; P2 = p2 / 4;
}
else if (d1 >= tso && d2 >= tso) {
P1 = p1 / 10; P2 = p2 / 10;
}
// Lr(p,d) = C(p,d) + min( Lr(p-r,d), Lr(p-r,d-1) + P1, Lr(p-r,d+1) + P1, min(Lr(p-r))+P2 ) - min(Lr(p-r))
const float32 cost = cost_init_row[d];
const float32 l1 = cost_last_path[d + 1];
const float32 l2 = cost_last_path[d] + P1;
const float32 l3 = cost_last_path[d + 2] + P1;
const float32 l4 = mincost_last_path + P2;
float32 cost_s = cost + static_cast<float32>(std::min(std::min(l1, l2), std::min(l3, l4)));
cost_s /= 2;
cost_aggr_row[d] = cost_s;
min_cost = std::min(min_cost, cost_s);
}
// Reset the minimum generation value and cost array of the previous pixel
mincost_last_path = min_cost;
memcpy(&cost_last_path[1], cost_aggr_row, disp_range * sizeof(float32));
// Next pixel
cost_init_row += direction * disp_range;
cost_aggr_row += direction * disp_range;
img_row += direction * 3;
x += direction;
// Reassign pixel values
color_last = color;
}
}
}
If you don't know the aggregation code , You can see my previous blog :
In this article, we will focus on P1 and P2 Calculation method of :
Let's start with each pixel , Calculate the color distance between it and the previous pixel on the left view ( Color difference ) d 1 d_1 d1:
const uint8 d1 = ColorDist(color, color_last);
Then when traversing each parallax of pixels , Calculate the color distance between the corresponding pixel of the right view and its previous pixel d 2 d_2 d2.
const sint32 xr = x - d;
if (xr > 0 && xr < width - 1) {
const ADColor color_r = ADColor(img_row_r[3 * xr], img_row_r[3 * xr + 1], img_row_r[3 * xr + 2]);
const ADColor color_last_r = ADColor(img_row_r[3 * (xr - direction)],
img_row_r[3 * (xr - direction) + 1],
img_row_r[3 * (xr - direction) + 2]);
d2 = ColorDist(color_r, color_last_r);
}
Next, according to d 1 d_1 d1 and d 2 d_2 d2 Comparison with threshold , It is judged as one of four situations , Calculation P1 and P2 Value .
// Calculation P1 and P2
float32 P1(0.0f), P2(0.0f);
if (d1 < tso && d2 < tso) {
P1 = p1; P2 = p2;
}
else if (d1 < tso && d2 >= tso) {
P1 = p1 / 4; P2 = p2 / 4;
}
else if (d1 >= tso && d2 < tso) {
P1 = p1 / 4; P2 = p2 / 4;
}
else if (d1 >= tso && d2 >= tso) {
P1 = p1 / 10; P2 = p2 / 10;
}
among , Lowercase p1、p2, as well as tso Are all input algorithm parameters .
const auto p1 = so_p1_;
const auto p2 = so_p2_;
const auto tso = so_tso_;
I won't post the code in the vertical direction , Except in different directions , There is no other difference with the horizontal direction , Draw a gourd and a gourd .
Public optimization interface Optimize Inside , Just call the optimization functions in four directions in turn .
void ScanlineOptimizer::Optimize()
{
if (width_ <= 0 || height_ <= 0 ||
img_left_ == nullptr || img_right_ == nullptr ||
cost_init_ == nullptr || cost_aggr_ == nullptr) {
return;
}
// 4 Directional scan line optimization
// The first input of the module is the data after the cost aggregation in the previous step , That is to say cost_aggr_
// We optimize the four directions in order , And make use of cost_init_ And cost_aggr_ Save temporary data between times , In this way, there is no need to open up additional memory to store intermediate results
// The final output of the module is also cost_aggr_
// left to right
CostAggregateLeftRight(cost_aggr_, cost_init_, true);
// right to left
CostAggregateLeftRight(cost_init_, cost_aggr_, false);
// up to down
CostAggregateUpDown(cost_aggr_, cost_init_, true);
// down to up
CostAggregateUpDown(cost_init_, cost_aggr_, false);
}
Here's a little trick , That is, alternate use cost_aggr and cost_init, There is no need to open up an additional cost array in four directions , The whole optimization operation is completed with only two cost data .
experiment
We did three groups of experiments , One group is to optimize the scanning line only in the left and right horizontal directions , One group is to optimize the scanning line in the vertical direction , The remaining group is to optimize in four directions . Let's see the effect .
![]() | ![]() | ![]() | ![]() |
it seems , Only do horizontal or vertical optimization , The disparity map has been significantly improved , However, there will be directional fringe effect in unidirectional optimization , and 4 The optimization result of direction can eliminate this phenomenon , Reach a better state .
Last , Let's post the experimental picture at the beginning of the article :
![]() | ![]() | ![]() | ![]() |
Okay , This is the end of this article , The next article will bring you post-processing part . Thank you for watching. !
download AD-Census Complete source code , Click to enter : https://github.com/ethan-li-coding/AD-Census
Welcome to Github Discuss in the project , If you think the blogger's code quality is good , There is a star in the upper right corner ! thank !
About bloggers :
Ethan Li Li Yingsong ( You know : Li Yingsong )
Wuhan University Doctor of photogrammetry and remote sensing
Main direction Stereo matching 、 Three dimensional reconstruction
2019 Won the first prize of scientific and technological progress in surveying and mapping in ( Provincial and ministerial level )
Love 3D , Love sharing , Love open source
GitHub: https://github.com/ethan-li-coding ( welcome follow and star)
Personal wechat :
Welcome to exchange !
Pay attention to bloggers and don't get lost , thank !
Blog home page :https://ethanli.blog.csdn.net
边栏推荐
- [daily training -- Tencent selected 50] 557 Reverse word III in string
- Golang foundation - the time data inserted by golang into MySQL is inconsistent with the local time
- Solution to the problem of the 10th Programming Competition (synchronized competition) of Harbin University of technology "Colin Minglun Cup"
- Guess riddles (9)
- Basic number theory - fast power
- An enterprise information integration system
- File server migration scheme of a company
- Task failed task_ 1641530057069_ 0002_ m_ 000000
- Halcon Chinese character recognition
- Search data in geo database
猜你喜欢
The combination of deep learning model and wet experiment is expected to be used for metabolic flux analysis
资源变现小程序添加折扣充值和折扣影票插件
Business modeling of software model | stakeholders
Guess riddles (142)
Classification of plastic surgery: short in long long long
Numpy pit: after the addition of dimension (n, 1) and dimension (n,) array, the dimension becomes (n, n)
[牛客网刷题 Day4] JZ55 二叉树的深度
我从技术到产品经理的几点体会
整形的分类:short in long longlong
Redis实现高性能的全文搜索引擎---RediSearch
随机推荐
C#【必备技能篇】ConfigurationManager 类的使用(文件App.config的使用)
Basic number theory - factors
kubeadm系列-00-overview
Numpy 小坑:维度 (n, 1) 和 维度 (n, ) 数组相加运算后维度变为 (n, n)
Business modeling of software model | object modeling
Redis implements a high-performance full-text search engine -- redisearch
[matlab] matlab reads and writes Excel
优先级队列(堆)
Solutions of ordinary differential equations (2) examples
Beautiful soup parsing and extracting data
[daiy4] copy of JZ35 complex linked list
Guess riddles (7)
[牛客网刷题 Day4] JZ35 复杂链表的复制
Oracle advanced (III) detailed explanation of data dictionary
RT thread kernel quick start, kernel implementation and application development learning with notes
猜谜语啦(142)
Hello everyone, welcome to my CSDN blog!
2020 "Lenovo Cup" National College programming online Invitational Competition and the third Shanghai University of technology programming competition
AUTOSAR从入门到精通100讲(103)-dbc文件的格式以及创建详解
Pearson correlation coefficient