VINS Theory and Code Explained 4: Initialization
2022-06-10 15:23:00 【wangshuailpp】
1. Sliding-window-based purely visual monocular initialization
Before introducing purely visual initialization, let us first ask: why initialize at all, what does initialization compute, and how is it done? The reason for initializing is that a monocular-inertial tightly coupled system is highly nonlinear. A monocular camera cannot observe absolute scale in space, the IMU inevitably carries biases, the later optimization also needs the gravity vector (both magnitude and direction), and in velocity-sensitive applications such as drones accurate velocity information is needed as well. Computing these quantities effectively before entering the tightly coupled estimator is therefore of great significance for the robustness of the whole system (it can be understood by analogy with camera calibration: if the camera intrinsics are not calibrated correctly, localization is bound to be inaccurate, and the system is likely to fail). So initialization is conceptually simple: estimate the absolute scale s, the gyroscope bias bg, the accelerometer bias ba, the gravity vector g, and the velocity v at each IMU epoch. VINS emphasizes that the accelerometer bias is coupled with gravity (i.e. it is absorbed into the gravity estimate): the magnitude of gravity is much larger than the accelerometer bias, and the bias is hardly observable over the short initialization window, so its effect is ignored and it is not estimated during initialization. The importance of initialization is self-evident: it directly affects the robustness and localization accuracy of the whole tightly coupled system. Initialization also usually takes a while: VINS needs about ten seconds, while ORB_SLAM2 combined with an IMU sets aside 15 seconds to complete it. Without further ado, let's get to the point.
Purely visual initialization is described in Section V-A of the paper. First a sliding window containing a set of frames is built. As mentioned in the paper, the epipolar geometry model is used with the five-point algorithm to solve for the relative transformation of the monocular camera between two frames: a relative rotation plus a translation without scale information. In fact, essentially every monocular pipeline uses epipolar geometry during initialization to solve the relative transformation of two frames. Note that rotation is scale-invariant (it is a unit rotation and carries no scale information; think about why). The matched points are then triangulated to obtain the corresponding 3D point coordinates, and these 3D points together with the 2D observations in the other frames of the sliding window are fed to PnP, which yields the poses of all frames in the window and the 3D coordinates of the feature points. With that, visual initialization is complete. Simple, right? Of course, it is only the simple visual part; what is really involved is the visual-inertial joint initialization, the focus and difficulty of this post, so the following material deserves careful study. A minimal sketch of this two-view bootstrap is given below.
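The following is a minimal sketch of the bootstrap just described, using OpenCV on normalized-plane points (so the intrinsics passed to the solvers are identity); the function names are OpenCV's, while the thresholds and the surrounding window management are simplified assumptions.

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

// Two-view bootstrap: five-point essential matrix, pose recovery (t is
// unit-norm, i.e. scale-free), then triangulation of the matched points.
bool twoViewBootstrap(const std::vector<cv::Point2f> &pts1,
                      const std::vector<cv::Point2f> &pts2,
                      cv::Mat &R, cv::Mat &t, cv::Mat &points4d)
{
    // Five-point algorithm inside RANSAC; points are on the normalized
    // image plane, hence focal = 1 and principal point at the origin.
    cv::Mat mask;
    cv::Mat E = cv::findEssentialMat(pts1, pts2, 1.0, cv::Point2d(0, 0),
                                     cv::RANSAC, 0.999, 1.0 / 460.0, mask);
    if (E.empty())
        return false;

    // Decompose E into R, t via the cheirality check; t has unit norm.
    int inliers = cv::recoverPose(E, pts1, pts2, R, t, 1.0,
                                  cv::Point2d(0, 0), mask);
    if (inliers < 15)   // arbitrary inlier threshold for this sketch
        return false;

    // Triangulate the matches (outliers should be filtered via mask first);
    // the remaining frames of the window are then localized with PnP.
    cv::Mat P1 = cv::Mat::eye(3, 4, CV_64F);
    cv::Mat P2(3, 4, CV_64F);
    R.copyTo(P2.colRange(0, 3));
    t.copyTo(P2.col(3));
    cv::triangulatePoints(P1, P2, pts1, pts2, points4d);
    return true;
}
```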
2. Visual inertia joint initialization
Visual-inertial joint initialization is described in Section V-B of the paper. The author names this step Visual-Inertial Alignment (in the ORBSLAM2+IMU paper the corresponding step is simply called IMU initialization). Why define such a term? I think it carries two meanings. First, when calibrating the gyroscope bias, both the rotation measured by the IMU and the rotation measured by vision are needed, i.e. the visual and inertial data are aligned against each other. Second, the scale s obtained here does not belong to the IMU alone; it is the scale of the whole visual-inertial system. Before going through each step in detail, a general summary: physically, initialization is intrinsic-parameter calibration; in the mathematical model it is the solution of the matrix equation in formula (6), which in turn comes from the most primitive PVQ integration, where Q (rotation) corresponds to the gyroscope and P, V correspond to the accelerometer. If this is not clear yet, it does not matter; after reading the full derivation below it should fall into place.
(1) Gyro bias calibration
Rotation can be obtained in two ways: from the gyroscope measurements, or from visual observation. Ideally the two should agree exactly (assuming no error), but in practice each carries errors. The gyroscope error has two parts, measurement noise and gyroscope bias; the noise can be ignored for now (it is very small), and the only visual error is observation noise (also negligible). The difference between the two rotations is therefore attributed to the gyroscope bias. Stacking all consecutive rotations over the sliding window gives a least-squares model:
$$\min_{\delta b_w}\sum_{k\in\mathcal{B}}\left\|{\mathbf{q}^{c_0}_{b_{k+1}}}^{-1}\otimes \mathbf{q}^{c_0}_{b_k}\otimes \boldsymbol{\gamma}^{b_k}_{b_{k+1}}\right\|^2,\qquad \boldsymbol{\gamma}^{b_k}_{b_{k+1}}\approx \hat{\boldsymbol{\gamma}}^{b_k}_{b_{k+1}}\otimes\begin{bmatrix}1\\ \frac{1}{2}\,\mathbf{J}^{\gamma}_{b_w}\,\delta b_w\end{bmatrix}\tag{15}$$
In the first expression of formula (15), the generalized quaternion product of the first two terms gives the camera-derived relative rotation from bk to bk+1 (expressed in the bk+1 frame), while the third term is the gyroscope-derived relative rotation from bk+1 to bk (expressed in the bk frame). Multiplying them composes a rotation from bk to bk+1 with one from bk+1 to bk, which should yield the identity; the deviation from identity is the residual. The second expression is the first-order linear approximation of the pre-integrated rotation with respect to the gyroscope bias. The result is a linear least-squares problem, which can be solved via the normal equations, SVD decomposition, or similar methods. Note that once the gyroscope bias has been obtained, it must be substituted back into the pre-integration and the pre-integrated values re-propagated, which makes them more accurate.
(2) Velocity, gravity, and scale initialization
Here the author stacks the three state quantities into a single state vector, as shown in formula (16):
$$\mathcal{X}_I=\left[\mathbf{v}^{b_0}_{b_0},\,\mathbf{v}^{b_1}_{b_1},\,\cdots,\,\mathbf{v}^{b_n}_{b_n},\,\mathbf{g}^{c_0},\,s\right]\tag{16}$$
Each velocity is expressed in its own body frame bk, and gravity is expressed in the initial camera frame c0. As mentioned earlier, these quantities are obtained from the mathematical model of P and V. Consider two consecutive keyframes bk and bk+1 in the sliding window; formulas (17) and (19) in the paper are derived as follows:
$$\hat{\boldsymbol{\alpha}}^{b_k}_{b_{k+1}}=\mathbf{R}^{b_k}_{c_0}\Big(s\big(\bar{\mathbf{p}}^{c_0}_{b_{k+1}}-\bar{\mathbf{p}}^{c_0}_{b_k}\big)+\frac{1}{2}\mathbf{g}^{c_0}\Delta t_k^2-\mathbf{R}^{c_0}_{b_k}\mathbf{v}^{b_k}_{b_k}\Delta t_k\Big),\qquad \hat{\boldsymbol{\beta}}^{b_k}_{b_{k+1}}=\mathbf{R}^{b_k}_{c_0}\Big(\mathbf{R}^{c_0}_{b_{k+1}}\mathbf{v}^{b_{k+1}}_{b_{k+1}}+\mathbf{g}^{c_0}\Delta t_k-\mathbf{R}^{c_0}_{b_k}\mathbf{v}^{b_k}_{b_k}\Big)\tag{17}$$

Rearranging these into the linear measurement form $\mathbf{H}^{b_k}_{b_{k+1}}\mathcal{X}_I=\hat{\mathbf{z}}^{b_k}_{b_{k+1}}$ (formula (18)) and stacking over all frame pairs gives the least-squares problem

$$\min_{\mathcal{X}_I}\sum_{k\in\mathcal{B}}\left\|\hat{\mathbf{z}}^{b_k}_{b_{k+1}}-\mathbf{H}^{b_k}_{b_{k+1}}\mathcal{X}_I\right\|^2\tag{19}$$
After the derivation we obtain formulas (17), (18) and (19) of the paper. Let us focus on why the derivation takes this form and what the resulting equation looks like. The whole rearrangement is dictated by how the state vector is defined: the final equation must have the state vector isolated on one side, with everything else known (coming from IMU pre-integration and visual tracking), so the equations are transformed until that relationship holds. Looking at the final form, the state vector has dimension (n+1)*3+3+1, and each pair of consecutive frames contributes a measurement over the 3+3+3+1 variables (v_bk in frame bk, v_bk+1 in frame bk+1, g_c0, s). Forming the normal equations from these stacked blocks yields a symmetric positive definite matrix, so the system can be solved with a fast Cholesky (LDLT) decomposition.
(3) Gravity refinement
In fact, the magnitude and direction of gravity have already been obtained above, so why refine gravity further? The reason is simple: the gravity computed above has absorbed the accelerometer bias. Even though we chose not to estimate that bias, gravity itself is still worth refining. Optimizing gravity involves two quantities, magnitude and direction, i.e. three degrees of freedom; but the magnitude is generally known in advance (set here to 9.8), so what actually needs optimizing is the direction, a two-degree-of-freedom quantity. The refinement therefore perturbs gravity within its tangent plane, spanned by two unit vectors b1 and b2; the figure below shows the model, and a sketch of the tangent-basis construction follows it.
(Figure: gravity is constrained to a sphere of known magnitude; its 2-DoF correction lives in the tangent plane spanned by the unit vectors b1 and b2, i.e. g ≈ 9.8 · ĝ + w1·b1 + w2·b2.)
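To make the tangent-plane parameterization concrete, here is a minimal Eigen sketch of constructing the basis b1, b2. It is modeled on the TangentBasis() helper in the VINS code, but the function name, the degeneracy check, and the comments here are illustrative.

```cpp
#include <Eigen/Dense>

// Build two unit vectors b1, b2 spanning the plane orthogonal to g0, so
// that only the 2-DoF direction of gravity is refined while its magnitude
// stays fixed.
Eigen::Matrix<double, 3, 2> tangentBasis(const Eigen::Vector3d &g0)
{
    Eigen::Vector3d a = g0.normalized();
    Eigen::Vector3d tmp(0, 0, 1);
    if ((a - tmp).norm() < 1e-6)
        tmp << 1, 0, 0;                       // avoid a degenerate projection
    // b1: component of tmp orthogonal to a, normalized.
    Eigen::Vector3d b1 = (tmp - a * a.dot(tmp)).normalized();
    // b2: completes the right-handed orthonormal triple {a, b1, b2}.
    Eigen::Vector3d b2 = a.cross(b1);
    Eigen::Matrix<double, 3, 2> bc;
    bc.col(0) = b1;
    bc.col(1) = b2;
    return bc;
}

// During refinement gravity is re-parameterized as
//   g = 9.8 * g0.normalized() + w1 * b1 + w2 * b2,
// and only (w1, w2) enter the linear system, which is re-solved a few times.
```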
Code details:
Note: this continues from the previous post on IMU pre-integration.
step 2: Processing image feature data
The data arriving here is not raw image data but the normalized-plane coordinates of features that have already been tracked and matched. The features of the current frame are stored in image, which maps each feature id to its observations; within each observation the first element is the camera id (starting from 0) and the second holds the feature's normalized point coordinates. The data then goes directly into the image-feature processing routine estimator.processImage(image, img_msg->header).
1) Estimator::processImage(const map<int, vector<pair<int, Vector3d>>> &image, const std_msgs::Header &header), the method that processes image feature data.
First, the incoming frame is judged by its parallax to be a keyframe or not, which selects the marginalization strategy: either discard the second-newest frame (but keep its IMU data) or marginalize the oldest frame in the sliding window.
step 1: Save the image data and timestamp into an image frame object: the feature data and time are first stored in an ImageFrame object imageframe (an ImageFrame holds the feature points, the timestamp, the pose R, t, a pre-integration object pre_integration, and a keyframe flag), together with the current temporary pre-integration (whose initial value was just computed in the IMU pre-integration step). The imageframe is then inserted into all_image_frame (the container of all image frames), and the temporary pre-integration is re-initialized. A sketch of the ImageFrame bookkeeping follows.
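For reference, here is a rough sketch of the ImageFrame record just described; the field names follow the VINS code, but treat the exact types and defaults as an approximation.

```cpp
#include <map>
#include <utility>
#include <vector>
#include <Eigen/Dense>

class IntegrationBase;   // the IMU pre-integration object, defined elsewhere

class ImageFrame
{
public:
    ImageFrame() = default;
    ImageFrame(const std::map<int, std::vector<std::pair<int, Eigen::Vector3d>>> &_points,
               double _t)
        : points(_points), t(_t) {}

    // feature id -> observations of (camera id, normalized-plane coordinates)
    std::map<int, std::vector<std::pair<int, Eigen::Vector3d>>> points;
    double t = 0.0;                              // timestamp
    Eigen::Matrix3d R;                           // pose: rotation
    Eigen::Vector3d T;                           // pose: translation
    IntegrationBase *pre_integration = nullptr;  // pre-integration ending at this frame
    bool is_key_frame = false;
};

// The estimator keeps every frame (keyframe or not) keyed by timestamp:
//   std::map<double, ImageFrame> all_image_frame;
```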
step 2: Calibrate the camera-IMU extrinsic parameters: if no extrinsics are available, they are calibrated here; if they were passed in as parameters, this step is skipped (the default; with your own device, the parameter can be set to 2 for online extrinsic calibration).
step 3: Initialize the system together with BA optimization: when the solver is in the initializable state (the initial state; after a successful initialization it is switched to the non-linear, i.e. initialized, state), check whether frame_count has reached WINDOW_SIZE, which ensures that enough frames participate in initialization. Here frame_count is the number of image frames in the sliding window, initialized to 0, and the window holds 10 frames in total. If the extrinsics are available and the current frame's timestamp is more than 0.1 s past the last initialization attempt, the initialization is carried out.
step 3.1: initialStructure(), the system initialization: first the vision-only SFM, then the Visual-Inertial Alignment; together these constitute the whole initialization process.
① Ensure the IMU is sufficiently excited, judged via the linear acceleration: the standard deviation (the degree of dispersion) of the mean linear acceleration over the window is computed, and a standard deviation greater than 0.25 means the IMU is sufficiently excited for initialization, as in the sketch below.
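A minimal sketch of this excitation check, assuming the pre-integration object exposes sum_dt and delta_v as in the VINS code; the function name and structure here are illustrative.

```cpp
#include <cmath>
#include <iterator>
#include <map>
#include <Eigen/Dense>

// Standard deviation of the mean linear acceleration (delta_v / dt) over
// all pre-integrated intervals in the window; too small means the IMU was
// not excited enough for scale and gravity to be observable.
bool imuWellExcited(const std::map<double, ImageFrame> &all_image_frame)
{
    Eigen::Vector3d sum_g = Eigen::Vector3d::Zero();
    int n = static_cast<int>(all_image_frame.size()) - 1;
    for (auto it = std::next(all_image_frame.begin());
         it != all_image_frame.end(); ++it)
        sum_g += it->second.pre_integration->delta_v
               / it->second.pre_integration->sum_dt;
    Eigen::Vector3d aver_g = sum_g / n;

    double var = 0.0;
    for (auto it = std::next(all_image_frame.begin());
         it != all_image_frame.end(); ++it)
    {
        Eigen::Vector3d tmp_g = it->second.pre_integration->delta_v
                              / it->second.pre_integration->sum_dt;
        var += (tmp_g - aver_g).squaredNorm();
    }
    return std::sqrt(var / n) > 0.25;   // threshold quoted in the text
}
```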
② Purely visual initialization: solve the SFM problem over the image frames and camera poses in the sliding window; what is solved here are the keyframe poses and the feature point coordinates.
step 1. First build the SFMFeature container sfm_f. Each SFMFeature record holds the feature's status (whether it has been triangulated), its id, its 2D observations, its 3D coordinates, and its depth. The feature information kept by the feature manager is copied into sfm_f via sfm_f.push_back(tmp_feature). The record looks roughly as follows.
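For reference, the shape of the per-feature record consumed by the global SFM; the field names follow the VINS code, but treat this as a sketch rather than the exact definition.

```cpp
#include <utility>
#include <vector>
#include <Eigen/Dense>

struct SFMFeature
{
    bool state;          // has this feature been triangulated yet?
    int id;              // feature id
    // (frame index, observed 2D point on the normalized image plane)
    std::vector<std::pair<int, Eigen::Vector2d>> observation;
    double position[3];  // triangulated 3D coordinates
    double depth;        // depth in the reference frame
};
```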
step 2. Then R, t are recovered from the F matrix; the main call is relativePose(relative_R, relative_T, l). relativePose first obtains, through FeatureManager, the feature matches corres between the i-th frame in the sliding window and the newest frame. When corres contains enough matches, it checks that this candidate keyframe and the newest frame in the sliding window have both enough matched features and large enough parallax (the selected index is l = i). Once both conditions are met, the relative R, t between these two frames can be recovered and the 3D feature points triangulated. This uses solveRelativeRT(corres, relative_R, relative_T), defined in solve_5pts.cpp, which recovers R, t from the F matrix by directly calling the OpenCV routines, so there is little to add there. The interesting point is that the pose obtained by relativePose is that of frame l: frame l is chosen by scanning from the first frame of the window for the first frame with sufficient average parallax against the newest frame, and it serves as the reference frame for the global SFM below.

step 3. At this point we have the 2D feature coordinates and the rotation and translation between frame l and the newest frame (note: no 3D feature coordinates yet). With this information the global SFM object GlobalSFM sfm can be built and called: sfm.construct(frame_count + 1, Q, T, l, relative_R, relative_T, sfm_f, sfm_tracked_points), with frame l as the reference frame. Before each PnP solve it checks that the current frame index is greater than l, which guarantees that frame l itself skips the PnP step; frame l is first triangulated against the newest frame to obtain the co-visible feature points, which are then used for PnP on frame l+1 against the newest frame, solving for R_initial, P_initial, with the pose saved in Pose. Looping in this way, and triangulating each new frame against the newest one, yields the poses of frames l+1, l+2, ..., n-1 and the feature points they triangulate.

step 4. Next, the 3D coordinates between frame l and each frame i (frames between l and the newest frame) are triangulated. (It is not obvious why this is done a second time; presumably to triangulate more feature points.)

step 5. Then PnP solves the poses of frames l-1, l-2, ..., 0 against frame l using the already-triangulated feature coordinates, and finally all remaining feature points are triangulated. At this point the poses of all cameras in the sliding window and the 3D coordinates of the feature points are available; a condensed sketch of the forward PnP/triangulation pass is given after this walkthrough.

step 6. Finally run BA: the poses and feature points are optimized with ceres; for the details you can refer to Lecture 10 of "14 Lectures on Visual SLAM", which uses the same optimization method.
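To make steps 3 to 5 concrete, here is a condensed sketch of the forward pass: each frame after l is localized by PnP against the already-triangulated points and then used to triangulate more features. collectCorrespondences() and triangulateTwoFrames() are hypothetical stand-ins for the bookkeeping inside GlobalSFM::construct().

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

// Hypothetical helpers standing in for GlobalSFM internals:
//   collectCorrespondences: gather triangulated 3D points seen in frame i.
//   triangulateTwoFrames:   triangulate new features between two frames.
void collectCorrespondences(const std::vector<SFMFeature> &sfm_f, int i,
                            std::vector<cv::Point3f> &pts3d,
                            std::vector<cv::Point2f> &pts2d);
void triangulateTwoFrames(int i, int j, std::vector<SFMFeature> &sfm_f);

bool sfmForwardPass(int l, int frame_num, std::vector<SFMFeature> &sfm_f,
                    std::vector<cv::Mat> &rvecs, std::vector<cv::Mat> &tvecs)
{
    for (int i = l + 1; i < frame_num - 1; i++)
    {
        // Seed PnP with the previous frame's pose, as the real code does.
        cv::Mat rvec = rvecs[i - 1].clone(), tvec = tvecs[i - 1].clone();

        std::vector<cv::Point3f> pts3d;
        std::vector<cv::Point2f> pts2d;
        collectCorrespondences(sfm_f, i, pts3d, pts2d);

        // Normalized-plane observations: identity intrinsics, no distortion.
        cv::Mat K = cv::Mat::eye(3, 3, CV_64F);
        if (!cv::solvePnP(pts3d, pts2d, K, cv::Mat(), rvec, tvec, true))
            return false;            // a failed PnP aborts initialization
        rvecs[i] = rvec;
        tvecs[i] = tvec;

        // Grow the map: triangulate frame i against the newest frame.
        triangulateTwoFrames(i, frame_num - 1, sfm_f);
    }
    return true;
}
```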
step 4: visualInitialAlign calls the VisualIMUAlignment method, the real visual-inertial joint initialization: the IMU is aligned with vision to obtain the absolute scale and the other quantities. This method is defined in initial/initial_alignment.h.

step 4.1: solveGyroscopeBias computes the gyroscope bias. The computation model is the one given in the paper, and the least-squares problem is solved with the LDLT method: delta_bg = A.ldlt().solve(b), where A += tmp_A.transpose() * tmp_A and b += tmp_A.transpose() * tmp_b. In other words it solves the normal equations A^T A x = A^T b. A generic least-squares problem works directly with Ax = b, i.e. Ax - b = 0; here, multiplying both sides by the transpose makes A^T A symmetric positive definite and hence invertible, so both sides can be multiplied by its inverse; see the notes on the LDLT method for the details. Once the gyroscope bias is obtained, its value is saved into the previously defined Bgs[], and finally the pre-integration is re-propagated. A sketch follows.
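A sketch of that normal-equation solve. The state-order constants O_R, O_BG and the jacobian and delta_q members follow the VINS pre-integration class; treat the snippet as illustrative rather than the exact source.

```cpp
#include <iterator>
#include <map>
#include <Eigen/Dense>

// State-order offsets as defined in the VINS code (parameters.h).
enum StateOrder { O_P = 0, O_R = 3, O_V = 6, O_BA = 9, O_BG = 12 };

Eigen::Vector3d solveGyroBias(std::map<double, ImageFrame> &all_image_frame)
{
    Eigen::Matrix3d A = Eigen::Matrix3d::Zero();
    Eigen::Vector3d b = Eigen::Vector3d::Zero();
    for (auto it = all_image_frame.begin();
         std::next(it) != all_image_frame.end(); ++it)
    {
        auto jt = std::next(it);
        // Relative rotation between the two body frames, from vision.
        Eigen::Quaterniond q_ij(it->second.R.transpose() * jt->second.R);
        // Jacobian of the pre-integrated rotation w.r.t. the gyro bias.
        Eigen::Matrix3d tmp_A =
            jt->second.pre_integration->jacobian.block<3, 3>(O_R, O_BG);
        // Residual: vector part of (gamma^-1 * q_ij), doubled (small angle).
        Eigen::Vector3d tmp_b =
            2.0 * (jt->second.pre_integration->delta_q.inverse() * q_ij).vec();
        A += tmp_A.transpose() * tmp_A;   // normal equations: A = sum J^T J
        b += tmp_A.transpose() * tmp_b;   //                   b = sum J^T r
    }
    // A is symmetric positive definite, so LDLT applies.
    return A.ldlt().solve(b);  // delta_bg; then re-propagate pre-integration
}
```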
step 4.2: LinearAlignment computes the scale, gravity, and velocities. The formula given in the paper is the model between two adjacent frames; it is mapped onto the whole set of n+1 velocities. The matrix A is then positive definite (real symmetric), since it is assembled from normal-equation blocks. In the code, A and b are the global H and b of the usual normal equations, while tmp_A and tmp_b are the temporary per-pair blocks. The final solve is x = A.ldlt().solve(b); afterwards RefineGravity re-estimates the direction of gravity to obtain the best solution. A sketch of the assembly pattern follows.
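A sketch of the assembly pattern: each frame pair contributes a 6x10 measurement block over [v_i, v_{i+1}, g, s], whose normal-equation form is scattered into the global system of size (n+1)*3+3+1. fillPairBlock() is a hypothetical helper standing in for the per-pair construction from formula (18).

```cpp
#include <Eigen/Dense>

// Hypothetical helper: fill the 6x10 block (tmp_A) and 6x1 residual (tmp_b)
// for the frame pair (i, i+1) from formula (18).
void fillPairBlock(int i, Eigen::MatrixXd &tmp_A, Eigen::VectorXd &tmp_b);

Eigen::VectorXd solveScaleGravityVelocity(int all_frame_count)
{
    const int n_state = all_frame_count * 3 + 3 + 1;
    Eigen::MatrixXd A = Eigen::MatrixXd::Zero(n_state, n_state);
    Eigen::VectorXd b = Eigen::VectorXd::Zero(n_state);

    for (int i = 0; i < all_frame_count - 1; i++)
    {
        Eigen::MatrixXd tmp_A = Eigen::MatrixXd::Zero(6, 10);
        Eigen::VectorXd tmp_b = Eigen::VectorXd::Zero(6);
        fillPairBlock(i, tmp_A, tmp_b);

        Eigen::MatrixXd r_A = tmp_A.transpose() * tmp_A;   // 10x10
        Eigen::VectorXd r_b = tmp_A.transpose() * tmp_b;   // 10x1

        // Scatter the pair's contribution into the global normal equations:
        // velocities sit at offset i*3, gravity and scale in the last 4 slots.
        A.block<6, 6>(i * 3, i * 3)       += r_A.topLeftCorner<6, 6>();
        b.segment<6>(i * 3)               += r_b.head<6>();
        A.bottomRightCorner<4, 4>()       += r_A.bottomRightCorner<4, 4>();
        b.tail<4>()                       += r_b.tail<4>();
        A.block<6, 4>(i * 3, n_state - 4) += r_A.topRightCorner<6, 4>();
        A.block<4, 6>(n_state - 4, i * 3) += r_A.bottomLeftCorner<4, 6>();
    }
    // A is a sum of J^T J blocks: symmetric positive definite, so LDLT works.
    // (The real code additionally rescales A and b for numerical conditioning.)
    return A.ldlt().solve(b);   // x = [v_0 ... v_n, g, s]
}
```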