Detailed derivation of perspective projection transformation and related applications

2022-06-10 17:06:00 【Hua Weiyun】

Because the GAMES101 and LearnOpenGL courses cover the perspective projection transform, the viewport transform, and the normal transform only briefly, this article pulls together material from many sources and derives the relevant formulas by hand. It is, arguably, one of the most detailed and rigorous derivations available online.

The process of projecting a model onto the screen is as follows ：

1. Derivation of the perspective projection

1.1 Derivation of the orthographic projection matrix

Orthographic projection translates the center of the view volume to the origin and then scales the volume to the canonical cube $[-1,1]^3$.

Orthographic projection is used relatively rarely on its own; it is derived here in preparation for the perspective projection that follows.

1.1.1 Derivation in a purely right-handed coordinate system

Notes:

Everything here is in a right-handed coordinate system, i.e. the +z axis always points out of the screen.
Here n and f are coordinate values on the negative half of the z axis, so both are negative.
For convenience we call this the standard right-handed coordinate system; GAMES101 derives its matrices under this convention.

The translation matrix is:

$T= \begin{bmatrix} 1 & 0 & 0 & -\frac{r+l}{2} \\ 0 & 1 & 0 & -\frac{t+b}{2} \\ 0 & 0 & 1 & -\frac{f+n}{2} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{ Translation transformation matrix }$

The scaling matrix (note that the target cube has side length 2) is:

$S= \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & 0 \\ 0 & \frac{2}{t - b} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{ Scaling transformation matrix }$

Therefore the orthographic projection matrix is $M_{ortho} = S*T$:

$M_{ortho} = \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t - b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{2}{n - f} & -\frac{f+n}{n - f} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{ Standard right hand orthogonal projection matrix }$

If the view volume is symmetric both vertically and horizontally (r=-l, t=-b, so r+l=0, t+b=0), the matrix $M_{ortho}$ simplifies to:

$M_{ortho} = \begin{bmatrix} \frac{1}{r} & 0 & 0 & 0 \\ 0 & \frac{1}{t} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & -\frac{f+n}{n - f} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{ Simplified right hand orthogonal projection matrix }$

1.1.2 Derivation of the orthographic projection in OpenGL

In OpenGL, pay particular attention to the following:

Eye (view) space is right-handed, whereas in NDC the +z axis points into the screen, which makes NDC left-handed. The scaling therefore no longer maps n to 1 and f to -1; instead the far plane maps to 1 and the near plane to -1, i.e. [f, n] is mapped to [1, -1] in NDC.
In OpenGL, n and f are defined as the distances to the near plane and the far plane, both positive, so a minus sign must be added before substituting them into the matrix to obtain the actual coordinate values.
For convenience we refer to this setup as the OpenGL coordinate system.

The translation matrix now becomes:

$T= \begin{bmatrix} 1 & 0 & 0 & -\frac{r+l}{2} \\ 0 & 1 & 0 & -\frac{t+b}{2} \\ 0 & 0 & 1 & \frac{f+n}{2} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{OpenGL Translation transformation matrix }$

The scaling matrix becomes the following. Note that the z scale is flipped here, from $\frac{2}{(-n)-(-f)}$ to $-\frac{2}{(-n)-(-f)} = \frac{2}{n-f}$:

$S= \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & 0 \\ 0 & \frac{2}{t - b} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{OpenGL Scaling transformation matrix }$

More mathematically, this can be understood as flipping the z axis once, which is equivalent to left-multiplying by a flip matrix during the transformation: $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

Therefore the OpenGL orthographic projection matrix is $M_{ortho} = S*T$:

$M_{ortho} = \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t - b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{2}{n - f} & \frac{f+n}{n - f} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{OpenGL Orthogonal projection matrix }$

For a symmetric view volume, the OpenGL orthographic projection matrix likewise simplifies to:

$M_{ortho} = \begin{bmatrix} \frac{1}{r} & 0 & 0 & 0 \\ 0 & \frac{1}{t} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & \frac{f+n}{n - f} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{ simplify OpenGL Orthogonal projection matrix }$

1.2 Derivation of the perspective projection matrix

Perspective projection takes two steps:

First, "squash" the view frustum into a cuboid (n->n, f->f) using $M_{persp->ortho}$;
Note: this derivation is carried out entirely in a right-handed coordinate system, so coordinate-system conversion is no longer a concern, but there are still two variants depending on whether n and f are taken as positive or negative.
Then apply the orthographic projection matrix above to obtain the final result.

Here we mainly treat the case where n and f are positive and -n, -f are substituted as the coordinates. The process is as follows:

When squashing the frustum into a cuboid, we impose three principles:

The coordinates of all points on the near plane remain unchanged.
The z value of all points on the far plane remains unchanged at -f.
The center point of the far plane remains unchanged at (0,0,-f).

The figure below shows a point $(x_e, y_e, z_e)$ in eye space being projected to $(x_p, y_p, z_p)$ on the near plane.

Looking at the top view of the frustum, $x_e$ in eye space maps to $x_p$, and by similar triangles:

$\frac{x_p}{x_e} = \frac{-n}{z_e}, x_p = \frac{n \cdot x_e}{-z_e}$

Similarly, from the side view of the frustum:

$\frac{y_p}{y_e} = \frac{-n}{z_e} , y_p = \frac{n \cdot y_e}{-z_e}$

So for any point $(x_e, y_e, z_e, 1)$ in the view frustum, its coordinates after the frustum is squashed into a cuboid should be $(\frac{n*x_e}{-z_e}, \frac{n*y_e}{-z_e}, unknown,1)$. That is, we need a matrix $M_{persp->ortho}$ for which the following holds:

$\begin{bmatrix} \frac{n*x_e}{-z_e} \\ \frac{n*y_e}{-z_e}\\ unknown \\ 1 \end{bmatrix} = M_{persp->ortho} * \begin{bmatrix} x_{e} \\ y_{e} \\ z_{e} \\ 1 \end{bmatrix}$

Suppose the first row of $M_{persp->ortho}$ is A, B, C, D. This gives the equation $Ax_e+By_e+Cz_e+D = \frac{n*x_e}{-z_e}$, which is awkward to solve: setting A = $\frac{n}{-z_e}$ and the other entries to 0 would produce the right result, but the entries of the matrix must be constants, whereas $\frac{n}{-z_e}$ is a variable.

The second row runs into the same problem when solving for $y_e$.

We therefore use a property of homogeneous coordinates: $(x,y,z,1)$ and $(kx,ky,kz,k)$ with $k \neq 0$ represent exactly the same point. Multiplying the left-hand side of the equation above by $-z_e$, it can be rewritten as:

$\begin{bmatrix} {nx_e} \\ {ny_e} \\ unknown \\ -z_e \end{bmatrix} = M_{persp->ortho} * \begin{bmatrix} x_{e} \\ y_{e} \\ z_{e} \\ 1 \end{bmatrix}$

From this it is easy to read off:

First row of $M_{persp->ortho}$: Ax+By+Cz+D = nx, which gives A=n, B=C=D=0

Second row: Ex+Fy+Gz+H = ny, which gives F=n, E=G=H=0

Fourth row: Mx+Ny+Oz+P = -z, which gives O=-1, M=N=P=0

At this point only the third row is still unknown, so we return to the three principles stated above:

First principle: the coordinates of all points on the near plane remain unchanged.
That is, $(x_e,y_e,-n,1)$ transformed by $M_{persp->ortho}$ must still equal $(x_e,y_e,-n,1)$. Substituting into the equation gives:

$\begin{bmatrix} {x_e} \\ {y_e} \\ -n \\ 1 \end{bmatrix} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ ? & ? & ? & ? \\ 0 & 0 & -1 & 0 \end{bmatrix} * \begin{bmatrix} x_{e} \\ y_{e} \\ -n \\ 1 \end{bmatrix}$

Writing out the equations for rows 1, 2 and 4:

$n x_e + 0 \cdot y_e + 0 \cdot (-n) + 0 \cdot 1 = x_e$

$0 \cdot x_e + n y_e + 0 \cdot (-n) + 0 \cdot 1 = y_e$

$0 \cdot x_e + 0 \cdot y_e + (-1) \cdot (-n) + 0 \cdot 1 = 1$

Here n should be an arbitrary constant, yet these three equations hold only when n equals 1, which is not a reasonable solution.

So we use the property of homogeneous coordinates once more and multiply the left-hand side by n, obtaining:

$\begin{bmatrix} n{x_e} \\ n{y_e} \\ -n^2 \\ n \end{bmatrix} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ ? & ? & ? & ? \\ 0 & 0 & -1 & 0 \end{bmatrix} * \begin{bmatrix} x_{e} \\ y_{e} \\ -n \\ 1 \end{bmatrix}$

Writing out the equations for rows 1, 2 and 4 again:

$n x_e + 0 \cdot y_e + 0 \cdot (-n) + 0 \cdot 1 = n x_e$

$0 \cdot x_e + n y_e + 0 \cdot (-n) + 0 \cdot 1 = n y_e$

$0 \cdot x_e + 0 \cdot y_e + (-1) \cdot (-n) + 0 \cdot 1 = n$

Both sides of all three equations now agree, with no restriction on n. Let the four entries of the third row be A, B, C, D; then $A x_e + B y_e - C n + D = -n^2$.

The right-hand side does not depend on $x_e$ or $y_e$, so A=0, B=0, and:

$C*n-D = n^2 \tag{1}$

Third principle: the center point of the far plane remains unchanged at (0,0,-f).

As before, to keep rows 1, 2 and 4 of the matrix valid, we use the property of homogeneous coordinates and write (0,0,-f,1) as (0,0,-f*f,f), which gives the following equation:

$\begin{bmatrix} 0 \\ 0 \\ -f^2 \\ f \end{bmatrix} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & C & D \\ 0 & 0 & -1 & 0 \end{bmatrix} * \begin{bmatrix} 0 \\ 0 \\ -f \\ 1 \end{bmatrix}$

$C*f-D = f^2 \tag{2}$

Solving (1) and (2) simultaneously gives:

C = n + f

D = nf
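For completeness, the elimination can be written out as a worked step, using only equations (1) and (2) and assuming $n \neq f$:

$\begin{aligned} (1)-(2): \quad & C(n-f) = n^2 - f^2 = (n-f)(n+f) \Rightarrow C = n+f \\ \text{from } (1): \quad & D = C*n - n^2 = (n+f)n - n^2 = nf \end{aligned}$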

With this, the matrix $M_{persp->ortho}$ is fully determined:

$M_{persp->ortho} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & nf \\ 0 & 0 & -1 & 0 \end{bmatrix} \tag{n, f positive}$

Note: if n and f are instead taken as negative values from the start, then $x_p = \frac{n \cdot x_e}{z_e}, y_p = \frac{n \cdot y_e}{z_e}$. The rest of the derivation proceeds exactly as above, and the final result is:

$M_{persp->ortho} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & -nf \\ 0 & 0 & 1 & 0 \end{bmatrix} \tag{n, f negative}$

The final perspective projection matrix is then $M_{persp}=M_{ortho} * M_{persp->ortho}$, using the OpenGL orthographic matrix from 1.1.2:

$M_{persp}= \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{2}{n - f} & \frac{f+n}{n - f} \\ 0 & 0 & 0 & 1 \end{bmatrix} * \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & nf \\ 0 & 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{2n}{r-l} & 0 & \frac{r+l}{r-l} & 0 \\ 0 & \frac{2n}{t-b} & \frac{t+b}{t-b} & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2fn}{f-n} \\ 0 & 0 & -1 & 0 \end{bmatrix} \tag{OpenGL Perspective projection matrix }$

$M_{persp}= \begin{bmatrix} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2fn}{f-n} \\ 0 & 0 & -1 & 0 \end{bmatrix} \tag{ simplify OpenGL Perspective projection matrix }$

Checking this against the projection code in the GLM library shows that the source is fully consistent with the result derived here. The excerpt below takes left, right, bottom, top, nearVal and farVal as parameters, i.e. it is the general-frustum form (`glm::frustum`); `glm::perspective` corresponds to the symmetric FOV form derived in 1.5.

// glm/gtc/matrix_transform.inl:222
// Storage is column-major: Result[col][row]
Result[0][0] = (static_cast<T>(2) * nearVal) / (right - left);
Result[1][1] = (static_cast<T>(2) * nearVal) / (top - bottom);
Result[2][0] = (right + left) / (right - left);
Result[2][1] = (top + bottom) / (top - bottom);
Result[2][2] = -(farVal + nearVal) / (farVal - nearVal);
Result[2][3] = static_cast<T>(-1);
Result[3][2] = -(static_cast<T>(2) * farVal * nearVal) / (farVal - nearVal);

1.3 Perspective division and NDC space

Notation:

$x_c, y_c, z_c, w_c$ denote clip space coordinates
$x_e, y_e, z_e, w_e$ denote eye (view) space coordinates
$x_n, y_n, z_n$ denote NDC space coordinates

The relationship between clip space and eye space coordinates is:

$\begin{bmatrix} x_{c} \\ y_{c} \\ z_{c} \\ w_c \end{bmatrix} = M_{persp} * \begin{bmatrix} x_{e} \\ y_{e} \\ z_{e} \\ w_{e} \end{bmatrix}$

By the definition of homogeneous coordinates, (x,y,z,w) and (x/w,y/w,z/w,1) are equivalent. Dividing (x,y,z,w) by w is the step known as perspective division (perspective divide).

$\begin{bmatrix} x_{n} \\ y_{n} \\ z_{n} \end{bmatrix} = \begin{bmatrix} x_{c} / w_{c} \\ y_{c} / w_{c} \\ z_{c} / w_{c} \end{bmatrix}$

As derived above, the perspective projection compresses the view frustum into the canonical cube, so for a vertex inside the frustum, (x/w,y/w,z/w,1) lies in the range (with w > 0):

$\begin{cases} -1 < x/w < 1 \Rightarrow -w < x < w, \\ -1 < y/w < 1 \Rightarrow -w < y < w, \\ -1 < z/w < 1 \Rightarrow -w < z < w \end{cases}$

So any clip-space vertex (x,y,z,w) that does not satisfy (-w<x<w && -w<y<w && -w<z<w) lies outside the view frustum and must be clipped.

For the vertices that remain after clipping, perspective division transforms them from clip space to NDC space.

1.4 The z-fighting problem

From the perspective projection matrix and the coordinate relations in section 1.3:

$\begin{aligned} z_n &= z_c/w_c \\ w_c &= -z_e \\ z_c &= -\frac{f+n}{f-n}z_e - \frac{2fn}{f-n} \\ z_n &= \frac{z_c}{w_c} = \frac{-\frac{f+n}{f-n}z_e - \frac{2fn}{f-n}}{-z_e} = \frac{f+n}{f-n} + \frac{2fn}{(f-n)z_e} \end{aligned}$

Plotting this formula for $z_n$ gives the mapping curve below. The value changes rapidly near the near plane, where precision is good; near the far plane the curve is almost flat over a long range, so precision is poor.

When the range between the near and far clipping planes is enlarged, as in the right-hand plot below, $z_e$ values near the far plane map to almost identical $z_n$ values and the loss of precision becomes even more pronounced. The artifacts caused by this limited depth precision are called z-fighting.

So keep the range between -n and -f as small as possible to reduce z-fighting.

1.5 Deriving the perspective projection from the FOV

Another commonly used way to specify a perspective projection is through the field of view (FOV) and the aspect ratio (aspect).

Here fovy specifies the vertical field of view, aspect specifies the aspect ratio, and zNear and zFar specify the clipping planes. fovy is shown in the figure below:

These parameters specify a symmetric view volume, as shown in the figure below:

Expressing t and r in terms of these parameters, ready to substitute into the simplified OpenGL perspective projection matrix derived above:

$\begin{aligned} t &= n * tan(fov/2) \\ \frac{r}{t} &= \frac{w}{h} = aspect \Rightarrow r = aspect*t = aspect * n * tan(fov/2) \end{aligned}$

Substituting into the perspective projection matrix above gives the FOV perspective projection matrix (with $\theta$ = fov):

$P= \begin{bmatrix} \frac{cot(\frac{\theta}{2})}{aspect} & 0 & 0 & 0 \\ 0 & cot(\frac{\theta}{2}) & 0 & 0 \\ 0& 0 & \frac{-(f+n)}{f-n} & \frac{-2fn}{f-n} \\ 0 & 0 & -1 & 0 \end{bmatrix} \tag{Fov Perspective projection matrix }$
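The substitution can be verified numerically: n/r and n/t computed from the frustum extents should equal the two FOV-based diagonal entries. This is a minimal sketch with hypothetical names (`extentFromFov`, `focalX`, `focalY`); fovy is assumed to be in radians.

```cpp
#include <cmath>

struct FrustumExtent { double r; double t; };

// Half-extents of the near plane for a symmetric frustum.
// fovy is the vertical field of view in radians, n the near distance.
FrustumExtent extentFromFov(double fovy, double aspect, double n) {
    const double t = n * std::tan(fovy / 2.0);
    return {aspect * t, t};
}

// The two FOV-dependent diagonal entries of the projection matrix.
double focalY(double fovy) { return 1.0 / std::tan(fovy / 2.0); }  // cot(fovy/2)
double focalX(double fovy, double aspect) { return focalY(fovy) / aspect; }
```

For fovy = 90 degrees, aspect = 2 and n = 0.5, the extents are t = 0.5 and r = 1, so n/t = 1 and n/r = 0.5, matching cot(45°) = 1 and cot(45°)/aspect = 0.5.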

2. Derivation of the viewport transform

The viewport transform converts NDC coordinates into screen coordinates, as shown in the figure below:

$\begin{cases} -1 \leq x \leq 1 \\ -1 \leq y \leq 1 \\ 0 \leq z \leq 1 \end{cases}$

Notation:

$x_n,y_n, z_n$ denote NDC space coordinates
$x_s, y_s, z_s$ denote screen space coordinates
$n_s, f_s$ denote the positions of the near plane and far plane in screen space
$S_x, S_y$ denote the starting pixel of the screen
$W_s, H_s$ denote the width and height of the screen in pixels

$\begin{aligned} \frac{x_n - (-1)}{2} = \frac{x_s - S_x}{W_s} &\Rightarrow x_s = \frac{W_s}{2}x_n + \frac{W_s}{2} + S_x \\ \frac{y_n - (-1)}{2} = \frac{y_s - S_y}{H_s} &\Rightarrow y_s = \frac{H_s}{2}y_n + \frac{H_s}{2} + S_y \\ \frac{z_n - (-1)}{2} = \frac{z_s - n_s}{f_s - n_s} &\Rightarrow z_s = \frac{f_s - n_s}{2}z_n + \frac{n_s + f_s}{2} \end{aligned}$

The viewport transformation matrix obtained from these formulas is:

$viewPort=\begin{bmatrix} \frac{W_{s}}{2} & 0 & 0 & S_{x}+\frac{W_{s}}{2} \\ 0 & \frac{H_{s}}{2} & 0 & S_{y}+\frac{H_{s}}{2} \\ 0 & 0 & \frac{f_{s}-n_{s}}{2} & \frac{n_{s} + f_{s}}{2} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{ Viewport transformation matrix }$

In OpenGL, typically $S_x=0, S_y=0, n_s=0, f_s=1$, so viewPort simplifies to:

$viewPort=\begin{bmatrix} \frac{W_{s}}{2} & 0 & 0 & \frac{W_{s}}{2} \\ 0 & \frac{H_{s}}{2} & 0 & \frac{H_{s}}{2} \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{ Simplify viewport transformation matrix }$

3. The relationship between linear and nonlinear depth

3.1 Recovering linear eye-space depth from NDC depth

Section 1.4 already derived the relationship between $z_n$ and $z_e$:

$z_n = z_c/w_c = \frac {-\frac{f+n}{f-n}z_e - \frac{2fn}{f-n}} {-z_e} = \frac{f+n}{f-n} + \frac{2fn}{(f-n)z_e}$

This is easy to invert:

$z_e = \frac{-2fn}{f+n - z_n(f-n)}$
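The inversion takes two steps, spelled out here as a worked sketch that uses only the relation between $z_n$ and $z_e$ from section 1.4. Isolate the term containing $z_e$, then solve for it:

$\begin{aligned} z_n - \frac{f+n}{f-n} &= \frac{2fn}{(f-n)z_e} \\ z_e &= \frac{2fn}{(f-n)z_n - (f+n)} = \frac{-2fn}{f+n - z_n(f-n)} \end{aligned}$

As a check, $z_n = -1$ gives $z_e = \frac{-2fn}{2f} = -n$ and $z_n = 1$ gives $z_e = \frac{-2fn}{2n} = -f$.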

Here $z_e$ is the coordinate value in eye space (a right-handed coordinate system), so it is negative.

Depth, however, is defined with respect to screen space (a left-handed coordinate system), so negating $z_e$ gives the linear eye-space depth recovered from NDC space:

$linearDepth = -z_e = \frac{2nf}{(f+n) - z_n(f-n)}$

3.2 Deriving the nonlinear screen-space depth from the linear eye-space depth

From the viewport transformation matrix introduced in section 2, we immediately get:

$z_s = 0.5z_n + 0.5$

Substituting $z_n = z_c/w_c$ gives:

$\begin{aligned} z_s &= \frac{1}{2}(z_n + 1) \\ &= \frac{1}{2}(\frac{f+n}{f-n} + \frac{2fn}{z_e(f-n)} + 1)\\ &= \frac{f + \frac{nf}{z_e}}{f - n} \\ &= \frac{\frac{1}{-z_e} - \frac{1}{n}}{\frac{1}{f} - \frac{1}{n}} \end{aligned}$

As above, $z_e$ here is the coordinate value in eye space (a right-handed coordinate system).

In the depth testing article, however, z is defined as a distance value between the near and far planes, which is always positive, so here too we negate $z_e$, i.e. $z = -z_e$.

Finally, the nonlinear screen-space depth value derived from eye space is:

$F_{depth} = z_s = \frac{\frac{1}{z} - \frac{1}{n}}{\frac{1}{f} - \frac{1}{n}}$

3.3 Reconstructing position and normal information from depth

References:

https://zhuanlan.zhihu.com/p/367257314

https://mynameismjp.wordpress.com/2010/09/05/position-from-depth-3

Both references above work through the reasoning in detail, so it is not repeated here.

4. Normal transformation

Reference: https://blog.csdn.net/u012419410/article/details/42174839

As the two figures above show, when the model transform contains (non-uniform) scaling, the normal vector is distorted, so the $M_{view}*M_{model}$ matrix cannot simply be used to transform normal vectors.

The normal transformation matrix is derived as follows:

Suppose a tangent vector in model space is T and the normal vector is N.

Since they are perpendicular: $T^TN=0$

Suppose that after transforming to eye space they become $T'$ and $N'$. They should still be perpendicular: $T'^TN'=0$

Suppose the transformation matrices of the tangent vector T and the normal vector N are M and G respectively. Then: $(MT)^T(GN)=0$

Expanding: $T^TM^TGN=0$

Since $T^TN=0$, it is sufficient to choose $M^TG=I$

which gives: $G=(M^{-1})^T$

That is, the matrix applied to normal vectors is the inverse transpose of the vertex transformation matrix.

Copyright notice
This article was written by [Hua Weiyun]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/161/202206101605115601.html