Unity: Resource Merging, Static Batching, Dynamic Batching, GPU Instancing
2022-07-27 04:41:00 【qq_ forty-two million nine hundred and eighty-seven thousand ni】
The following summary may contain mistakes; corrections are welcome.
I. Prerequisite knowledge
These prerequisites are very important, and it took me a long time to understand them.
This part refers to:
Graphics rendering and optimization: Batch - Tencent game school
Unity optimization: the draw call series - Let's have a look
Batch, Draw Call, SetPass Call - Zhihu
Unity - DrawCall, Batch, SetPassCall differences - Jave.Lin's blog - CSDN Blog
GPU architecture and rendering - Zhihu
1. Draw Call and Batch
1) Definition: a draw call is the CPU calling a graphics API to draw something on the screen (the emphasis is on the API call).
Strictly speaking, every call to a rendering API's drawing function (for example Direct3D's DrawPrimitive/DrawIndexedPrimitive or OpenGL's glDrawArrays/glDrawElements) counts as one draw call. In Unity, however, multiple draw calls can be merged into one batch for rendering.
A Batch is, as the name suggests, a "batch" of work. The number of batches the engine submits per frame is often used as an indicator of rendering pressure. Submitting a certain number of triangles to the GPU under the same render state is one render batch. From the point of view of API calls, a batch and a draw call are equivalent, but in a game engine they mean different things in practice: a batch generally refers to a group of packed draw calls.
2) Command Buffer:
Data transfer between the CPU and the GPU is asynchronous, much like data transfer between a server and a client. The CPU and GPU form a producer/consumer model: the CPU produces commands and the GPU consumes them. Through this relationship the CPU hands data and instructions to the GPU, and the GPU carries out the corresponding work.
The command buffer is what lets the CPU and GPU work in parallel. It also has another important role: improving rendering efficiency. To see why it affects efficiency, first look at how it works: the CPU appends commands to the buffer and the GPU reads commands from it, so the CPU can prepare data and then notify the GPU to render.
The following figure shows how rendering commands flow between the CPU and the GPU.

(Main performance cost) When a command travels from the Runtime to the Driver, the CPU has to switch from user mode to kernel mode. A mode switch is expensive for the CPU, so if the Runtime forwarded every API call straight to the Driver, every call would trigger a mode switch and the cost would be very high. The command buffer in the Runtime holds back commands that do not need to reach the Driver immediately and sends them together at a suitable time, after which they are executed on the video card. This keeps the number of CPU mode switches to a minimum and improves efficiency.
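The effect of buffering can be illustrated with a toy model. This is only a sketch, not a real driver or Unity API: `CommandBuffer`, `flush_threshold`, and `mode_switches` are hypothetical names, and each flush stands in for one user-to-kernel mode switch.

```python
class CommandBuffer:
    """Toy model: each flush to the 'driver' costs one simulated mode switch."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold  # commands held before flushing
        self.pending = []        # commands recorded but not yet sent
        self.submitted = []      # commands that reached the 'driver'
        self.mode_switches = 0   # number of simulated user/kernel switches

    def record(self, command):
        self.pending.append(command)
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.pending:
            self.submitted.extend(self.pending)
            self.pending.clear()
            self.mode_switches += 1


def simulate(num_commands, flush_threshold):
    """Record num_commands draw commands and return the mode switch count."""
    buf = CommandBuffer(flush_threshold)
    for i in range(num_commands):
        buf.record(f"draw_{i}")
    buf.flush()  # send whatever is left at the end of the frame
    return buf.mode_switches
```

With a threshold of 1 (no buffering) every command pays for its own switch; a larger threshold spreads one switch over many commands.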
3) Putting a draw call together
Before each draw call, the CPU has to send a lot to the GPU, mainly data, render state (that is, the material and textures the object needs), and commands. Specifically, the CPU:
a. Prepares the object to be rendered: loads it from disk into memory, then from memory into video memory so the GPU can process it at high speed (preparation for the draw call).
b. Sets the render state for each object, that is, its material, textures, shaders, and so on (preparation for the draw call; this is the SetPass call).
c. Outputs the render primitives, then sends the draw call command to the GPU and passes the primitives to it (the actual draw call).
So too many draw calls make the CPU do a lot of work, overloading it and hurting the game's performance.
2. SetPass Call
A SetPass call represents a render state switch. It mainly happens when materials differ and the render state has to change. A batch includes submitting the VBO, submitting the IBO, submitting the shader, setting the hardware render state, and setting the light properties (note that submitting textures is, strictly speaking, not part of a batch, since textures can be cached and reused across frames).
The simplest way to understand a SetPass call: everything that must be configured before drawing this pass, whether state or buffers, is part of the SetPass call. A name like SetGPUDataBeforeDraw might convey it better (set the GPU data before drawing, where the data includes the state values of the rendering API, such as DX or OpenGL, and buffer data).
So Unity has one more concept, the SetPass call: SetPassCall = SetStateBeforeDraw.
Example: if one batch and the next do not use the same material, or use different passes of the same material, a SetPass call is triggered to reset the render state. For example, suppose Unity renders 20 objects that share one material (but not necessarily the same mesh), and dynamic batching merges them into two batches of 10 objects each. For this rendering, the SetPass call count is 1 (only one material is set up) and the batch count is 2 (VBO, IBO, and other data are submitted to the GPU twice).
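The counting rule in this example can be sketched as a small function. It is a simplified model, not how Unity's profiler actually computes its numbers: `count_calls` is a hypothetical helper, and it assumes a SetPass call is issued exactly when the (material, pass) pair differs from the previous batch.

```python
def count_calls(batches):
    """batches: list of (material, pass_index) tuples in submission order.

    Returns (batch_count, set_pass_count) under the toy rule that a SetPass
    call happens whenever the material/pass differs from the previous batch.
    """
    set_pass = 0
    previous = None
    for state in batches:
        if state != previous:
            set_pass += 1
            previous = state
    return len(batches), set_pass


# Two dynamic batches sharing one material: 2 batches, 1 SetPass call.
same_material = [("stone", 0), ("stone", 0)]

# Interleaved materials force a state switch on every batch.
interleaved = [("stone", 0), ("wood", 0), ("stone", 0)]
```

Sorting `interleaved` by material before submission drops its SetPass count from 3 to 2, which is why grouping objects by material matters.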
3. Summary
The biggest cost is switching render state; the second is collating and submitting data. In practice, don't worry too much about the draw call count itself (if no data is submitted and no render state is switched, a few more draw calls hardly matter), but both SetPass calls and batches should be reduced. The two are strongly correlated, so lowering one usually lowers the other.
In general: a low SetPass call count does not guarantee a low draw call count, but a high SetPass call count means the draw call count must be high as well. Optimization usually starts from reducing both the number of SetPass calls and the number of draw calls.
Batching reduces the number of render commands (draw calls) the CPU sends to the GPU and the number of render state switches on the GPU, letting the GPU do as much as possible at a time, which improves the overall efficiency of the logic thread and the render thread. But this only pays off when the GPU is relatively idle and the CPU spends most of its time submitting render commands.
II. Resource Merging
This part refers to: Making an atlas with UGUI in Unity - Draw a small round's blog - CSDN Blog
Introduction to and use of atlases in Unity - z2014z's blog - CSDN Blog
1. Principle
This is mainly atlas merging. If the materials referenced by several models are identical except for their texture, we can merge the textures those models reference into one larger texture. All the models can then reference the same material, so the render state only has to be set once. Although the number of draw calls does not decrease, render state switches are avoided, which also achieves the goal of batching.
In essence, this should be a typical example of dynamic batching.
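Merging textures into an atlas means each model's UV coordinates have to be remapped into the region its texture now occupies. A minimal sketch of that remapping follows; `remap_uv` and the `rect` layout are assumptions made for illustration, not a Unity API.

```python
def remap_uv(uv, rect):
    """Map a UV in [0, 1] on the original texture into the atlas.

    rect = (x, y, width, height) of the sub-texture inside the atlas,
    expressed in normalized atlas coordinates (0 to 1).
    """
    u, v = uv
    x, y, w, h = rect
    return (x + u * w, y + v * h)


# A texture placed in the bottom-right quarter of the atlas.
bottom_right = (0.5, 0.0, 0.5, 0.5)
```

For example, the centre of the original texture, `(0.5, 0.5)`, lands at `(0.75, 0.25)` in the atlas.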
2. Points to note
1) Keep the atlas size under control and don't make it too large. Drawing from one very large atlas may cost as much as drawing from more than a dozen small ones.
2) Pack the atlas as compactly as possible, and try not to exceed 512x512.
3) Keep draw calls as low as possible: small images used on the same screen should be in the same atlas wherever possible.
4) Make memory management easy and loading fast: opening a screen should load only the atlases it needs, and they should be easy to release when it closes.
5) Keep AssetBundle packaging and hot updates reasonable: avoid the situation where hot-updating one new screen requires hot-updating a large number of atlases.
6) Design the UI for reuse: put borders, buttons, and other shared resources into 1 to 3 large atlases as reusable atlases.
7) Divide the remaining non-reusable UI by functional module, with 1 to 2 atlases per module as functional atlases.
8) If some UI uses both a functional atlas and a reusable atlas, and its functional atlas still has a lot of free space, the elements it uses from the reusable atlas can be copied into the functional atlas so that the UI depends only on the functional atlas. A certain amount of data redundancy buys performance.
III. Static Batching
Reference from:
Batch, Draw Call, SetPass Call - Zhihu
[Unity game development] Static batching, dynamic batching, and GPU Instancing - Zhihu
1. Principle:
Static batching applies to objects marked Static. At build time Unity automatically generates the merged mesh and stores the merged data as a file. When the scene is loaded, the whole vertex data is submitted once, and the visibility of each sub-model is determined by the engine's scene management system. The render state is then set once, and several draw calls draw the sub-models separately.
Static batching packs static objects into one large VBO for submission, but submits an IBO only for the objects that are to be rendered. This is not free. For instance, suppose four objects are merged by static batching. Each vertex of the first three objects needs only position, the first UV set, and normal, while the fourth object also has tangent data. The merged VBO will then carry all four kinds of data for every vertex, so building this VBO clearly costs extra CPU time and video memory.
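The memory overhead in the four-object example can be estimated with a short calculation. This is a rough sketch under assumptions: the per-attribute float counts in `ATTRIBUTE_FLOATS` are typical sizes chosen for illustration, and the merged buffer is assumed to store the union of all attributes for every vertex, as described above.

```python
# Assumed floats per attribute; real vertex formats may differ.
ATTRIBUTE_FLOATS = {"position": 3, "uv": 2, "normal": 3, "tangent": 4}


def floats_per_vertex(attributes):
    return sum(ATTRIBUTE_FLOATS[name] for name in attributes)


def vbo_floats(meshes):
    """meshes: list of (vertex_count, set_of_attribute_names).

    Returns (separate_total, merged_total) in floats, where the merged
    buffer uses the union of all attributes for every vertex.
    """
    union = set()
    for _, attrs in meshes:
        union |= set(attrs)
    separate = sum(count * floats_per_vertex(attrs) for count, attrs in meshes)
    merged = sum(count for count, _ in meshes) * floats_per_vertex(union)
    return separate, merged


basic = {"position", "uv", "normal"}
with_tangent = basic | {"tangent"}
# Three basic meshes and one with tangents, 100 vertices each.
example = [(100, basic), (100, basic), (100, basic), (100, with_tangent)]
```

Under these assumptions the four separate buffers need 3600 floats in total, while the merged one needs 4800, because the three basic meshes now carry unused tangent slots.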
2. Requirements
Every object in a static batch must use the same material, but the meshes do not have to be the same. The objects must also be marked Static manually.
3. Performance analysis
Static batching does not reduce the number of draw calls (although the editor shows a lower count because it counts differently). But because the vertices of all sub-models are transformed into world space in advance and the sub-models share a material, there are no render state switches between the draw calls, which reduces SetPass calls. The rendering API caches the draw commands, so rendering is still optimized (you can think of it as multiple draw calls sharing one SetPass call). In addition, vertex positions do not need to be recomputed at run time, which saves computation.
Whether the draw call count really goes down is disputed. See the comment section of: A detailed analysis of static batching / dynamic batching / GPU Instancing / SRP Batcher - Zhihu
It seems that newer Unity versions do reduce draw calls with static batching.
IV. Dynamic Batching
This part refers to: Unity basics review (1): the difference between DrawCall, Batch, and SetPassCall - Big whale pot's blog - CSDN Blog
1. Principle:
Dynamic batching is designed to optimize the rendering of dynamic GameObjects in the scene that share the same material. The goal is to merge small meshes at minimal cost and reduce draw calls.
The principle is simple: before the scene is drawn, the vertices of all models sharing the same material are transformed into world space, and then multiple models are drawn with a single draw call, which achieves batching. The vertex transformation is done on the CPU, so it costs some CPU performance.
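The CPU-side work described here can be sketched in a few lines. This is a simplified illustration rather than Unity's implementation: `transform_point`, `translation`, and `build_dynamic_batch` are hypothetical helpers, and matrices are plain row-major 4x4 lists.

```python
def transform_point(matrix, point):
    """Apply a 4x4 row-major matrix to a 3D point (w = 1)."""
    x, y, z = point
    return tuple(
        matrix[r][0] * x + matrix[r][1] * y + matrix[r][2] * z + matrix[r][3]
        for r in range(3)
    )


def translation(tx, ty, tz):
    """Build a local-to-world matrix that only translates."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def build_dynamic_batch(objects):
    """objects: list of (local_to_world, local_vertices, indices).

    Transforms every vertex to world space on the CPU and merges all
    objects into one vertex list and one index list for a single draw.
    """
    vertices, indices = [], []
    for matrix, local_vertices, local_indices in objects:
        base = len(vertices)  # index offset of this object in the merged buffer
        vertices.extend(transform_point(matrix, v) for v in local_vertices)
        indices.extend(base + i for i in local_indices)
    return vertices, indices


triangle = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
scene = [
    (translation(0, 0, 0), triangle, [0, 1, 2]),
    (translation(5, 0, 0), triangle, [0, 1, 2]),
]
```

The loop runs over every vertex of every object each time the batch is built, which is the CPU cost mentioned above and the reason the technique is limited to small meshes.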
2. Requirements
a. The objects must use the same material, and the material can have only one pass. Shaders with multiple passes are never batched, because a multi-pass shader usually causes an object to be drawn several times in a row with render state switches in between, which breaks its chance of dynamic batching with other objects.
b. A mesh can contain at most 900 vertex attributes. If the shader uses position, normal, and one UV, that allows 300 vertices; if it uses position, normal, UV0, UV1, and tangent, the limit is lower still, only 180 vertices.
c. The transform's scale must not be negative. Models with different scale values cannot be batched together, that is, the scale must be consistent across the models.
d. Dynamic batching is off by default and must be enabled manually: Project Settings > Player > tick Dynamic Batching.
e. If the objects have lightmap data, it has to be the same for them to have a chance of being batched.
f. Objects rendered with deferred rendering cannot be batched.
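The vertex limits in item b follow from dividing the 900-attribute budget by the number of attributes each vertex uses. A minimal sketch of that arithmetic, assuming the budget is split evenly across attributes (`max_batchable_vertices` is a hypothetical helper, not a Unity API):

```python
MAX_VERTEX_ATTRIBUTES = 900  # per-mesh budget quoted in item b


def max_batchable_vertices(attributes_per_vertex):
    """Largest vertex count that stays within the attribute budget."""
    return MAX_VERTEX_ATTRIBUTES // attributes_per_vertex
```

Three attributes (position, normal, UV) give 300 vertices, and five attributes (position, normal, UV0, UV1, tangent) give 180.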
3. Performance analysis
Both SetPass calls and draw calls are reduced.
V. GPU Instancing
1. Principle
In short: we pass in an object's mesh and specify how many times to draw it and with which material. Unity allocates the necessary buffers in the GPU's uniform/constant buffer, then renders the mesh with the given material as many times as specified. This way a huge number of objects can be drawn with one draw call and one SetPass call.
From the point of view of data handling, Unity's GPU instancing falls into two categories:
- The first is rendering objects whose material uses a shader that supports GPU instancing and has it enabled (for example, instantiating 1,000,000 cubes with GameObject.Instantiate). Unity handles all such render objects specially and prepares the various buffers for them in the GPU's constant buffer (vertex data buffer, material data buffer, transform matrix buffer, and so on).
- The second is calling the GPU instancing API ourselves to draw instances. Unity then prepares only the vertex buffer and the material data buffer according to the parameters we pass; the matrix data buffer and other custom data are not provided. We have to pass that data ourselves through a ComputeBuffer and handle it in the shader according to the instanceId. For example, if we draw 1,000,000 triangles with the GPU instancing API, Unity has the GPU back end prepare a buffer large enough for 3,000,000 vertices and a material data buffer. For this part, see: U3D optimization, batching: a first look at GPU Instancing - Zhihu
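The idea of one mesh plus a per-instance buffer indexed by instance ID can be modelled on the CPU. This is only a toy sketch of the lookup, not shader code or a Unity API: `instanced_draw` is a hypothetical function, and the per-instance data is reduced to a translation offset.

```python
def instanced_draw(mesh_vertices, instance_offsets):
    """Toy model of one instanced draw.

    The mesh is supplied once; each instance fetches its own data from
    the per-instance buffer using its instance_id (here, a translation).
    """
    output = []
    for instance_id in range(len(instance_offsets)):
        ox, oy, oz = instance_offsets[instance_id]  # per-instance lookup
        for x, y, z in mesh_vertices:
            output.append((x + ox, y + oy, z + oz))
    return output


edge = [(0, 0, 0), (1, 0, 0)]
offsets = [(0, 0, 0), (10, 0, 0), (0, 10, 0)]
```

One call with three offsets produces three copies of the mesh. The mesh data itself is passed only once, and only the small per-instance buffer grows with the instance count.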
2. Requirements
The objects must use the same shader, but the shader property values can differ.
3. Performance analysis
1) Traditional rendering (no batching): data is collated and transferred once per object drawn. Collating and transferring data is expensive and is the performance bottleneck in most cases.
2) GPU instancing: data is passed from the CPU to the GPU only once, and some work that used to be done on the CPU moves to the GPU, which greatly improves rendering efficiency.
4. Priority relative to batching
1) Static batching has higher priority than GPU instancing. If a GameObject is marked static and was successfully static-batched at build time, instancing will not take effect for it even if it is rendered with an instancing shader.
2) Dynamic batching has lower priority than instancing. If a GameObject is rendered with an instancing shader, dynamic batching will not take effect for it.
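The priority order described in this section can be written down as a small decision function. It only restates the two rules above as code; `choose_batching` and its parameters are hypothetical names, and Unity's real selection logic involves more conditions than this.

```python
def choose_batching(is_static_batched, uses_instancing_shader,
                    dynamic_batching_enabled):
    """Return which technique applies, following the stated priority:
    static batching > GPU instancing > dynamic batching."""
    if is_static_batched:
        return "static batching"
    if uses_instancing_shader:
        return "gpu instancing"
    if dynamic_batching_enabled:
        return "dynamic batching"
    return "none"
```

For instance, an object that was static-batched at build time stays in its static batch even if its material uses an instancing shader.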