Game Performance Optimization (11) - Zhihu
2020-11-08 08:54:00 【osc_eoqljui5】
After the VS comes the rasterization stage. This is a fixed-function (non-programmable) stage that is usually assumed to execute very efficiently, so it is often overlooked.
In fact, from what I have observed, it is not uncommon for this stage to become the bottleneck. This happened, for example, during the development of Genshin Impact.
What happened was this: when a character climbs a tree in the game, the tree crown is rendered with a semi-transparent effect so that it does not hide the character. Ordinary translucent rendering is a well-known performance killer, so the developers instead used the stencil to cut out some of the pixels, a technique known as dithering. If you are not familiar with this method, think of a picture printed in a newspaper: it is made up entirely of dots.
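As a rough illustration of the dot pattern described above, here is a minimal sketch of a cut-out mask. The checkerboard pattern and the names `dither_mask` and `kept_fraction` are assumptions made for this example; the article does not say which pattern the game actually uses.

```python
def dither_mask(width, height):
    """Build a checkerboard cut-out mask (an assumed pattern).

    True  = the pixel passes the stencil test and is shaded.
    False = the pixel is cut out.
    """
    return [[(x + y) % 2 == 0 for x in range(width)] for y in range(height)]


def kept_fraction(mask):
    """Fraction of pixels that survive the cut-out."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total


mask = dither_mask(8, 8)
for row in mask:
    print("".join("#" if keep else "." for keep in row))
print(kept_fraction(mask))  # 0.5
```

With this pattern half of the pixels are dropped, which is why the technique looks like it should halve the pixel shading work.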
In theory, this cut-out reduces the number of pixels that need to be rendered, that is, the PS workload. But the development team found that the result was the opposite: rendering time went up instead of down. Even more puzzling, comparing GPU trace captures with the effect switched on and off showed that the PS workload had definitely dropped, yet rendering time was unchanged or even slightly longer.
The cause lies in the rasterizer. The triangles output by the VS are rasterized by the rasterizer unit to form the PS workload. Before rasterization, culling is performed at the triangle level: back-face/front-face culling, frustum culling/clipping, and zero-area/small-triangle culling. Rejection based on the stencil test, however, does not happen at the triangle level; it happens at the fragment level, after rasterization. In other words, dithering reduces the number of fragments that reach the PS stage, but it does not reduce the rasterizer's work.
But if that were the whole story, rendering should still be faster with dithering on, because the rasterization workload is the same and the PS workload is smaller. Yet the measurement is slower. Why?
The reason is that contemporary desktop GPUs have introduced tile-based rasterization. Note that this is not the TB(D)R of mobile platforms, because it is confined to the rasterization stage.
Specifically, the GPU does not rasterize triangles into fragments in a single pass. It first rasterizes at a lower resolution, for example 1/8 of the target resolution. If the render target is 1920x1080, the GPU first rasterizes at 240x135, and then rasterizes each coarse result (an 8x8-pixel tile) more finely. (The exact method and sizes may differ significantly between GPU models.)
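The coarse-grid arithmetic can be checked with a few lines. This is only a sketch: the helper name `coarse_grid` and the fixed 8x8 tile size are assumptions for illustration, and real GPUs may use other sizes. Note that 1080 / 8 works out to 135.

```python
import math


def coarse_grid(width, height, tile=8):
    """Number of coarse tiles needed to cover a width x height target.

    Rounds up so that partially covered tiles at the edges are counted.
    """
    return math.ceil(width / tile), math.ceil(height / tile)


print(coarse_grid(1920, 1080))  # (240, 135)
```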
One advantage of this approach is that it can greatly improve the efficiency of the pre-Z and pre-stencil tests. If a low-resolution unit (tile) is rejected as a whole by the pre-Z or pre-stencil test, there is no need to rasterize it at finer granularity.
In our case, the stencil mask used for the cut-out has a hole pattern that is not aligned with these tiles. So when the pre-stencil test is performed per tile, a tile can never be rejected, because the mask values inside it differ: some pixels pass and some are rejected. Compared with having dithering off, this adds a stencil test for nothing: the rasterization workload is not reduced at all, and the rasterization process gains an extra step of querying the stencil. Rasterization therefore becomes less efficient.
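The effect can be reproduced with a toy model. This is not how any particular GPU is implemented: the 8x8 tile size, the checkerboard pattern, the 64x64 target, and all function names are assumptions for illustration. The model rejects a coarse tile only when every pixel in it fails the stencil test, then counts how many tiles still need fine rasterization and how many fragments reach the PS.

```python
TILE = 8  # assumed coarse tile size in pixels


def tile_rejected(mask, tx, ty, tile=TILE):
    """A coarse tile is rejected only if every pixel in it fails the stencil test."""
    return not any(
        mask[y][x]
        for y in range(ty * tile, (ty + 1) * tile)
        for x in range(tx * tile, (tx + 1) * tile)
    )


def count_work(mask, tile=TILE):
    """Return (tiles needing fine rasterization, fragments reaching the PS).

    Assumes the mask dimensions are multiples of the tile size.
    """
    height, width = len(mask), len(mask[0])
    tiles_refined = 0
    fragments = 0
    for ty in range(height // tile):
        for tx in range(width // tile):
            if tile_rejected(mask, tx, ty, tile):
                continue  # whole tile skipped, no fine rasterization
            tiles_refined += 1
            fragments += sum(
                mask[y][x]
                for y in range(ty * tile, (ty + 1) * tile)
                for x in range(tx * tile, (tx + 1) * tile)
            )
    return tiles_refined, fragments


SIZE = 64
no_dither = [[True] * SIZE for _ in range(SIZE)]
# Per-pixel checkerboard: holes are not aligned with the tiles.
pixel_dither = [[(x + y) % 2 == 0 for x in range(SIZE)] for y in range(SIZE)]
# Hypothetical tile-aligned pattern, for comparison only.
tile_aligned = [
    [((x // TILE) + (y // TILE)) % 2 == 0 for x in range(SIZE)]
    for y in range(SIZE)
]

print(count_work(no_dither))     # (64, 4096)
print(count_work(pixel_dither))  # (64, 2048)
print(count_work(tile_aligned))  # (32, 2048)
```

In this model the per-pixel dither halves the fragment count but leaves all 64 tiles to be refined, while a tile-aligned mask with the same coverage lets half of the tiles be skipped. A tile-aligned mask would of course look very different on screen; it is shown only to make the contrast visible.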
Copyright notice
This article was written by [osc_eoqljui5]. Please include a link to the original when reposting. Thank you.