Game Performance Optimization (11) - Zhihu

2020-11-08 08:54:00 【osc_eoqljui5】

After the VS (vertex shader) comes the rasterization stage. It is a fixed-function (non-programmable) stage that is usually assumed to execute very efficiently, so it is often overlooked.

In fact, from what I have observed, it is not uncommon for this stage to become the bottleneck. It happened, for example, during the development of Genshin Impact.

Here is what happened in Genshin Impact. When a character climbs a tree, the tree crown is drawn with a see-through effect so that the canopy does not hide the character. Ordinary translucent rendering is a well-known performance killer, so the developers instead used the stencil to cut out some of the pixels, a technique known as dithering. If you are not familiar with it, think of a photograph printed in a newspaper, which is built entirely from small dots.
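The dither idea can be illustrated with a minimal sketch. The checkerboard pattern and the function names below are assumptions chosen for illustration; the article does not say which pattern the game actually uses.

```python
def dither_mask(width, height):
    """Checkerboard dither (assumed pattern): True = pixel kept, False = pixel cut out."""
    return [[(x + y) % 2 == 0 for x in range(width)] for y in range(height)]


def kept_fraction(mask):
    """Fraction of pixels that survive the cut-out and would still be shaded."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total


mask = dither_mask(8, 8)
for row in mask:
    # '#' marks a pixel that is drawn, '.' a pixel that is cut out
    print("".join("#" if keep else "." for keep in row))
print(kept_fraction(mask))
```

With this pattern half of the pixels are dropped, so from a distance the surface looks roughly half transparent without any blending.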

In theory, cutting holes reduces the number of pixels that have to be rendered, that is, the PS (pixel shader) workload. But the development team found that the result was the opposite: rendering time went up rather than down. Even more puzzling, comparing GPU traces captured with the effect switched on and off showed that the PS workload had indeed dropped, yet rendering time was unchanged or even slightly longer.

The cause lies in the rasterizer. The triangles output by the VS are rasterized by the rasterizer, and the result forms the PS workload. Before rasterization, culling is performed at triangle level: back-face / front-face culling, view-frustum culling / clipping, and zero-area / small-triangle culling. Rejection based on the stencil test, however, does not happen at triangle level; it happens at fragment level, after rasterization. In other words, dithering reduces the number of fragments that reach the PS stage, but it does not reduce the rasterizer's work.
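The ordering described here can be sketched with a toy software pipeline. This is only an illustration under stated assumptions: counter-clockwise winding is treated as front-facing, pixel centres are sampled at +0.5, and all function names are hypothetical. Real GPUs implement these stages in hardware.

```python
import math


def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def survives_triangle_culling(tri):
    # Triangle-level test: drops back-facing (negative area) and zero-area triangles.
    # Counter-clockwise = front-facing is an assumed convention.
    return edge(tri[0], tri[1], tri[2]) > 0


def rasterize(tri):
    """Yield every pixel (x, y) whose centre lies inside a counter-clockwise triangle."""
    v0, v1, v2 = tri
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    for y in range(math.floor(min(ys)), math.ceil(max(ys))):
        for x in range(math.floor(min(xs)), math.ceil(max(xs))):
            p = (x + 0.5, y + 0.5)
            if edge(v0, v1, p) >= 0 and edge(v1, v2, p) >= 0 and edge(v2, v0, p) >= 0:
                yield x, y


def draw(triangles, stencil_pass):
    """Return (fragments produced by the rasterizer, fragments reaching the PS)."""
    rasterized = shaded = 0
    for tri in triangles:
        if not survives_triangle_culling(tri):
            continue  # culled before rasterization: no rasterizer work at all
        for x, y in rasterize(tri):
            rasterized += 1  # rasterizer work has already been spent here
            if stencil_pass(x, y):  # fragment-level test, after rasterization
                shaded += 1
    return rasterized, shaded


front = ((0, 0), (8, 0), (0, 8))  # counter-clockwise, kept
back = ((0, 0), (0, 8), (8, 0))   # clockwise, culled at triangle level
no_dither = draw([front, back], lambda x, y: True)
dithered = draw([front, back], lambda x, y: (x + y) % 2 == 0)
print(no_dither, dithered)
```

In both runs the rasterizer produces the same number of fragments; only the count that reaches the shading step differs, which is the asymmetry the paragraph describes.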

But if that were the whole story, rendering should still be faster with dithering on: the rasterization workload is unchanged and the PS workload is lower. Yet the measurements show it is slower. Why?

The reason is that contemporary desktop GPUs have introduced tile-based rasterization. Note that this is not the TB(D)R of mobile platforms, because it is confined to the rasterization stage.

Specifically, the GPU does not expand a triangle into fragments in a single pass. It first rasterizes at a lower resolution, for example 1/8 of the target resolution. If the render target is 1920x1080, the GPU first rasterizes at 240x135, and then rasterizes further within each coarse result (a tile of 8x8 pixels). (The exact method and tile size may differ significantly between GPU models.)
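The coarse-grid arithmetic can be checked with a short sketch. The 8x8 tile size comes from the example in the text; the function names are hypothetical, and real hardware may use other tile sizes.

```python
import math


def coarse_grid(width, height, tile=8):
    """Size of the low-resolution grid when each cell covers tile x tile pixels."""
    return math.ceil(width / tile), math.ceil(height / tile)


def tile_of(x, y, tile=8):
    """Coarse-grid cell that contains pixel (x, y)."""
    return x // tile, y // tile


print(coarse_grid(1920, 1080))  # 1920 / 8 = 240 and 1080 / 8 = 135
print(tile_of(1919, 1079))      # last pixel falls in the last tile
```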

One advantage of this approach is that it greatly improves the efficiency of the pre-Z (early-Z) and pre-stencil tests. If a low-resolution unit (tile) is rejected as a whole by the pre-Z or pre-stencil test, there is no need to rasterize it at finer granularity.

In our case, the stencil mask used for the cut-out, that is, the pattern of holes, is not aligned with these tiles. When the pre-stencil test is performed per tile, a tile can never be rejected as a whole, because the mask values inside it differ: some pixels pass and some are rejected. Compared with leaving dithering off, the GPU performs an extra stencil test for nothing: the rasterization workload is not reduced at all, and a stencil lookup has been added to the rasterization process. Rasterization therefore becomes less efficient.
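A small simulation makes the alignment problem concrete. It is a sketch under assumptions: 8x8 tiles, a per-pixel checkerboard standing in for the dither mask, and a hypothetical tile-aligned mask added only for contrast. It counts how many tiles could be rejected as a whole by a per-tile stencil test.

```python
def tiles_rejected(stencil_pass, width, height, tile=8):
    """Return (tiles in which every pixel fails the stencil test, total tiles)."""
    rejected = total = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            total += 1
            any_pass = any(
                stencil_pass(x, y)
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            )
            if not any_pass:
                rejected += 1  # whole tile fails: fine rasterization can be skipped
    return rejected, total


def pixel_dither(x, y):
    """Per-pixel checkerboard: every tile contains both passing and failing pixels."""
    return (x + y) % 2 == 0


def tile_aligned(x, y, tile=8):
    """Hypothetical mask whose holes coincide with whole tiles."""
    return ((x // tile) + (y // tile)) % 2 == 0


print(tiles_rejected(pixel_dither, 64, 64))
print(tiles_rejected(tile_aligned, 64, 64))
```

The per-pixel dither never yields a fully rejected tile, so the coarse test saves nothing, whereas a mask aligned to tile boundaries lets half of the tiles be skipped. A tile-aligned mask would not look like a fine dither, so it serves here only to show why alignment matters.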

Copyright notice
This article was written by [osc_eoqljui5]. Please include a link to the original when reposting. Thank you.
