Dynamic Branching in Shader

Static branching vs. dynamic branching:
  • Static branching switches a code path on or off based on a Boolean shader constant. Between draw calls, you decide which features you want to support and set the Boolean flags accordingly.
  • Dynamic branching uses a comparison condition that resides in a variable, so the comparison is done for each vertex or each pixel at run time (not at compile time or between two draw calls). The performance hit is the cost of the branch plus the cost of the instructions on the side of the branch that is taken. Available in shader model 3.0 or higher.
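The difference between the two kinds of branch can be sketched on the CPU. This is only a C++ model under stated assumptions, not shader code: `ShaderConstants`, `enableTint`, `drawStatic`, `drawDynamic` and `demoBranchKinds` are hypothetical names, and a real GPU evaluates pixels in parallel rather than in a loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a Boolean shader constant that the
// application sets between draw calls.
struct ShaderConstants {
    bool enableTint = false;
};

// Static branching: the condition is a constant for the whole draw call,
// so every pixel follows the same code path.
std::vector<float> drawStatic(const std::vector<float>& pixels,
                              const ShaderConstants& constants) {
    std::vector<float> out;
    out.reserve(pixels.size());
    for (float p : pixels) {
        out.push_back(constants.enableTint ? p * 0.5f : p);
    }
    return out;
}

// Dynamic branching: the condition depends on a per-pixel value,
// so neighbouring pixels may follow different code paths.
std::vector<float> drawDynamic(const std::vector<float>& pixels,
                               float threshold) {
    std::vector<float> out;
    out.reserve(pixels.size());
    for (float p : pixels) {
        if (p > threshold) {
            out.push_back(p * 0.5f);
        } else {
            out.push_back(p);
        }
    }
    return out;
}

// Usage: the same two pixels through both kinds of branch.
void demoBranchKinds() {
    const std::vector<float> pixels = {1.0f, 4.0f};
    ShaderConstants constants;
    constants.enableTint = true;
    // Static: both pixels are tinted because the flag is on.
    assert((drawStatic(pixels, constants) == std::vector<float>{0.5f, 2.0f}));
    // Dynamic: only the pixel above the threshold is tinted.
    assert((drawDynamic(pixels, 2.0f) == std::vector<float>{1.0f, 2.0f}));
}
```

In the static case the flag changes only between calls to `drawStatic`, which mirrors setting a Boolean constant between draw calls. In the dynamic case the decision is made again for every element.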

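One discussion thread covers a GLSL shader on an Nvidia GPU that ran a doubly nested loop performing 8x8 branching operations and dropped to 2 fps. After the loops were unrolled, the frame rate returned to 30 fps.
The initial guess was therefore that the GPU's branching ability was at fault. It later turned out that indexing the array through a constant register index was the real performance factor: storing the array contents in a texture, or switching to a uniform array of vec3's, both improved performance.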

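Another discussion thread points out that an if condition may end up executing both branches, depending on what the compiler decides. This costs more computation than manually splitting the shader code so that each variant follows a single branch.
(result = cmp( condition, result_a, result_b ); the compiler may emit the cmp asm instruction instead of if - else)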
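The cost difference between the two lowerings can be illustrated with a small CPU-side C++ model. It is a sketch, not compiler output: `cmpSelect` imitates the pixel shader `cmp` instruction (which picks its second source when the first is greater than or equal to zero, otherwise the third), and `computeA`, `computeB`, `EvalCounts`, `shadeFlattened`, `shadeBranched` and `demoSelectVsBranch` are hypothetical helpers added only to count how often each side is evaluated.

```cpp
#include <cassert>

// Counts how often each side of the condition is evaluated.
struct EvalCounts {
    int sideA = 0;
    int sideB = 0;
};

// Hypothetical stand-ins for the work done on each side of the branch.
float computeA(float x, EvalCounts& counts) { ++counts.sideA; return x * 2.0f; }
float computeB(float x, EvalCounts& counts) { ++counts.sideB; return x + 1.0f; }

// Models the select behaviour of `cmp`: a when condition >= 0, otherwise b.
float cmpSelect(float condition, float a, float b) {
    return condition >= 0.0f ? a : b;
}

// Flattened form: both sides are computed, then one result is selected.
float shadeFlattened(float condition, float x, EvalCounts& counts) {
    float result_a = computeA(x, counts);
    float result_b = computeB(x, counts);
    return cmpSelect(condition, result_a, result_b);
}

// Branched form: only the side that is taken is computed.
float shadeBranched(float condition, float x, EvalCounts& counts) {
    if (condition >= 0.0f) {
        return computeA(x, counts);
    }
    return computeB(x, counts);
}

// Usage: same inputs, same result, different amount of work.
void demoSelectVsBranch() {
    EvalCounts flattened;
    EvalCounts branched;
    float f = shadeFlattened(1.0f, 3.0f, flattened);
    float b = shadeBranched(1.0f, 3.0f, branched);
    assert(f == b);
    assert(flattened.sideA + flattened.sideB == 2);  // both sides evaluated
    assert(branched.sideA + branched.sideB == 1);    // one side evaluated
}
```

Both functions return the same value, but the flattened one always pays for both sides. That is cheap when the sides are a few instructions each and costly when they are long.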

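If a static branch will do, do not use dynamic branching. A static branch requires the condition to be a Boolean constant; avoid using a variable in the condition, otherwise the result is dynamic branching. Writing the choice with ? : can compile to a single instruction, instead of the larger number of instructions that if - else may produce.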

When the asm output shows no dynamic branching and uses the cmp instruction instead, the following may help:
  • When sampling a texture inside a branch, use tex2Dlod rather than tex2D.
  • Pass D3DXSHADER_PREFER_FLOW_CONTROL when you call D3DXCompileShader, and of course specify ps_3_0 as the target.

