Showing posts from April, 2010


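Over the past year, the five mid-level occupations with the highest average pay were mathematician, actuary, software engineer, computer systems analyst, and statistician, with average annual salaries above 730,000 US dollars. To get into these lines of work, however, excellent math skills are a must. The Wall Street Journal specifically notes that while the financial industry is laying off staff and cutting pay, mathematicians and actuaries have little to fear: right now companies need them to assess losses precisely and to calculate risk and return, so they stay firmly among the high earners. Bringing a good mathematical mind into finance, software, engineering, and similar fields is a guarantee of high pay in the future.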


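Today the boss told us that the product we work on has passed release certification in North America, so he treated us to lunch, and I ate way too much~~ I actually only helped on this product for two weeks, so my personal contribution is limited. This result is all thanks to my colleagues' hard work. Great job, everyone!!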

Using a register constant array index in a shader

Today I wondered whether using a register constant array index influences performance. I designed my experiment around a shader that shades with three point lights, in two variants:
In the first variant, the lighting color, intensity, and position are each set into the shader as three separate float constants.
In the second variant, color, intensity, and position are each passed as an array of size three that the shader indexes during computation.
In the results, both variants ran at the same frame rate, 2725 FPS. So in my experiment, using a register constant array index doesn't carry a performance penalty.
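The two ways of passing the light parameters can be mimicked on the CPU to show that they compute the same thing. This is only a rough Python sketch, not the shader from the experiment: the Lambert diffuse model without attenuation, and every function and parameter name here, are my own assumptions for illustration.

```python
import math

def diffuse(normal, surface_pos, light_pos, color, intensity):
    # Assumed lighting model: plain Lambert diffuse, no attenuation.
    to_light = [l - p for l, p in zip(light_pos, surface_pos)]
    length = math.sqrt(sum(c * c for c in to_light))
    n_dot_l = max(sum(n * c / length for n, c in zip(normal, to_light)), 0.0)
    return [ch * intensity * n_dot_l for ch in color]

def shade_separate(normal, pos, c0, i0, p0, c1, i1, p1, c2, i2, p2):
    # Each light arrives as its own set of constants.
    a = diffuse(normal, pos, p0, c0, i0)
    b = diffuse(normal, pos, p1, c1, i1)
    c = diffuse(normal, pos, p2, c2, i2)
    return [x + y + z for x, y, z in zip(a, b, c)]

def shade_indexed(normal, pos, colors, intensities, positions):
    # The lights arrive as arrays of size three, read through an index.
    total = [0.0, 0.0, 0.0]
    for i in range(3):
        contrib = diffuse(normal, pos, positions[i], colors[i], intensities[i])
        total = [t + c for t, c in zip(total, contrib)]
    return total

# Hypothetical test data: three lights above a surface facing +Y.
normal = [0.0, 1.0, 0.0]
pos = [0.0, 0.0, 0.0]
colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intensities = [1.0, 0.5, 2.0]
positions = [[0.0, 2.0, 0.0], [3.0, 4.0, 0.0], [0.0, 1.0, 1.0]]

result_separate = shade_separate(
    normal, pos,
    colors[0], intensities[0], positions[0],
    colors[1], intensities[1], positions[1],
    colors[2], intensities[2], positions[2],
)
result_indexed = shade_indexed(normal, pos, colors, intensities, positions)
print(result_separate, result_indexed)
```

Both functions do the same arithmetic, so any frame rate difference on the GPU would have to come from how the constants are addressed, which is what the experiment measured.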

An experiment with if-condition instructions in shader programming

Today I ran an experiment on the if-else syntax in shader code, to see how it influences my application's performance.
The first setup has three shaders: one point light, two point lights, and three point lights. We apply them to three different surface models.
The second setup uses a single shader that contains three if-condition instructions to handle the three light types, applied to the same three surface models as in the first setup.
Finally, the results show that the first setup runs at 2639 FPS and the second at 1762 FPS. Although the second setup uses the same shader throughout, which lets us avoid resetting the shader on the device, it is slower. This tells us that the if instruction affects performance, and that it hurts more than resetting the shader on the device does. So all we can do is avoid the if instruction wherever possible.
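FPS numbers are easier to compare as time per frame. The small Python calculation below uses only the two frame rates measured above; the variable names are mine.

```python
def frame_time_ms(fps: float) -> float:
    """Convert a frame rate into milliseconds spent per frame."""
    return 1000.0 / fps

separate_shaders_fps = 2639.0      # three specialized shaders
single_shader_ifs_fps = 1762.0     # one shader with three if conditions

t_separate = frame_time_ms(separate_shaders_fps)
t_ifs = frame_time_ms(single_shader_ifs_fps)

extra_ms = t_ifs - t_separate      # added cost per frame
slowdown = t_ifs / t_separate      # relative frame time

print(f"separate shaders: {t_separate:.3f} ms/frame")
print(f"single shader with ifs: {t_ifs:.3f} ms/frame")
print(f"extra: {extra_ms:.3f} ms/frame, ratio: {slowdown:.2f}x")
```

That works out to roughly 0.38 ms versus 0.57 ms per frame, so the if-based shader costs about 0.19 ms more per frame, or about 1.5 times the frame time.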


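Lately I've been studying how to plan a lighting system and how to implement the shadow map algorithm. This book happens to cover exactly the topics I want to explore. If anyone has bought it, could you lend it to me to flip through?... I'm a bit broke lately.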

Shadow map development (I)

A shadow map can help us generate a full-screen shadow result, and it achieves a natural self-shadowing effect. Its weak points are hard shadow edges and the resolution problem.
First, I want to use this DX9 sample to describe the shadow map implementation. Shadow mapping is a multi-pass technique, so I describe it pass by pass.
In the first pass, we generate the shadow map. Store the current render target and DepthStencil, then set the render target to the shadow map texture, then transform from view space to the light's view space (including projection space), write the depth value to the render target, and finally restore the previous render target and DepthStencil.
In the final pass, we render the scene. Use the camera's view space and projection space, set the shadow map texture, and generate the shadow map matrix, which transforms from view space to light space. In the pixel shader, we can then sample our shadow map texel in shadow map space, and use PCF (percentage closest filtering) to blur…
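The depth comparison and PCF step can be sketched on the CPU like this. It is a minimal Python illustration, not the code of the DX9 sample: the 3x3 kernel, the depth bias value, the clamped addressing at the edges, and all names are assumptions I made for the example.

```python
def pcf_lit_fraction(shadow_map, x, y, receiver_depth, bias=0.005):
    """Return the fraction of a 3x3 texel neighborhood that sees the light.

    shadow_map holds the depth of the nearest occluder as seen from the
    light; receiver_depth is the depth of the shaded point in light space.
    """
    height = len(shadow_map)
    width = len(shadow_map[0])
    lit = 0
    taps = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            # Clamp to the map edges (assumed addressing mode).
            sx = min(max(x + dx, 0), width - 1)
            sy = min(max(y + dy, 0), height - 1)
            # The point is lit if nothing closer to the light was recorded.
            if receiver_depth - bias <= shadow_map[sy][sx]:
                lit += 1
            taps += 1
    return lit / taps

# Hypothetical 4x4 shadow map: an occluder at depth 0.3 covers the two
# left columns, the two right columns are empty (depth 1.0).
shadow_map = [[0.3, 0.3, 1.0, 1.0] for _ in range(4)]

# A receiver at depth 0.8, sampled across one row of the map.
row = [pcf_lit_fraction(shadow_map, x, 1, 0.8) for x in range(4)]
print(row)
```

Across the row the lit fraction goes from 0 through 1/3 and 2/3 to 1, so the hard edge of a single depth test turns into a short gradient.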


NVemulate allows you to emulate the functionality of various GPUs (very slowly) in software. In addition, you can use it to control GLSL support and OpenGL 3.0 support.

Dynamic Branching in Shader

Static branching vs. Dynamic branching:
Static branching switches a code path on or off based on a boolean shader constant, to disable or enable that path. Between draw calls, you can decide which features you want to support and set the Boolean flags to get this behavior.
With dynamic branching, the comparison condition resides in a variable, so the comparison is done for each vertex or each pixel at run time (not at compile time or between two draw calls). The performance hit is the cost of the branch plus the cost of the instructions on the side of the branch taken. Dynamic branching is implemented in shader model 3.0 or higher.
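The difference between the two can be pictured with a CPU analogy. This Python sketch only shows where the condition comes from in each case; it says nothing about how a GPU actually executes branches, and the function names and the toy lighting term are my own assumptions.

```python
def draw_static(n_dot_l_values, enable_light):
    # enable_light plays the role of a boolean shader constant: it is set
    # once before the draw call and is the same for every pixel.
    if enable_light:
        return [max(v, 0.0) for v in n_dot_l_values]
    return [0.0 for _ in n_dot_l_values]

def draw_dynamic(n_dot_l_values):
    # The condition depends on a per-pixel value, so it has to be
    # evaluated again for each pixel at run time.
    shaded = []
    for v in n_dot_l_values:
        if v > 0.0:
            shaded.append(v)
        else:
            shaded.append(0.0)
    return shaded

# Hypothetical per-pixel N.L values for one draw call.
pixels = [0.8, -0.2, 0.5, 0.0]

static_on = draw_static(pixels, enable_light=True)
static_off = draw_static(pixels, enable_light=False)
dynamic = draw_dynamic(pixels)
print(static_on, static_off, dynamic)
```

In the static case the decision is made once per draw call, while in the dynamic case every pixel pays for its own comparison.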

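That thread discusses a case where GLSL running on an Nvidia GPU hits a doubly nested loop that performs 8x8 branching operations, and performance drops to 2 fps. After unrolling the loop, it goes back up to 30 fps.
So the initial guess was about the GPU's branching capability, but it later turned out that using a constant register index in the array index was the key factor for performance. Storing the array contents in a texture, or switching to a uniform array of vec3's, both improve the situation.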

This thread discusses how using an if condition may cause both branches to be executed, because the compiler determines…

The first article