Friday, April 27, 2018

Fast subsurface scattering

Fig.1 - Fast Subsurface scattering of Stanford Bunny

This post is based on the three.js implementation. It provides a cheap, fast, and convincing approximation of ray tracing through translucent surfaces. It follows the GDC 2011 talk [1], and the approach is used by the Frostbite 2 and Unity engines [1][2][3]. Traditionally, when a ray intersects a surface, we need to calculate how it bounces after the intersection. Materials can be roughly divided into three types. Opaque: light can't pass through the geometry, and the ray is bounced back. Transparent: the ray passes completely through the surface, probably losing a little energy on the way out. Translucent: after entering the surface, the ray is bounced around internally, as in Fig. 2 below.

Fig.2 - BSSRDF [1]

In the case of translucency, there are several subsurface scattering approaches. While light travels inside the shape, we need to account for how the varying thickness of the object influences the diffuse value. As Fig. 3 below shows, when light leaves a surface, it is diffused and attenuated according to the thickness of the shape.

Fig.3 - Translucent lighting [1]

Thus, we need a way to determine the thickness inside the surface. The most direct way is to calculate ambient occlusion and bake the local thickness into a thickness map. A thickness map like the one in Fig. 4 below is easy to generate with DCC tools.

Fig.4 - Local thickness map of Stanford Bunny

Then, we can start to implement our approximate subsurface scattering approach.

void Subsurface_Scattering(const in IncidentLight directLight, const in vec2 uv, const in vec3 geometryViewDir, const in vec3 geometryNormal, inout vec3 directDiffuse) {
  vec3 thickness = thicknessColor * texture2D(thicknessMap, uv).r;
  vec3 scatteringHalf = normalize(directLight.direction + (geometryNormal * thicknessDistortion));
  float scatteringDot = pow(saturate(dot(geometryViewDir, -scatteringHalf)), thicknessPower) * thicknessScale;
  vec3 scatteringIllu = (scatteringDot + thicknessAmbient) * thickness;
  directDiffuse += scatteringIllu * thicknessAttenuation * directLight.color;
}

The tricky part is that the exiting light travels in the direction opposite to the incident light. Therefore, we use dot(geometryViewDir, -scatteringHalf) as the light attenuation. Besides that, several parameters are worth discussing in detail.

thicknessAmbient
- Ambient light value
- Visible from all angles even at the back side of surfaces

thicknessPower
- Power value of direct translucency
- View dependent

thicknessDistortion
- Subsurface distortion
- Shift the surface normal
- View dependent

thicknessMap
- Pre-computed local thickness map
- Attenuates the back diffuse color with the local thickness map
- Can be used for both direct and indirect lighting
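
To get a feel for how these parameters interact, here is a CPU-side sketch of the same per-light term in JavaScript. It is only an illustration: the function name, the scalar thickness, and every parameter value are assumptions, not part of the three.js shader.

```javascript
// CPU-side sketch of the translucency term for one light and one surface point.
// Vectors are [x, y, z] arrays; every parameter value here is illustrative.
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const saturate = (x) => Math.min(Math.max(x, 0), 1);
const normalize = (v) => {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
};

// lightDir: surface -> light, viewDir: surface -> camera,
// thickness: value sampled from the thickness map, in [0, 1].
function translucency(lightDir, viewDir, normal, thickness, p) {
  const half = normalize([
    lightDir[0] + normal[0] * p.distortion,
    lightDir[1] + normal[1] * p.distortion,
    lightDir[2] + normal[2] * p.distortion,
  ]);
  const negHalf = [-half[0], -half[1], -half[2]];
  const direct = Math.pow(saturate(dot(viewDir, negHalf)), p.power) * p.scale;
  return (direct + p.ambient) * thickness * p.attenuation;
}

const params = { distortion: 0.1, power: 2, scale: 4, ambient: 0.25, attenuation: 0.5 };
const normal = [0, 0, 1];
const viewDir = [0, 0, 1];
const backLit = translucency([0, 0, -1], viewDir, normal, 0.5, params);  // light behind the surface
const frontLit = translucency([0, 0, 1], viewDir, normal, 0.5, params);  // light in front of the surface
console.log(backLit, frontLit);
```

With the light behind the surface the direct term dominates; with the light in front only the ambient term remains, which is why the back-lit value is much larger.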

Because the local thickness map is precomputed, it doesn't work for animated/morphing objects or concave objects. An alternative is to compute ambient occlusion in real time with inverted normals, or to generate the thickness map in real time.


Reference:
[1] GDC 2011 – Approximating Translucency for a Fast, Cheap and Convincing Subsurface Scattering Look, https://colinbarrebrisebois.com/2011/03/07/gdc-2011-approximating-translucency-for-a-fast-cheap-and-convincing-subsurface-scattering-look/
[2] Fast Subsurface Scattering in Unity Part 1,  https://www.alanzucconi.com/2017/08/30/fast-subsurface-scattering-1/
[3] Fast Subsurface Scattering in Unity Part 2,  https://www.alanzucconi.com/2017/08/30/fast-subsurface-scattering-2/

Thursday, February 22, 2018

Physically-Based Rendering in WebGL

According to the image below from Physically Based Shading at Disney, the left is real chrome, the middle is the PBR approach, and the right is Blinn-Phong. We can see that PBR is closer to the real material, and the difference lies in the specular lighting.


Blinn-Phong

The most important part of the specular term in Blinn-Phong is that it uses the half vector, dot(normal, halfDir), instead of the reflection vector used by the traditional Phong model, which avoids Phong's hard-edged highlight problem.

vec3 BRDF_Specular_BlinnPhong( vec3 lightDir, vec3 viewDir, vec3 normal, vec3 specularColor, float shininess ) {
  vec3 halfDir = normalize( lightDir + viewDir );
  float dotNH = saturate( dot( normal, halfDir ) );
  float dotLH = saturate( dot( lightDir, halfDir ) );
  vec3 F = F_Schlick( specularColor, dotLH );
  float G = G_BlinnPhong_Implicit( );
  float D = D_BlinnPhong( shininess, dotNH );
  return F * ( G * D );
}
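
To make the difference concrete, here is a small JavaScript sketch that compares only the highlight lobes of the two models (no Fresnel or geometry term). The function names and the sample vectors are assumptions chosen for illustration.

```javascript
// Compare the Phong and Blinn-Phong highlight lobes for a grazing light.
// Vectors are [x, y, z] arrays pointing away from the surface point.
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const normalize = (v) => {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
};
// Reflect the light direction L about the normal N: R = 2 (N.L) N - L.
const reflect = (L, N) => {
  const d = 2 * dot(N, L);
  return [d * N[0] - L[0], d * N[1] - L[1], d * N[2] - L[2]];
};

function phongLobe(L, V, N, shininess) {
  return Math.pow(Math.max(dot(reflect(L, N), V), 0), shininess);
}

function blinnPhongLobe(L, V, N, shininess) {
  const H = normalize([L[0] + V[0], L[1] + V[1], L[2] + V[2]]);
  return Math.pow(Math.max(dot(N, H), 0), shininess);
}

const N = [0, 0, 1];
const L = normalize([1, 0, 0.1]); // light just above the horizon
const V = L;                      // viewer on the same side as the light
const phong = phongLobe(L, V, N, 4);
const blinn = blinnPhongLobe(L, V, N, 4);
console.log(phong, blinn);
```

Here the reflection vector and the view vector are more than 90 degrees apart, so the Phong lobe is clamped to zero, while the half-vector lobe stays small but positive and fades smoothly.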

Physically-Based rendering

Regarding the GGX lighting model, the UE4 shading presentation by Brian Karis separates the Cook-Torrance specular term into three factors:

D) GGX Distribution
F) Schlick-Fresnel
V) Schlick approximation of Smith solved with GGX

float G1V(float dotNV, float k) {
  return 1.0 / (dotNV * (1.0 - k) + k);
}

float BRDF_Specular_GGX(vec3 N, vec3 V, vec3 L, float roughness, float f0) {
  float alpha = roughness * roughness;
  vec3 H = normalize(V + L);

  float dotNL = saturate(dot(N, L));
  float dotNV = saturate(dot(N, V));
  float dotNH = saturate(dot(N, H));
  float dotLH = saturate(dot(L, H));

  float F, D, vis;

  // D
  float alphaSqr = alpha * alpha;
  float pi = 3.14159;
  float denom = dotNH * dotNH * (alphaSqr - 1.0) + 1.0;
  D = alphaSqr / (pi * denom * denom);

  // F
  float dotLH5 = pow(1.0 - dotLH, 5.0);
  F = f0 + (1.0 - f0) * (dotLH5);

  // V
  float k = alpha / 2.0;
  vis = G1V(dotNL, k) * G1V(dotNV, k);

  float specular = dotNL * D * F * vis;
  return specular;
}
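
For checking values outside a shader, here is a scalar JavaScript port of this GGX specular term, with the visibility term evaluated once for N·L and once for N·V. The function names are assumptions, and the sample inputs are only a sanity check.

```javascript
// Scalar JavaScript port of the GGX specular term for quick CPU-side checks.
// Vectors are [x, y, z] arrays; N, V, and L are expected to be unit length.
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const saturate = (x) => Math.min(Math.max(x, 0), 1);
const normalize = (v) => {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
};

function g1v(dotNX, k) {
  return 1 / (dotNX * (1 - k) + k);
}

function specularGGX(N, V, L, roughness, f0) {
  const alpha = roughness * roughness;
  const H = normalize([V[0] + L[0], V[1] + L[1], V[2] + L[2]]);
  const dotNL = saturate(dot(N, L));
  const dotNV = saturate(dot(N, V));
  const dotNH = saturate(dot(N, H));
  const dotLH = saturate(dot(L, H));

  // D: GGX distribution
  const alphaSqr = alpha * alpha;
  const denom = dotNH * dotNH * (alphaSqr - 1) + 1;
  const D = alphaSqr / (Math.PI * denom * denom);

  // F: Schlick Fresnel
  const F = f0 + (1 - f0) * Math.pow(1 - dotLH, 5);

  // V: Schlick approximation of Smith, one factor per direction
  const k = alpha / 2;
  const vis = g1v(dotNL, k) * g1v(dotNV, k);

  return dotNL * D * F * vis;
}

const up = [0, 0, 1];
const headOn = specularGGX(up, up, up, 0.5, 0.04);          // light and viewer along the normal
const grazing = specularGGX(up, up, [1, 0, 0], 0.5, 0.04);  // light exactly on the horizon
console.log(headOn, grazing);
```

With the light, the viewer, and the normal aligned, D reduces to 1 / (pi * alpha^2) and F to f0, so the result is easy to verify by hand. A light on the horizon contributes nothing because of the N·L factor.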

Unreal Engine uses an approximation from Physically Based Shading on Mobile. We can see that the specular term is shortened for performance on mobile platforms. (three.js' standard material adopts this approach as well.)

half3 EnvBRDFApprox( half3 SpecularColor, half Roughness,half NoV )
{
  const half4 c0 = { -1, -0.0275, -0.572, 0.022 };
  const half4 c1 = { 1, 0.0425, 1.04, -0.04 };
  half4 r = Roughness * c0 + c1;
  half a004 = min( r.x * r.x, exp2( -9.28 * NoV ) ) * r.x + r.y;
  half2 AB = half2( -1.04, 1.04 ) * a004 + r.zw;
  return SpecularColor * AB.x + AB.y;
}
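
The approximation boils down to a scale and a bias applied to the specular color, and both depend only on roughness and N·V. A JavaScript sketch of the same arithmetic makes that easier to inspect; the function names and the { scale, bias } return shape are assumptions for illustration.

```javascript
// JavaScript port of the EnvBRDFApprox arithmetic.
// Returns the scale and bias that are applied to the specular color.
function envBRDFApprox(roughness, NoV) {
  const c0 = [-1, -0.0275, -0.572, 0.022];
  const c1 = [1, 0.0425, 1.04, -0.04];
  const r = c0.map((c, i) => roughness * c + c1[i]);
  const a004 = Math.min(r[0] * r[0], Math.pow(2, -9.28 * NoV)) * r[0] + r[1];
  return { scale: -1.04 * a004 + r[2], bias: 1.04 * a004 + r[3] };
}

// Apply the scale and bias to an RGB specular color given as [r, g, b].
function applyEnvBRDF(specularColor, roughness, NoV) {
  const { scale, bias } = envBRDFApprox(roughness, NoV);
  return specularColor.map((c) => c * scale + bias);
}

const rough = envBRDFApprox(1.0, 1.0);
const shaded = applyEnvBRDF([0.04, 0.04, 0.04], 1.0, 1.0);
console.log(rough, shaded);
```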

Result:

http://daoshengmu.github.io/dsmu/pbr/webgl_materials_pbr.html


Reference:
[1] GGX Shading Model For Metallic Reflections,  http://www.neilblevins.com/cg_education/ggx/ggx.htm
[2] Optimizing GGX Shaders with dot(L,H), http://filmicworlds.com/blog/optimizing-ggx-shaders-with-dotlh/
[3] Physically Based Shading in Call of Duty: Black Ops, http://blog.selfshadow.com/publications/s2013-shading-course/lazarov/s2013_pbs_black_ops_2_notes.pdf

Thursday, July 13, 2017

Setup TensorFlow with GPU support on Windows

TensorFlow with GPU support computes faster than the CPU-only build, but it needs some additional setup, especially for CUDA. First of all, follow the guide at https://www.tensorflow.org/install/install_windows. TensorFlow on Windows currently supports only Python 3; I suggest using Python 3.5.3 or below. Then, install CUDA 8.0 and download cuDNN v6.0.

Then, move the files from the cuDNN v6.0 package you downloaded into the folder where you installed CUDA 8.0, such as "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0", as follows:

cudnn-8.0-windows10-x64-v6.0\cuda\bin\cudnn64_6.dll -> CUDA\v8.0\bin
cudnn-8.0-windows10-x64-v6.0\cuda\include\cudnn.h -> CUDA\v8.0\include
cudnn-8.0-windows10-x64-v6.0\cuda\lib\x64\cudnn.lib -> CUDA\v8.0\lib\x64

You don't need to add the cudnn-8.0-windows10-x64-v6.0 folder to your %PATH%. Now we can confirm that the installation is ready.

Steps:
1. Create a virtualenv under your working folder:
virtualenv --system-site-packages tensorflow
2. Activate it
tensorflow\Scripts\activate
It shows (tensorflow)$
3. Install TensorFlow with GPU support
pip3 install --upgrade tensorflow-gpu
4. Import TensorFlow to confirm it is ready
(tensorflow) %YOUR_PATH%\tensorflow>python
Python 3.5.2 [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>>

If it doesn't print anything, it works.
But if you see an error message like No module named '_pywrap_tensorflow_internal', take a look at issues 9469 and 7705. It is usually a cuDNN version problem, or cuDNN can't be found. Please follow the steps I described above.

 
 
 

Saturday, August 20, 2016

Webrender 1.0

Source code: https://github.com/servo/webrender

Thursday, August 18, 2016

AR on the Web

Because of Pokémon Go, lots of people have started to discuss the possibility of AR (augmented reality) on the Web. Thanks to Jerome Etienne's slides, I got some ideas for making this AR demo.

First of all, it is based on three.js and js-aruco. three.js is a WebGL framework that helps us construct and load 3D models. js-aruco is the JavaScript version of ArUco, a minimal library for augmented reality applications based on OpenCV. These two projects make it possible to implement a Web AR proof of concept.

Next, I would like to introduce how to implement this demo. First, we need navigator.getUserMedia to give us the video stream from our webcam. This function is not supported by all browser vendors, so please take a look at its support status.


navigator.getUserMedia = ( navigator.getUserMedia ||
                       navigator.webkitGetUserMedia ||
                       navigator.mozGetUserMedia ||
                       navigator.msGetUserMedia);

if (navigator.getUserMedia) {
    navigator.getUserMedia( { 'video': true }, gotStream, noStream);
}

The code above shows how to get a media stream in JavaScript. In this demo, I only need video, and it is sent to the gotStream callback function. In gotStream, I hand the stream to my video element, which is displayed on screen, and then call setupAR(). In setupAR(), I initialize my AR module and set up my model and scene scale. After that, I just wait for new video frames and get the AR detection result from js-aruco in updateVideoStream().

In updateVideoStream(), as in the picture above, the current video frame is drawn to an imageData maintained by a Canvas2D. Next, the imageData is sent to arDetector to check whether it contains any markers. It returns an array of the markers detected in this imageData. Every marker holds the (x, y) coordinates of its corners. We can use these corner coordinates for lots of applications. In my demo, I draw the corners and the marker id on top of the video. The most interesting part is that we can use the markers to update the pose of a 3D model.

POS.Posit is a library that helps us get the pose from the corners. A pose contains a rotation matrix and a translation vector in 3D space. Therefore, it is very easy to show a 3D model on a marker, except that we need to do some coordinate conversion. First, keep in mind that the video stream is in 2D space, so we have to transform the corners to 3D space.


for (i = 0; i < corners.length; ++ i){
   corner = corners[i];
   // from 2D canvas space to 3D world space
   corner.x = corner.x - (canvas.width / 2);
   corner.y = (canvas.height/2) - corner.y;
}
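
The same conversion can be written as a small function that leaves the detected corners untouched, which is handy if they are still needed for drawing on the 2D canvas. This is only a sketch; the function name and the sample corner values are assumptions.

```javascript
// Convert marker corners from canvas pixels (origin top-left, y pointing down)
// to a centered space (origin at the canvas center, y pointing up).
// Returns new objects instead of modifying the input corners.
function centerCorners(corners, width, height) {
  return corners.map((c) => ({ x: c.x - width / 2, y: height / 2 - c.y }));
}

// Sample corners covering a whole 640x480 canvas.
const detected = [
  { x: 0, y: 0 },
  { x: 640, y: 0 },
  { x: 640, y: 480 },
  { x: 0, y: 480 },
];
const centered = centerCorners(detected, 640, 480);
console.log(centered);
```
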
Moreover, we need to apply this rotation matrix to the 3D model's rotation vector.
   dae.rotation.x = -Math.asin(-rotation[1][2]);
   dae.rotation.y = -Math.atan2(rotation[0][2], rotation[2][2]) - 90;
   dae.rotation.z = Math.atan2(rotation[1][0], rotation[1][1]);
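
Keep in mind that three.js Euler rotations are expressed in radians, so any fixed offset has to be in radians too. The sketch below wraps the same extraction in a function with an optional y offset; the function name, the offset parameter, and the quarter-turn value are assumptions for illustration.

```javascript
// Extract Euler angles (radians) from a 3x3 rotation matrix given as nested
// arrays, using the same matrix elements as the demo code.
function rotationToEuler(rotation, yOffset = 0) {
  return {
    x: -Math.asin(-rotation[1][2]),
    y: -Math.atan2(rotation[0][2], rotation[2][2]) - yOffset,
    z: Math.atan2(rotation[1][0], rotation[1][1]),
  };
}

const identity = [
  [1, 0, 0],
  [0, 1, 0],
  [0, 0, 1],
];
const quarterTurn = Math.PI / 2; // 90 degrees expressed in radians
const euler = rotationToEuler(identity, quarterTurn);
console.log(euler);
```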

Finally, set the position of the 3D model.
   dae.position.x = translation[0];
   dae.position.y = translation[1];
   dae.position.z = -translation[2] * offsetScale;


Demo video: https://www.youtube.com/watch?v=68O5w1oIURM
Demo link: http://daoshengmu.github.io/ConsoleGameOnWeb/webar.html (Best for Firefox)

Sunday, July 17, 2016

How to setup RustDT

RustDT is an IDE for Rust. If, like me, you need an IDE to learn a language and develop efficiently, you should give RustDT a try (https://github.com/RustDT/RustDT/blob/latest/documentation/UserGuide.md#user-guide).

Enable code completion.

Here you go!

Sunday, January 31, 2016

WebGL/VR on Worker thread

WebGL on main thread


Previously, when developing a WebGL application, the only approach was to put everything on the main thread. But that definitely limits performance. As the picture above shows, a 3D game might need to do lots of work in a single frame update: for example, updating the transformations of 3D objects, visibility culling, AI, networking, physics, and so on. Only then can we hand over to the render process to execute WebGL functions.

If we expect all of this to finish within the V-Sync time (16 ms), it is a challenge for developers. Therefore, people have been looking for a way to move the performance bottleneck to other threads, and WebGL on a worker thread was born from this situation. But please don't assume that putting everything into a WebWorker will solve your problems completely; it brings some new challenges as well. Below, I would like to show you how to use WebGL on a worker to increase performance, and give you a physics WebGL demo based on three.js and cannon.js, which also proves that it can be integrated with the WebVR API.

WebWorker

First of all, I would like to introduce how to use WebWorker. WebWorker lets you execute your script on another thread, which avoids pauses from the JavaScript virtual machine's garbage collector. Therefore, it is a good way for developers to solve performance bottlenecks. The sample code is below:

worker = new Worker("js/worker.js"); // load worker script
worker.onmessage = function( evt ) { // The receiver of worker's message
    //console.log('Message received from worker ' + evt.data );
};

worker.postMessage( { test: 'webgl_offscreen' } ); // Send message to worker

In worker.js
onmessage = function(evt) {
    //console.log( 'Message received from main script' );
    postMessage( 'Send script to the main script.' ); // Post message back to the main thread.
}

These scripts look quite simple, and we can start putting some computation into the onmessage function in worker.js to relieve the main thread. However, WebWorker also brings some constraints, which make it less convenient than ordinary JavaScript on the main thread.

The limitations of WebWorker are:
 - Can't read/write DOM
 - Can't access the main thread's global variables / functions
 - Can't use file system (file://) to access local files
 - No requestAnimationFrame

WebGL on worker

Now that we understand how to use WebWorker and its constraints, let's make our first WebGL on worker application.

The benefit of a worker is that we can move part of our computation to another thread. In the case of a WebGL worker, we can put the WebGL function calls on the worker thread. So in the example in the picture above, I put my render part into the WebGL worker. Firefox Nightly has landed the OffscreenCanvas feature to support WebGL on a worker thread. To use this feature, we need some setup:
  • Download Firefox Nightly
  • Open about:config and set gfx.offscreencanvas.enabled to true
Now WebGL on worker is enabled, so let's finish it. The sample code is below.
var canvas = document.getElementById('c');
canvas.width = window.innerWidth;
canvas.height = window.innerHeight;

var proxy = canvas.transferControlToOffscreen();   // new interface added by offscreencanvas for getting offscreen canvas
var worker = new Worker("js/gl_worker.js");
var positions = new Float32Array(num*3);           // Transferable buffers holding the transformation info
var quaternions = new Float32Array(num*4);         // that the update/render functions on the main/worker
                                                   // threads read and write.
var cameraState = new Float32Array(7);             // Camera state for the update/render functions

worker.onmessage = function( evt ) {               // worker message receiving function
    if ( evt.data.positions && evt.data.quaternions
    && evt.data.cameraState ) {
    
      positions = evt.data.positions;
      quaternions = evt.data.quaternions;
      cameraState = evt.data.cameraState;
      updateWorker();
    }
}

worker.postMessage( { canvas: proxy }, [proxy]);    // Send offscreenCanvas to worker

function updateWorker() {
    // Update camera state

    // Update position, quaternion

    // Send these buffers back to the worker
    worker.postMessage( { cameraState: cameraState, positions: positions, quaternions: quaternions }, 
    [cameraState.buffer, positions.buffer, quaternions.buffer]);
}


In gl_worker.js
var renderer;
var canvas;
var scene = null;
var camera = null;

onmessage = function( evt ) {      // Receiving messages from the main thread
  var window = self;
  
  if ( typeof evt.data.canvas !== 'undefined') {
    console.log( 'import script... ' );
    importScripts('../lib/three.js');             // load script at worker
    importScripts('../js/threejs/VREffect.js');
    importScripts('../js/threejs/TGALoader.js');

    canvas = evt.data.canvas;
    renderer = new THREE.WebGLRenderer( { canvas: canvas } ); // Initialize THREE.js WebGLRenderer
    scene = new THREE.Scene();
    camera = new THREE.PerspectiveCamera( 30, canvas.width / canvas.height, 0.5, 10000 );

    window.addEventListener( 'resize', onWindowResize, false ); // Register 'resize' event

    // Get the buffers that are sent from the main thread.
    var cameraState = evt.data.cameraState;
    var positions = evt.data.positions;
    var quaternions = evt.data.quaternions;
    camera.position.set( cameraState[0], cameraState[1], cameraState[2] );
    camera.quaternion.set( cameraState[3], cameraState[4], cameraState[5], cameraState[6] );

    for ( var i = 0; i < visuals.length; i++ ) {    // Setup transformation info for visual objects in scene
      visuals[i].position.set(
        positions[3 * i + 0],
        positions[3 * i + 1],
        positions[3 * i + 2] );

      visuals[i].quaternion.set(
        quaternions[4 * i + 0],
        quaternions[4 * i + 1],
        quaternions[4 * i + 2],
        quaternions[4 * i + 3] );
    }

    render();        // Rendering is driven by the main thread's requestAnimationFrame

    // Send the transferable objects back to the main thread
    postMessage( { cameraState: cameraState, positions: positions, quaternions: quaternions },
      [ cameraState.buffer, positions.buffer, quaternions.buffer ] );
  }
}

function render() {
    renderer.render( scene, camera );
    renderer.context.commit();       // New for webgl worker to end this frame
}

function onWindowResize( width, height ) {  // Resize window listener
  canvas.width = width;
  canvas.height = height;
  camera.aspect = canvas.width / canvas.height;
  camera.updateProjectionMatrix();
  renderer.setSize( canvas.width, canvas.height, false );
}
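
Both sides pass their typed arrays in a transfer list, which moves the underlying buffers instead of copying them. That is why the buffers have to be sent back each frame: after a transfer, the sender's arrays are empty. The sketch below shows this without a second thread by using structuredClone, which accepts the same kind of transfer list as postMessage. It assumes an environment that provides structuredClone (recent browsers, Node.js 17 or later); packPositions and the sample objects are made up for illustration.

```javascript
// Pack per-object positions into one Float32Array (3 floats per object),
// matching the positions[3 * i + n] layout used by the render code.
function packPositions(objects) {
  const out = new Float32Array(objects.length * 3);
  objects.forEach((o, i) => out.set([o.x, o.y, o.z], 3 * i));
  return out;
}

const positions = packPositions([
  { x: 1, y: 2, z: 3 },
  { x: 4, y: 5, z: 6 },
]);

// Transfer the buffer the way postMessage( message, [ buffer ] ) would.
const received = structuredClone({ positions }, { transfer: [positions.buffer] });

console.log(positions.length);               // the sender's view is detached and empty
console.log(Array.from(received.positions)); // the receiver now owns the data
```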

WebVR on Worker



Although most WebVR parameters live in the DOM API and the worker thread can't read them directly, this is not a big deal: we can read them on the main thread and pass them to the worker.

In the main thread
var vrHMD;
function gotVRDevices( devices ) {
vrHMD = devices[ 0 ];
worker.postMessage( {        // Pass them to the worker
    eyeTranslationL: eyeTranslationL.x, 
    eyeTranslationR: eyeTranslationR.x, 
    eyeFOVLUp: eyeFOVL.upDegrees, eyeFOVLDown: eyeFOVL.downDegrees, 
    eyeFOVLLeft: eyeFOVL.leftDegrees, eyeFOVLRight: eyeFOVL.rightDegrees, 
    eyeFOVRUp: eyeFOVR.upDegrees, eyeFOVRDown: eyeFOVR.downDegrees, 
    eyeFOVRLeft: eyeFOVR.leftDegrees, eyeFOVRRight: eyeFOVR.rightDegrees });
}

function updateVR() {       // Update camera orientation via VR state
  var state = vrPosSensor.getState();

  if ( state.hasOrientation ) {
    camera.quaternion.set(
      state.orientation.x,
      state.orientation.y,
      state.orientation.z,
      state.orientation.w);
  }
}

function triggerFullscreen() {
    canvas.mozRequestFullScreen( { vrDisplay: vrHMD } );  // Fullscreen must be requested at the main thread.
}                                                         // Thankfully, it works for WebGL on worker.

In worker.js
var vrDeviceEffect = new THREE.VREffect(renderer);

onmessage = function(evt) {                // Receive the VR device parameters for stereo rendering.
    vrDeviceEffect.eyeTranslationL.x = evt.data.eyeTranslationL;
    vrDeviceEffect.eyeTranslationR.x = evt.data.eyeTranslationR;
    vrDeviceEffect.eyeFOVL.upDegrees = evt.data.eyeFOVLUp;
    vrDeviceEffect.eyeFOVL.downDegrees = evt.data.eyeFOVLDown;
    vrDeviceEffect.eyeFOVL.leftDegrees = evt.data.eyeFOVLLeft;
    vrDeviceEffect.eyeFOVL.rightDegrees = evt.data.eyeFOVLRight;
    vrDeviceEffect.eyeFOVR.upDegrees = evt.data.eyeFOVRUp;
    vrDeviceEffect.eyeFOVR.downDegrees = evt.data.eyeFOVRDown;
    vrDeviceEffect.eyeFOVR.leftDegrees = evt.data.eyeFOVRLeft;
    vrDeviceEffect.eyeFOVR.rightDegrees = evt.data.eyeFOVRRight;
}
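
The pattern on both sides is the same: copy plain numbers out of the DOM objects on the main thread, then write them back into the effect on the worker. A more compact sketch of that round trip follows. The helper names, the message shape, and the sample values are assumptions, and structuredClone stands in for the copy that postMessage makes (it needs a recent browser or Node.js 17 or later).

```javascript
// Main thread: copy plain numbers out of the VR objects so they can be posted.
function packEyeParameters(eyeTranslationL, eyeTranslationR, eyeFOVL, eyeFOVR) {
  const fov = (f) => [f.upDegrees, f.downDegrees, f.leftDegrees, f.rightDegrees];
  return {
    translation: [eyeTranslationL.x, eyeTranslationR.x],
    fovL: fov(eyeFOVL),
    fovR: fov(eyeFOVR),
  };
}

// Worker: write the numbers back into an effect-like object.
function applyEyeParameters(effect, msg) {
  const setFov = (target, v) => {
    [target.upDegrees, target.downDegrees, target.leftDegrees, target.rightDegrees] = v;
  };
  effect.eyeTranslationL.x = msg.translation[0];
  effect.eyeTranslationR.x = msg.translation[1];
  setFov(effect.eyeFOVL, msg.fovL);
  setFov(effect.eyeFOVR, msg.fovR);
}

// Made-up sample values standing in for what the VR device reports.
const sampleFov = { upDegrees: 53, downDegrees: 53, leftDegrees: 47, rightDegrees: 46 };
const message = packEyeParameters({ x: -0.032 }, { x: 0.032 }, sampleFov, sampleFov);

// Stand-in for the worker-side effect object.
const effect = { eyeTranslationL: {}, eyeTranslationR: {}, eyeFOVL: {}, eyeFOVR: {} };
applyEyeParameters(effect, structuredClone(message));
console.log(effect);
```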

Others

Besides WebGL and WebVR, I solved some other problems while making this demo. I list them here and discuss how I solved them:
  - Can’t access DOM (read / modify)
    var workerCanvas = canvas.transferControlToOffscreen();
    worker.postMessage( {canvas: workerCanvas}, [workerCanvas] );
  - Can’t use filesystem (file://) to access local files
    Use XMLHttpRequest. Taking texture loading as an example, in three.js we need to use
var loader = new THREE.TGALoader();
var texture = loader.load( 'images/brick_bump.tga' );
var solidMaterial = new THREE.MeshLambertMaterial( { map: texture } );
  - No requestAnimationFrame
    Updating the transferable objects and rendering have to go through worker.onmessage, so we have to drive the worker update from the main thread's requestAnimationFrame. This limitation means the worker can be blocked by the main thread: GC pauses may happen on the main thread and stall the worker's updates. The best solution is to look forward to the implementation of requestAnimationFrame for workers.

Demo

Physics/WebGL on the main thread
Physics on the main thread, WebGL on worker
Source code