Skip to main content

Setup TensorFlow with GPU support on Windows

TensorFlow with GPU support brings higher speed for computation than CPU-only. But, you need some additional settings especially for CUDA. First of all, we need to follow the guideline from https://www.tensorflow.org/install/install_windows. TensorFlow on Windows currently only has Python 3 support, I suggest to use python 3.5.3 or below. Then, install CUDA 8.0 and download cuDNN v6.0.

Then, move the files from cuDNN v6.0 that you already download to the path where you installed CUDA 8.0, like "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0", following the steps as below:

cudnn-8.0-windows10-x64-v6.0\cuda\bin\cudnn64_6.dll ---------------- CUDA\v8.0\bin
cudnn-8.0-windows10-x64-v6.0\cuda\include\cudnn.h ---------------------- CUDA\v8.0\include
cudnn-8.0-windows10-x64-v6.0\cuda\lib\x64\cudnn.lib --------------------- CUDA\v8.0\lib\x64

Don't need to add the folder path of cudnn-8.0-windows10-x64-v6.0 to your %PATH%. Now, we can start to confirm our installation is ready.

Steps:
1. Create a virtualenv under your working folder:
virtualenv --system-site-packages tensorflow
2. Activate it
tensorflow\Scripts\activate
It shows (tensorflow)$
3. Install TensorFlow with GPU support
pip3 install --upgrade tensorflow-gpu
4. Import TensorFlow to confirm it is ready
(tensorflow) %YOUR_PATH%\tensorflow>python
Python 3.5.2 [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>>

If it doesn't show anything, that means it works.
But, if you see error messages like, No module named '_pywrap_tensorflow_internal', you can take a look at issue 9469, 7705. It should be the cudnn version problem or cudnn can't be found. Please follow the method that I mentioned above.

 
 
 

Comments

Popular posts from this blog

tex2D vs. tex2Dproj

texCoord(  texX, texY, texZ , texW ) means texture coordinate after transforming from the texture matrix [ position goes through world, view, projection, and texture ] . We can use texCoord to fetch texture: 1.  float4 color = tex2D( sampler,  float2( texX / texW, texY/texW ); 2.  float4 color = tex2Dporj( sampler, float4( texX, texY, texZ, texW ) ); The top-two methods will have the same result. Because tex2Dproj operator supports divide w in its interface. http://www.gamedev.net/community/forums/topic.asp?topic_id=408894 http://bbs.gameres.com/showthread.asp?threadid=104316

Fast subsurface scattering

Fig.1 - Fast Subsurface scattering of Stanford Bunny Based on the implementation of three.js . It provides a cheap, fast, and convincing approach to do ray-tracing in translucent surfaces. It refers the sharing in GDC 2011  [1], and the approach is used by Frostbite 2 and Unity engines [1][2][3]. Traditionally, when a ray intersects with surfaces, it needs to calculate the bouncing result after intersections. Materials can be divided into three types roughly.  Opaque , lights can't go through its geometry and the ray will be bounced back.  Transparency , the ray passes and allow it through the surface totally, it probably would loose a little energy after leaving.  Translucency , the ray after entering the surface will be bounced internally like below Fig. 2. Fig.2 - BSSRDF [1] In the case of translucency, we have several subsurface scattering approaches to solve our problem. When a light is traveling inside the shape, that needs to consider the diffuse value influe

Bringing Large Scale Console Game to iOS

Y8ObbsOsNQEhMCEhMCEt The Bard's Tale Why? Device are fast enough Market segment is not as crowded Hugh potential user base Rich back data log of content Potentially very low cost Session1: port process - application framework  - development workflow Session 2 port-port - performance/memory opt. OpenGL Cocoa Touch App Bring it all together Workflow: Data deployment OpenAL limit 32 active sources, DirectSound 256 channels on XBOX -Sluggish -5.8G -> 2Glimited on iOS 60Hz?    - terriable for battery life 30 Hz   - Game may depend on 60Hz Hybrid    -   60hz, functionsliity intact   - 30hz : low GPU/ save energy    -60hz, at 5th device is optional VFP?? Candidate for NEON SIMD SGX GPUs: a few stats Render opt. -minimal vertex format size -texture size and mipmapping Alpha test and SGX - Fragment discard expensive -huge impact on fill rate -use alpha blend at all costs Eliminating Alpha blend