TensorFlow Configuration and Optimization Notes

Notes on installing TensorFlow on Linux, with GPU support enabled.

Background

TensorFlow is the second-generation ML framework from Google. (See this comparison of deep learning software.) The current state-of-the-art image recognition models (inception-v3) use this framework.

Prerequisites

Assuming Fedora 24 with an Nvidia GTX 1060 installed, running the nvidia driver as opposed to nouveau. See the Fedora 24 Notes, and RPM Fusion’s installation page, for installing the Nvidia drivers. In sum,

dnf install -y xorg-x11-drv-nvidia akmod-nvidia "kernel-devel-uname-r == $(uname -r)"
dnf install xorg-x11-drv-nvidia-cuda
dnf install vulkan

Afterwards, install some devel packages.

dnf install -y vulkan-devel

Download the Nvidia CUDA Toolkit. The version used for this install is 8.0.61, via the network install for Fedora x86_64.
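As a rough sketch, the network install amounts to enabling Nvidia's repo and pulling in the cuda metapackage. The repo package filename below is illustrative; use whatever the CUDA download page provides for the release actually installed.

sudo rpm -i cuda-repo-fedora23-8.0.61-1.x86_64.rpm   # illustrative filename from the download page
sudo dnf clean expire-cache
sudo dnf install -y cuda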

This version of the CUDA Toolkit is not C++11/C++14/C++17 aware. So, be aware! One way around this is to patch the toolkit's GCC version check as below, and compile with -std=gnu++98.

117c117,118
< #if __GNUC__ > 5
---
> /* bkoz use -std=c++98 if necessary */
> #if __GNUC__ > 6
Next, compile top-of-tree OpenCV (i.e., 3.2) with CUDA enabled. To do so, use the following configure invocation, modified for the paths on your system:

cmake -DVERBOSE=1 -DCMAKE_CXX_FLAGS="${CMAKE_CXX_FLAGS} -std=gnu++98 -Wno-deprecated-gpu-targets" -D BUILD_EXAMPLES=1 -D BUILD_DOCS=1 -D WITH_OPENNI=1 -D WITH_CUDA=1 -D CUDA_FAST_MATH=1 -D WITH_CUBLAS=1 -D WITH_FFMPEG=1 -D WITH_EIGEN=1 -D ENABLE_FAST_MATH=1 -D ENABLE_SSE3=1 -D ENABLE_AVX=1 -D CMAKE_BUILD_TYPE=RELEASE -D ENABLE_PRECOMPILED_HEADERS=OFF  -D CMAKE_INSTALL_PREFIX=/home/bkoz/bin/H-opencv -D OPENCV_EXTRA_MODULES_PATH=/home/bkoz/src/opencv_contrib.git/modules /home/bkoz/src/opencv.git/

Admittedly, this abuse of CMAKE_CXX_FLAGS is not optimal. Maybe EXTRA_CXX_FLAGS?
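Once the configure step succeeds, the build and install are the usual cmake-generated Makefile steps, run from the build directory (a sketch):

make -j"$(nproc)"
make install   # installs under the CMAKE_INSTALL_PREFIX given above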

Now, for Nvidia cuDNN. The version used for this install is 5.1.
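The cuDNN download is a tarball dropped on top of the CUDA install. A minimal sketch, assuming the default /usr/local/cuda prefix and the v5.1 tarball name from Nvidia's download page:

tar xzf cudnn-8.0-linux-x64-v5.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo ldconfig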

When that is done, use pip to install TensorFlow.

sudo pip install --upgrade pip;
sudo pip install tensorflow-gpu

This should output something like:

Collecting tensorflow-gpu
  Downloading tensorflow_gpu-0.12.1-cp27-cp27mu-manylinux1_x86_64.whl (89.7MB)
    100% |████████████████████████████████| 89.7MB 19kB/s 
Requirement already satisfied: mock>=2.0.0 in /usr/lib/python2.7/site-packages (from tensorflow-gpu)
Requirement already satisfied: six>=1.10.0 in /usr/lib/python2.7/site-packages (from tensorflow-gpu)
Requirement already satisfied: numpy>=1.11.0 in /usr/lib64/python2.7/site-packages (from tensorflow-gpu)
Collecting protobuf>=3.1.0 (from tensorflow-gpu)
  Downloading protobuf-3.2.0-cp27-cp27mu-manylinux1_x86_64.whl (5.6MB)
    100% |████████████████████████████████| 5.6MB 284kB/s 
Collecting wheel (from tensorflow-gpu)
  Downloading wheel-0.29.0-py2.py3-none-any.whl (66kB)
    100% |████████████████████████████████| 71kB 3.3MB/s 
Requirement already satisfied: pbr>=0.11 in /usr/lib/python2.7/site-packages (from mock>=2.0.0->tensorflow-gpu)
Requirement already satisfied: funcsigs>=1 in /usr/lib/python2.7/site-packages (from mock>=2.0.0->tensorflow-gpu)
Requirement already satisfied: setuptools in /usr/lib/python2.7/site-packages (from protobuf>=3.1.0->tensorflow-gpu)
Installing collected packages: protobuf, wheel, tensorflow-gpu
Successfully installed protobuf-3.2.0 tensorflow-gpu-0.12.1 wheel-0.29.0
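A quick sanity check that the GPU build is actually in use (a sketch; the exact log text varies by version) is to import the module and open a session, which should log the GTX 1060 being picked up:

python -c 'import tensorflow as tf; print(tf.__version__)'
python -c 'import tensorflow as tf; tf.Session()'   # expect a "Found device 0 ... GTX 1060"-style log line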

After this has completed, add in Keras.
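A sketch of that step; the message printed on import assumes TensorFlow is the configured backend in ~/.keras/keras.json:

sudo pip install keras
python -c 'import keras'   # should print "Using TensorFlow backend."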

Optimization

For Nvidia GPUs, take a look at this interesting post from Netflix. In sum, add the nvidia kernel module option

NVreg_CheckPCIConfigSpace=0
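One common way to set that option (the file path below is just the usual modprobe.d convention; a sketch) is a drop-in config, then a reboot or reload of the nvidia module:

echo "options nvidia NVreg_CheckPCIConfigSpace=0" | sudo tee /etc/modprobe.d/nvidia-tuning.conf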

Notes on the Deep Deep Deepest


Reading


Approaches

  • SVM (Support Vector Machines)
  • RBM (Restricted Boltzmann Machines)
  • NN/Convolutional NN/DNN


Silicon Valley Fun

TensorFlow Dev Summit

February 15, 2017 @ googleplex, Mountain View, CA


Software

theano

  • python-theano-doc
  • python3-theano
  • python2-theano

tensorflow

  • TensorFlow
  • github
  • Current models for facial recognition include VGG-19, VGG-16, and inception-v3. Of the listed models, inception-v3 seems to have the advantage, at least as of early 2017.

keras

scikit-learn

opencv


GPU Hardware

Recommended GPUs are: Nvidia GTX 1080, 1070, 980, and 970. Maximize CUDA cores.