I got excited recently about deep neural networks (DNNs). I did some research and found out that running a DNN on a GPU can be about 20X faster than on a CPU. Wow! That means you can set up a mini supercomputer at home. I went straight to the shop and spent $200 on an Nvidia GTX 750 Ti video card with 640 CUDA cores and compute capability 5.0 (very nice). It turns out that Nvidia provides the CUDA SDK for running all kinds of scientific programs on the GPU. There is also OpenCL, which runs on both ATI Radeon and Nvidia cards, but it has more limited support and fewer libraries.

Caffe is a fast, high-performance deep neural network library. Its GPU mode requires an Nvidia GPU with CUDA support. I decided to install Caffe on a 64 bit machine running Ubuntu with 16 GB of RAM and an Intel i5 quad-core CPU. Running Caffe on Ubuntu makes everything much easier; running it on MacOS or Windows is a real pain in the ass!!

Before we start, let's quickly check whether you have an Nvidia GPU installed in your machine.

lspci | grep -i nvidia
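
Since Caffe also needs a 64 bit system, both checks can be rolled into one small script. This is only a sketch: the helper names has_nvidia_gpu and is_64bit are made up for this post, and it assumes a 64 bit x86 machine reports x86_64 from uname -m.

```shell
# Report whether an Nvidia device appears in lspci-style text read from stdin.
has_nvidia_gpu() {
    if grep -qi nvidia; then echo "yes"; else echo "no"; fi
}

# Report whether an architecture string (as printed by `uname -m`) is 64 bit x86.
is_64bit() {
    if [ "$1" = "x86_64" ]; then echo "yes"; else echo "no"; fi
}

# Only query the PCI bus if lspci is actually installed.
if command -v lspci >/dev/null 2>&1; then
    echo "Nvidia GPU present: $(lspci | has_nvidia_gpu)"
fi
echo "64 bit system: $(is_64bit "$(uname -m)")"
```

If either line says "no", it is probably not worth continuing with the GPU setup on that machine.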

Install CUDA SDK

Download the CUDA .run file for 64 bit Linux (about 900 MB) from the link below. Caffe will not work properly on a 32 bit architecture.

https://developer.nvidia.com/cuda-downloads

This package contains the Nvidia driver, the CUDA Toolkit and the CUDA samples. Note that

apt-get install cuda

will also work, but it does not come with the samples and other extras, so I recommend the .run installation.

Let's prepare our system.

sudo apt-get update
sudo apt-get install g++ gcc build-essential

Right click on the downloaded .run file and mark it as executable under Permissions.
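
If you prefer the terminal, the same thing can be done with chmod. A minimal sketch: the helper name make_executable is made up, and the Downloads location is an assumption, so adjust the path to wherever your browser saved the file.

```shell
# Mark a downloaded installer as executable; print a message if the file is missing.
make_executable() {
    if [ -f "$1" ]; then
        chmod +x "$1" && echo "ok: $1"
    else
        echo "not found: $1"
    fi
}

# File name taken from the installer command used in this post.
make_executable "$HOME/Downloads/cuda_6.5.0_linux_64.run"
```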

Switch to a text console (tty) by pressing Ctrl+Alt+F1, then stop the X server graphical interface

sudo service lightdm stop

then run the installer

sudo ./cuda_6.5.0_linux_64.run

Finish the installation by installing the Nvidia driver, toolkit and samples, then restart the X server

sudo service lightdm start

Add these extra libraries, which are needed for CUDA

sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

Finally, we need to set up the system-wide paths so that the CUDA compiler and libraries can be found

sudo nano /etc/bash.bashrc

and copy and paste these lines

export PATH=/usr/local/cuda-6.5/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64:$LD_LIBRARY_PATH
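
After opening a new terminal, it is worth checking that the CUDA tools are actually reachable through PATH. A small sketch, with check_tool being a made-up helper name:

```shell
# Print whether a command can be found through PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

check_tool nvcc         # CUDA compiler, installed with the toolkit
check_tool nvidia-smi   # driver status tool, installed with the Nvidia driver
```

If nvcc is found, running nvcc --version prints the toolkit release, which should match the version you installed.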

Installing cuDNN

cuDNN is Nvidia's GPU-accelerated library of deep neural network primitives. Caffe can use it on top of CUDA, and it is very much recommended for speed. Go to https://developer.nvidia.com/cuDNN, download the .tar.gz file and extract it.
There are 2 things to do with the extracted files.
1. Copy all the library files (everything except cudnn.h) to /usr/local/cuda-6.5/lib64
2. Copy cudnn.h to /usr/local/cuda-6.5/include
That's it!
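
The two copy steps can be sketched as a small shell function. This assumes the extracted archive is a flat directory holding cudnn.h next to the libcudnn* files, which may differ between cuDNN releases; install_cudnn is a made-up name.

```shell
# Copy cuDNN into a CUDA tree: header to include/, libraries to lib64/.
# $1 = extracted cuDNN directory, $2 = CUDA root (e.g. /usr/local/cuda-6.5)
install_cudnn() {
    src="$1"
    cuda_root="$2"
    cp "$src/cudnn.h" "$cuda_root/include/" || return 1
    # -P copies symlinks such as libcudnn.so as links instead of duplicating files.
    cp -P "$src"/libcudnn* "$cuda_root/lib64/" || return 1
    echo "cuDNN copied to $cuda_root"
}

# Writing to /usr/local needs root, and sudo cannot call a shell function
# directly, so run this from a root shell (sudo -s). The source directory
# below is hypothetical:
# install_cudnn ./cudnn-extracted /usr/local/cuda-6.5
```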
Now that our CUDA environment is set up, we move on to installing Caffe.

Installing Caffe

Download Caffe from http://caffe.berkeleyvision.org and follow the installation instructions at http://caffe.berkeleyvision.org/installation.html
1. Download and unpack the caffe-master.tar.gz file
2. Install all the dependencies and libraries needed for Caffe on Ubuntu
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler

3. Rename Makefile.config.example to Makefile.config and make the necessary changes.

Since we have cuDNN installed, uncomment the USE_CUDNN line. If you would rather use only the CPU, uncomment the CPU_ONLY line instead.

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++
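
The same edit can be scripted instead of done by hand. A sketch, assuming the example file ships with the line commented out as "# USE_CUDNN := 1" and that GNU sed is available (as on Ubuntu); enable_cudnn is a made-up name. It copies the example rather than renaming it, so the original stays around.

```shell
# Create Makefile.config from the example and switch on cuDNN.
# $1 = directory containing Makefile.config.example (the Caffe source root)
enable_cudnn() {
    cp "$1/Makefile.config.example" "$1/Makefile.config" || return 1
    # Remove the leading "# " from the USE_CUDNN line; other lines are untouched.
    sed -i 's/^# *USE_CUDNN := 1/USE_CUDNN := 1/' "$1/Makefile.config"
}

# Run from inside the unpacked source tree. Note that this overwrites any
# existing Makefile.config:
# enable_cudnn .
```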

4. Finally we compile caffe
make all
make test
make runtest

Also build the Python module and distribution libraries.

make pycaffe
make distribute
It should compile in about 5-10 minutes, and if you don't see any errors, Caffe is ready.
Note: I have tested Caffe on 32 bit Ubuntu and it crashes, because CUDA support for 32 bit machines is limited. So be sure you have 64 bit Ubuntu installed.

Python module

Later you might want to work with Caffe from Python. Before you do, set the correct path, for example by adding this line to your bashrc file (such as ~/.bashrc)
export PYTHONPATH=/home/me/Desktop/caffe/python
and finally in Python
import caffe
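
A plain export replaces whatever PYTHONPATH was set before. If you already have other entries there, a slightly more careful version prepends the Caffe directory instead. This is only a sketch; add_pythonpath is a made-up helper, and the path is the one used in this post.

```shell
# Prepend a directory to PYTHONPATH without discarding entries already set.
add_pythonpath() {
    case ":${PYTHONPATH:-}:" in
        *":$1:"*) ;;    # already present, nothing to do
        *) PYTHONPATH="$1${PYTHONPATH:+:$PYTHONPATH}" ;;
    esac
    export PYTHONPATH
}

add_pythonpath "$HOME/Desktop/caffe/python"   # adjust to where you unpacked Caffe

# Quick check once the pycaffe build has finished:
# python -c 'import caffe; print(caffe.__file__)'
```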
5. Let's test Caffe by running a DNN on the MNIST dataset
Run these scripts to download the MNIST dataset and convert the image data to lmdb
cd caffe-master
./data/mnist/get_mnist.sh 
./examples/mnist/create_mnist.sh

Run the trainer

./examples/mnist/train_lenet.sh

You will see the deep neural network train for 10,000 iterations on the MNIST dataset.

Accuracy: 99.04%
Iterations: 10,000
Time (CPU): 19 min 34 s
Time (GPU): 1 min 04 s

Wow! That's roughly an 18X speedup on the GPU, close to the 20X I had read about.
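
For anyone who wants to check the arithmetic, the speedup is just the CPU time divided by the GPU time, both converted to seconds. A tiny sketch using awk (speedup is a made-up function name):

```shell
# Print the CPU/GPU speedup to one decimal place.
# $1, $2 = CPU minutes and seconds; $3, $4 = GPU minutes and seconds
speedup() {
    awk -v cm="$1" -v cs="$2" -v gm="$3" -v gs="$4" \
        'BEGIN { printf "%.1f\n", (cm * 60 + cs) / (gm * 60 + gs) }'
}

speedup 19 34 1 4   # 1174 s / 64 s, prints 18.3
```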