
Google Cloud TF-GPU Setup for Ubuntu 16.04

Prahlad Chari

pnchari@connect.ust.hk


PART 1


#1) Create an Instance


#2) Select appropriate servers


us-east1-d

asia-east1-a

europe-west1-b

The above servers are where you will find the option to enable GPU usage...
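Creating a GPU instance in one of these zones can also be sketched from the gcloud CLI. The instance name, image flags, and GPU count below are assumptions for illustration, and the command is only echoed here rather than run:

```shell
#!/bin/sh
# Sketch only: builds and prints the gcloud command instead of running it.
# ZONE must be one of the GPU-enabled zones listed above; the instance
# name "tf-gpu-instance" and count=1 are arbitrary choices.
ZONE="us-east1-d"
GPU_TYPE="nvidia-tesla-k80"
CMD="gcloud compute instances create tf-gpu-instance \
  --zone ${ZONE} \
  --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
  --accelerator type=${GPU_TYPE},count=1 \
  --maintenance-policy TERMINATE"
echo "${CMD}"
```

GPU instances cannot live-migrate during host maintenance, which is why a TERMINATE maintenance policy is needed.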


#3) Choosing GPU option


#4) Configuring your GPU

  • 1, 2, 4, or 8 GPUs

  • Minimum cost = $590.61


PART 2


Brief Overview

1) Install NVIDIA Cuda

2) Install NVIDIA cuDNN

3) Install and upgrade PIP

4a) Install TensorFlow (without Bazel)

4b) Install Bazel

4b2) Install TensorFlow

4b3) Upgrade Protobuf

5) Test Installation


#1) Install Cuda

  • https://developer.nvidia.com/cuda-downloads → Go to this link
  • Linux → x86_64 → Ubuntu → 16.04 → deb [network]
  • Right-click the base installer and copy the link
  • Use wget command to download the base installer in your instance
  • Follow the instructions under the installer
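As a sketch, the download-and-install sequence for the network deb looks roughly like the following. The .deb filename below is an assumption (it matches the CUDA 8.0 network installer naming); always copy the real link from the downloads page, and the commands here are echoed rather than executed:

```shell
#!/bin/sh
# Sketch only: prints the install sequence instead of running it.
# The .deb name is the assumed CUDA 8.0 network-installer name; use the
# link you copied from the NVIDIA downloads page in its place.
DEB="cuda-repo-ubuntu1604_8.0.61-1_amd64.deb"
BASE="https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64"
echo "wget ${BASE}/${DEB}"
echo "sudo dpkg -i ${DEB}"
echo "sudo apt-get update"
echo "sudo apt-get install cuda"
```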


#2) Install cuDNN

  • To install cuDNN, you need to register on the NVIDIA website: https://developer.nvidia.com/cudnn
  • Similar to Cuda, we download the cuDNN installer that matches our Cuda version (e.g. cuDNN 5.1 for Cuda 8.0). Download the file on your local machine and SCP it to your Google Cloud instance
  • Once the file has been downloaded, run this command to uncompress and move the cuDNN files to the cuda directory: $ sudo tar -xvf cudnn-8.0-linux-x64-v5.1.tgz -C /usr/local

This might change based on your download!
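What that tar command does can be sketched with a mock archive that mirrors the real tarball's top-level cuda/ layout. The real file comes from NVIDIA, and a temp directory stands in for /usr/local here:

```shell
#!/bin/sh
# Build a mock cuDNN tarball with the same top-level cuda/ layout,
# then extract it with -C the way the slide does against /usr/local.
WORK="$(mktemp -d)"
mkdir -p "${WORK}/cuda/include" "${WORK}/cuda/lib64"
touch "${WORK}/cuda/include/cudnn.h" "${WORK}/cuda/lib64/libcudnn.so.5"
tar -czf "${WORK}/cudnn-mock.tgz" -C "${WORK}" cuda

# -C <dir> makes tar extract relative to <dir>, so the archive's cuda/
# directory merges into the existing CUDA install tree.
DEST="${WORK}/usr-local-mock"
mkdir -p "${DEST}"
tar -xvf "${WORK}/cudnn-mock.tgz" -C "${DEST}"
ls "${DEST}/cuda/include"
```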


#3) Install PIP

  • $ sudo apt-get install python-pip python-dev
  • $ pip install --upgrade pip

Our cuda installation should be in /usr/local/cuda at this point. We should now set the LD_LIBRARY_PATH and CUDA_HOME environment variables

  • $ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
  • $ export CUDA_HOME=/usr/local/cuda
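Note that export only lasts for the current shell session. A quick way to verify the variables are set (the /usr/local/cuda path is the default install location assumed throughout these slides):

```shell
#!/bin/sh
# Set the variables as on the slide, then print them back to verify.
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${CUDA_HOME}/lib64:${CUDA_HOME}/extras/CUPTI/lib64"
echo "${CUDA_HOME}"
echo "${LD_LIBRARY_PATH}"
# To persist across logins, append both export lines to ~/.bashrc.
```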


Method 1 (without Bazel)


#4a) Install TensorFlow-GPU using PIP

If you want to install TensorFlow without installing Bazel, then you can follow the tutorial from this point after installing Cuda, cuDNN and upgrading PIP

  • $ sudo apt-get install libcupti-dev → the NVIDIA CUDA Profiling Tools Interface
  • $ pip install tensorflow-gpu → If this command succeeds, move to step #5 (Test Installation)

IF NOT

  • $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp27-none-linux_x86_64.whl


Method 2 (using Bazel)


#4b) Install Bazel

  • $ sudo apt-get install software-properties-common swig
  • $ sudo add-apt-repository ppa:webupd8team/java
  • $ sudo apt-get update
  • $ sudo apt-get install oracle-java8-installer


#4b2) Install TensorFlow

  • $ git clone https://github.com/tensorflow/tensorflow → directly from source
  • $ cd tensorflow
  • $ git reset --hard __________

In the blank space above, we should include the commit hash of the latest commit from the TensorFlow main branch.

Image on the right for reference →


Once we’ve installed TensorFlow, we should now configure it so that it can be set up to run with the Cuda GPUs / other dependencies as required…

  • $ ./configure

During the configure process, you should follow the prompts that appear on these slides as specified. For other prompts that are not included on these slides, you can choose the option according to your preference…

  • Please specify the location of python. [Default is /usr/bin/python]: Press Enter
  • Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n


  • Do you wish to build TensorFlow with Cuda? [y/N] y
  • Please specify which gcc nvcc should use as the host compiler. [Default is /usr/bin/gcc]: Press Enter
  • Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
  • Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: Press Enter
  • Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5
  • Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: Press Enter


  • Please specify a list of comma-separated Cuda compute capabilities you want to build with. You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Please note that each additional compute capability significantly increases your build time and binary size. [Default is: "3.5, 5.2"]: Press Enter OR input your GPU's values

After this step, the configuration should be finished successfully and you will get a message saying ‘Configuration finished’ after TensorFlow sets up various libraries and files based on your specifications.


Now we call bazel to build the TensorFlow pip package. This step takes a long time [generally more than 1 hour], so we can do other stuff meanwhile :)

  • $ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
  • $ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

The first command, ‘bazel build … :build_pip_package’, takes the majority of the time. The second command, ‘bazel-bin/… /tmp/tensorflow_pkg’, then assembles the pip package in /tmp/tensorflow_pkg. If you pasted both parts on one line, the second part will still be sitting in the command line when the build finishes, so don’t forget to press Enter to execute it.

  • $ sudo pip install --upgrade /tmp/tensorflow_pkg/tensorflow-1.0.0-*.whl

(The wheel’s exact filename will differ slightly based on your version of TF and Python)


#4b3) Upgrade Protobuf

  • $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/protobuf-3.0.0b2.post2-cp27-none-linux_x86_64.whl


#5) Test Installation!

  • $ python
  • >>> import tensorflow as tf
  • >>> hello = tf.constant('Hello, TensorFlow!')
  • >>> sess = tf.Session()
  • >>> print(sess.run(hello))
  • Hello, TensorFlow!
  • >>> a = tf.constant(20)
  • >>> b = tf.constant(30)
  • >>> print(sess.run(a + b))
  • 50


One way to check whether TensorFlow is using the GPU is the following: after starting a TensorFlow session with sess = tf.Session(), we are presented with log information about ‘device 0’, which in this case is our NVIDIA Tesla K80 GPU.


Another way to check whether your GPU is working is by running nvidia-smi. If we run it while no training is going on, we can see on the left that the memory used is very low, the power consumed by the GPU is low as well, and GPU utilization sits at 0%.

On the right we can see that if we run nvidia-smi while our model is training, the power consumed is significantly higher, the memory taken up is 10914MiB, and GPU-Util is at 92%. Furthermore, process ID (PID) 2740, i.e. python, is added to the list of GPU processes, indicating that it is working.
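The two numbers the slide reads off nvidia-smi (memory used and GPU-Util) can also be pulled out programmatically. The sample line below is a hypothetical, trimmed rendering of one row of nvidia-smi's table, not captured output:

```python
import re

# Hypothetical one-row excerpt in nvidia-smi's table format; the real
# text would come from running `nvidia-smi` on the instance.
SAMPLE = "| N/A   62C    P0   142W / 149W |  10914MiB / 11439MiB |     92%      Default |"

def parse_nvidia_smi_row(row):
    """Extract (memory_used_mib, gpu_util_pct) from one nvidia-smi row."""
    mem = re.search(r"(\d+)MiB\s*/\s*\d+MiB", row)   # "used / total" memory pair
    util = re.search(r"(\d+)%", row)                 # first percentage is GPU-Util
    return int(mem.group(1)), int(util.group(1))

print(parse_nvidia_smi_row(SAMPLE))  # (10914, 92)
```

A busy GPU shows high memory use and nonzero utilization, which is the signal the slide describes.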


END