Monday, January 7, 2019

How to install CUDA 9 and CuDNN 7 on Ubuntu 18.04

Installing CUDA has gotten a lot easier over the years thanks to the CUDA Installation Guide, but there are still a few potential pitfalls to be avoided. Below is a working recipe for installing the CUDA 9 Toolkit and CuDNN 7 (the versions currently supported by TensorFlow) on Ubuntu 18.04.


Step 1: Verify your system requirements

The NVIDIA Developer Zone has a detailed guide on pre-installation actions. Most importantly, your system needs a CUDA-capable GPU. You can check that the GPU is being detected with the following command:

$ lspci | grep -i nvidia

This command should return one line per NVIDIA device. In my case: 02:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X].
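
Note that lspci can also list a card's HDMI audio function as an NVIDIA device. As a rough sketch, the helper below (count_nvidia_gpus is a name made up for this example) counts only display-class entries, assuming they are labelled "VGA compatible controller" or "3D controller" as in the output above:

```shell
# Hypothetical helper: count NVIDIA display devices in lspci-style text on stdin.
# Assumes GPUs show up as "VGA compatible controller" or "3D controller".
count_nvidia_gpus() {
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    grep -i nvidia | grep -E -c 'VGA compatible controller|3D controller' || true
}

# Usage on a real machine:
#   lspci | count_nvidia_gpus
```

A result of 0 means no NVIDIA GPU was detected and the rest of this guide will not apply.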

Apart from that, you will want to make sure you have GCC installed. If not, we will install it in Step 2. Verify the GCC version as follows:

$ gcc --version

Step 2: Install the required pre-installation packages

According to the NVIDIA documentation, this step is no longer strictly required, but it is good to have the following packages anyway:

$ sudo apt install g++ freeglut3-dev build-essential libx11-dev libxmu-dev \
     libxi-dev libglu1-mesa libglu1-mesa-dev

Also, CUDA 9 requires GCC 6, whereas Ubuntu 18.04 ships GCC 7 by default, so install GCC 6 alongside it:

$ sudo apt install gcc-6 g++-6

Step 3: Install the NVIDIA driver

Before installing the NVIDIA driver, you should make sure to disable the Nouveau drivers that come pre-installed with Ubuntu. The Nouveau drivers are loaded if the following command prints anything:

$ lsmod | grep nouveau

The following is straight from the CUDA docs. Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:

blacklist nouveau
options nouveau modeset=0
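
Because the file lives under /etc, it has to be written as root. One way to do that, sketched here with a made-up helper name (nouveau_blacklist_conf), is to print the two lines and pipe them through sudo tee:

```shell
# Hypothetical helper: print the contents of the Nouveau blacklist file.
nouveau_blacklist_conf() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}

# Usage (writes the file as root):
#   nouveau_blacklist_conf | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
```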

Then regenerate the kernel initramfs:

$ sudo update-initramfs -u

Then you're finally ready to install the NVIDIA driver. First, add the graphics-drivers PPA:

$ sudo add-apt-repository ppa:graphics-drivers/ppa

Then install the driver. Make sure to get the development version, too:

$ sudo apt install nvidia-384 nvidia-384-dev

I recommend rebooting your system at this point. Some systems act up after the driver install, so make sure everything is displayed correctly once you log in again. If you get a bunch of display error messages, or can't see anything, drop to a virtual console by hitting Ctrl + Alt + F1. From there, check that the device files /dev/nvidia* exist and have the correct (0666) file permissions. More info here.
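
The permission check can be scripted. The following is only a sketch: check_device_modes is a hypothetical helper, and it relies on GNU stat, which Ubuntu ships. It prints a line for every path that is missing or whose mode is not 666, and returns non-zero if it found any:

```shell
# Hypothetical helper: report any given file whose permission bits are not 666.
# Relies on GNU stat (-c), which is what Ubuntu ships.
check_device_modes() {
    status=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            status=1
            continue
        fi
        mode=$(stat -c '%a' "$f")
        if [ "$mode" != "666" ]; then
            echo "unexpected mode $mode: $f"
            status=1
        fi
    done
    return "$status"
}

# Usage from the virtual console:
#   check_device_modes /dev/nvidia*
```

If no device files exist at all, the unexpanded pattern itself is reported as missing, which usually means the driver's kernel module did not load.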

Installing the driver is usually the hardest part. If the above doesn't work, ask StackOverflow how to install an NVIDIA driver using a local run file. If you've made it past this point, the rest should be a breeze!


Step 4: Install the CUDA Toolkit

Download one of the "runfile (local)" installation packages from the CUDA Toolkit Archive:

$ wget -O cuda_9.0.176_384.81_linux.run \
     https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run

Make the downloaded file executable and start the installation:

$ chmod +x cuda_9.0.176_384.81_linux.run 
$ sudo ./cuda_9.0.176_384.81_linux.run --override

During the installation, you need to answer the following questions:

  • You are attempting to install on an unsupported configuration. Do you wish to continue? => yes
  • Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81? => no
  • Install the CUDA 9.0 Toolkit? => yes
  • Create symbolic links? => yes

You can also choose the installation directory and whether to install the CUDA samples. Specify custom paths or hit ENTER to use default values.
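
The runfile installer does not change your shell environment, so the toolkit's binaries and libraries still have to be put on your search paths. A minimal sketch of the usual post-installation setup, assuming the default install prefix /usr/local/cuda-9.0 (CUDA_HOME is simply the variable name chosen here; adjust its value if you picked another directory):

```shell
# Assumes the default install prefix; change this if you picked a custom path.
CUDA_HOME=/usr/local/cuda-9.0

# Prepend the toolkit's binaries and libraries, keeping any existing entries.
export PATH="${CUDA_HOME}/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

Appending these lines to ~/.bashrc makes them apply to every new terminal.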

Step 5: Install CuDNN

In order to download CuDNN you have to be registered with the NVIDIA Developer Program. You can sign up here.

Then download CuDNN 7.2 from the Developer website. Choose the Linux file.

$ CUDNN_FILE="cudnn-9.0-linux-x64-v7.2.1.38"
$ wget "https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.2.1/prod/9.0_20180806/${CUDNN_FILE}"
$ tar -xzvf ${CUDNN_FILE}

Copy the following files into the CUDA Toolkit directory (this path might differ if you did not install CUDA in its default location):

$ sudo cp -P cuda/include/cudnn.h /usr/local/cuda-9.0/include
$ sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64/
$ sudo chmod a+r /usr/local/cuda-9.0/lib64/libcudnn*
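
To confirm which CuDNN version ended up in the toolkit directory, you can read the version macros from the copied header. This is a sketch under two assumptions: that the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (as the CuDNN 7 header does), and that cudnn_version is a helper name invented for this example:

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCH from a CuDNN 7 header.
# Assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# Usage:
#   cudnn_version /usr/local/cuda-9.0/include/cudnn.h
```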

Step 6: Verify your installation

If everything worked right, you should be able to run the following two commands without error:

$ nvidia-smi
$ nvcc -V

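You might have to open a new terminal or reboot your machine. The first command should list all GPUs attached to your system, and the second should report release 9.0. If nvcc is not found, make sure the toolkit's bin directory (/usr/local/cuda-9.0/bin by default) is on your PATH.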
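
If you would rather check the version in a script than by eye, the release number can be pulled out of the nvcc output. A small sketch, assuming the output contains a line of the form "Cuda compilation tools, release 9.0, V9.0.176" (nvcc_release is a hypothetical helper name):

```shell
# Hypothetical helper: extract the release number from `nvcc -V` output on stdin.
# Assumes a line of the form "Cuda compilation tools, release 9.0, V9.0.176".
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Usage:
#   nvcc -V | nvcc_release
```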

And with that, you're done. Happy coding!