ace-gpu-1_installation_log

ace-gpu-1_installation_log [2017/02/15 17:41]
csteel created
ace-gpu-1_installation_log [2018/04/04 13:00] (current)
xteng [/etc/skel]
  
 Log of platform setup and configuration
 +
 +===== Base =====
 +
 +
 +===== NVIDIA Driver =====
 +
 +This is probably not required as a driver is included with CUDA.
 +
 +<​code>​
 +chmod 770 NVIDIA-Linux-x86_64-375.26.run ​
 +/​etc/​init.d/​lightdm stop
 +./​NVIDIA-Linux-x86_64-375.26.run ​
 +reboot
 +</​code>​
 +
 +===== CUDA =====
 +
 +Install CUDA from the Debian package.
 +
 +==== Confirm GPU ====
 +
 +<​code>​
 +lspci | grep -i nvidia
 +</​code>​
 +
 +Output example
 +
 +<​code>​
 +01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +01:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +03:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +
 +</​code>​
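To count the GPUs instead of reading the list by eye, the `lspci` output can be filtered. This is a minimal sketch: the function name `count_nvidia_gpus` is hypothetical, and it assumes the GPUs are reported as "VGA compatible controller" or "3D controller" devices, as in the output example above.

```shell
# count_nvidia_gpus: read lspci output on stdin, print the number of NVIDIA GPUs.
# Hypothetical helper; assumes GPUs show up as VGA or 3D controllers.
# The audio functions of the same cards are deliberately not counted.
count_nvidia_gpus() {
    grep -i 'nvidia' | grep -c -E 'VGA compatible controller|3D controller'
}

# Usage on the target machine:
#   lspci | count_nvidia_gpus
```

For the output example above this should report three GPUs, one per VGA controller line.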
 +
 +==== gcc version ====
 +
 +<​code>​
 +gcc --version
 +</​code>​
 +
 +<​code>​
 +gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
 +Copyright (C) 2015 Free Software Foundation, Inc.
 +This is free software; see the source for copying conditions. ​ There is NO
 +warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 +</​code>​
 +
 +==== Ensure kernel headers are installed ====
 +
 +<​code>​
 +sudo apt-get install linux-headers-$(uname -r)
 +</​code>​
 +
 +==== Download CUDA Toolkit ====
 +
 +<​code>​
 +wget https://​developer.nvidia.com/​compute/​cuda/​8.0/​Prod2/​local_installers/​cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
 +</​code>​
 +
 +==== Confirm checksum ====
 +
 +<​code>​
 +wget https://​developer.nvidia.com/​compute/​cuda/​8.0/​Prod2/​docs/​sidebar/​md5sum-txt
 +</​code>​
 +
 +<​code>​
 +md5sum cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
 +cat md5sum-txt | grep cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64
 +</​code>​
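The two commands above print both checksums and leave the comparison to the reader. The sketch below does the comparison itself. The function name `verify_md5` is hypothetical, and it assumes the checksum list uses the usual `md5sum` layout (hash first, file name after it). The third argument is a name fragment rather than the full file name, because the downloaded file ends in `-deb` and may not match the list entry exactly.

```shell
# verify_md5 FILE SUMFILE PATTERN
# Hypothetical helper: compare the MD5 of FILE with the first entry in SUMFILE
# whose line contains PATTERN (matched as a fixed string).
verify_md5() {
    file=$1
    sumfile=$2
    pattern=$3
    actual=$(md5sum "$file" | awk '{print $1}')
    expected=$(grep -F -- "$pattern" "$sumfile" | awk '{print $1; exit}')
    if [ -n "$expected" ] && [ "$actual" = "$expected" ]; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH or entry not found" >&2
        return 1
    fi
}

# Usage:
#   verify_md5 cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb md5sum-txt \
#       cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64
```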
 +
 +==== Install ====
 +
 +<​code>​
 +mv cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
 +sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
 +sudo apt-get update
 +sudo apt-get install cuda
 +</​code>​
 +
 +Reboot the system.
 +
 +==== Environment Setup ====
 +
 +=== Add CUDA bin path ===
 +
 +<code>
 +export PATH=/usr/local/cuda-8.0/bin:${PATH}
 +echo $PATH
 +</code>
 +
 +=== Ensure LD_LIBRARY_PATH is set ===
 +
 +Ensure LD_LIBRARY_PATH includes `/​usr/​local/​cuda-8.0/​lib64`
 +
 +<​code>​
 +echo $LD_LIBRARY_PATH
 +</​code>​
 +
 +If it is not set, set it with:
 +
 +<​code>​
 +export LD_LIBRARY_PATH=/​usr/​local/​cuda-8.0/​lib64
 +</​code>​
 +
 +Otherwise, prepend the CUDA directory to the existing value:
 +
 +<​code>​
 +export LD_LIBRARY_PATH=/​usr/​local/​cuda-8.0/​lib64:​$LD_LIBRARY_PATH
 +</​code>​
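The "set or prepend" decision above can be folded into one helper that also avoids duplicate entries when the commands are run more than once. This is a sketch only, and `path_prepend` is a hypothetical name. It prints the new value instead of modifying the variable, so it can be used for either PATH or LD_LIBRARY_PATH.

```shell
# path_prepend DIR CURRENT
# Hypothetical helper: print CURRENT with DIR prepended.
# If CURRENT is empty, print DIR alone (no trailing colon).
# If DIR is already listed, print CURRENT unchanged.
path_prepend() {
    dir=$1
    current=$2
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        "::")       printf '%s\n' "$dir" ;;
        *)          printf '%s\n' "$dir:$current" ;;
    esac
}

LD_LIBRARY_PATH=$(path_prepend /usr/local/cuda-8.0/lib64 "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```

Skipping the trailing colon matters because an empty entry in a search path may be interpreted as the current directory.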
 +
 +==== /etc/skel ====
 +
 +Configure `/etc/skel` so that new users have the proper environment configuration
 +
 +<​code>​
 +nano /​etc/​skel/​.profile
 +</​code>​
 +
 +Content example
 +
 +<​code>​
 +# set user PATH to include /​usr/​local/​cuda-8.0/​bin
 +if [ -d "/​usr/​local/​cuda-8.0/​bin"​ ]; then
 +    PATH="/​usr/​local/​cuda-8.0/​bin:​$PATH"​
 +fi
 +
 +# set user LD_LIBRARY_PATH to include /​usr/​local/​cuda-8.0/​lib64
 +if [ -d "/usr/local/cuda-8.0/lib64" ]; then
 +    LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH"
 +    # export so that child processes see the value
 +    export LD_LIBRARY_PATH
 +fi
 +</​code>​
 +
 +=== GPU Accounting Setup and Configuration ===
 +
 +The NVIDIA persistence daemon (nvidia-persistenced) needs to be configured for the target OS's init system. On Ubuntu 16.04 this is systemd.
 +
 +== Confirm Driver Version ==
 +
 +<​code>​
 +nvidia-smi
 +</​code>​
 +
 +Output example
 +
 +<​code>​
 +Wed Apr 26 14:20:40 2017       
 ++-----------------------------------------------------------------------------+
 +| NVIDIA-SMI 375.39 ​                ​Driver Version: 375.39 ​                   |
 +|-------------------------------+----------------------+----------------------+
 +| GPU  Name        Persistence-M| Bus-Id ​       Disp.A | Volatile Uncorr. ECC |
 +| Fan  Temp  Perf  Pwr:​Usage/​Cap| ​        ​Memory-Usage | GPU-Util ​ Compute M. |
 +|===============================+======================+======================|
 +|   ​0 ​ TITAN X (Pascal) ​   Off  | 0000:​01:​00.0 ​     On |                  N/A |
 +| 41%   ​68C ​   P2    95W / 250W |   ​1902MiB / 12186MiB |     ​96% ​     Default |
 ++-------------------------------+----------------------+----------------------+
 +                                                                               
 ++-----------------------------------------------------------------------------+
 +| Processes: ​                                                      GPU Memory |
 +|  GPU       ​PID ​ Type  Process name                               ​Usage ​     |
 +|=============================================================================|
 +|    0      1977    G   /​usr/​lib/​xorg/​Xorg ​                             60MiB |
 +|    0     ​10406 ​   C   /​data1/​data/​kwagstyl/​anaconda2/​bin/​python ​     189MiB |
 +|    0     ​10660 ​   C   /​data1/​data/​kwagstyl/​anaconda2/​bin/​python ​    ​1495MiB |
 +|    0     ​16118 ​   C   ​...freesurfer_LBL/​bin/​mris_fix_topology_cuda ​  ​153MiB |
 ++-----------------------------------------------------------------------------+
 +</​code>​
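The driver version shown in the header of the output above (375.39) is what the nvidia-persistenced download has to match. A small sketch for extracting it from the `nvidia-smi` table follows. The function name `driver_version_from_smi` is hypothetical, and the parsing assumes the header keeps the "Driver Version: X.Y" layout shown in the example.

```shell
# driver_version_from_smi: read nvidia-smi output on stdin and print the
# driver version from the first "Driver Version:" field found.
# Hypothetical helper; relies on the header layout shown in the example.
driver_version_from_smi() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on the target machine:
#   version=$(nvidia-smi | driver_version_from_smi)
#   echo "nvidia-persistenced-$version.tar.bz2"
```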
 +
 +== Download and uncompress ==
 +
 +Download the nvidia-persistenced version that matches your driver version (375.39 in the output example above).
 +
 +<​code>​
 +mkdir -p ~/​src/​ubuntu/​16.04/​nvidia
 +cd ~/​src/​ubuntu/​16.04/​nvidia
 +wget ftp://​download.nvidia.com/​XFree86/​nvidia-persistenced/​nvidia-persistenced-375.39.tar.bz2
 +tar xvjf nvidia-persistenced-375.39.tar.bz2
 +</​code>​
 +
 +== Edit the nvidia-persistenced.conf.template ==
 +
 += Confirm creation of the nvidia-persistenced user = 
 +
 +<​code>​
 +grep nvidia /etc/passwd
 +</​code>​
 +
 +Output example:
 +
 +<​code>​
 +nvidia-persistenced:​x:​126:​132:​NVIDIA Persistence Daemon,,,:/:/​sbin/​nologin
 +</​code>​
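The `grep` above matches any line containing "nvidia". For a scripted check, an exact match on the login name is stricter. A minimal sketch with a hypothetical function name `user_in_passwd`, which reads passwd-format lines from standard input:

```shell
# user_in_passwd NAME
# Hypothetical helper: read passwd-format lines on stdin and succeed only if
# the first field (the login name) of some line is exactly NAME.
user_in_passwd() {
    awk -F: -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage:
#   user_in_passwd nvidia-persistenced < /etc/passwd && echo "user exists"
```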
 +
 +== Edit the systemd template ==
 +
 +<​code>​
 +cd nvidia-persistenced-375.39/​init/​systemd
 +nano nvidia-persistenced.service.template
 +</​code>​
 +
 +Replace __USER__ with the name of the nvidia-persistenced user, `nvidia-persistenced`.
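The same substitution can be done without an editor. The sketch below is one way to do it. The function name `fill_service_user` is hypothetical, and it assumes the user name contains no "/" characters, since it is inserted directly into a `sed` expression. It prints the result so it can be reviewed before anything is overwritten.

```shell
# fill_service_user USER
# Hypothetical helper: read a template on stdin and print it with every
# __USER__ placeholder replaced by USER.
fill_service_user() {
    sed "s/__USER__/$1/g"
}

# Usage (prints the filled-in template for review):
#   fill_service_user nvidia-persistenced < nvidia-persistenced.service.template
```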
 +
 +== Run the installer ==
 +
 +<​code>​
 +cd ~/src/ubuntu/16.04/nvidia/nvidia-persistenced-375.39/init
 +sudo ./​install.sh
 +</​code>​
 +
 +Output example
 +
 +<​code>​
 +Checking for common requirements...
 +  sed found in PATH?  Yes
 +  useradd found in PATH?  Yes
 +  userdel found in PATH?  Yes
 +  id found in PATH?  Yes
 +Common installation/​uninstallation supported
 +
 +Creating sample System V script... done.
 +Creating sample systemd service file... done.
 +Creating sample Upstart service file... done.
 +
 +Checking for systemd requirements...
 +  /​usr/​lib/​systemd/​system directory exists? ​ No
 +  /​etc/​systemd/​system directory exists? ​ Yes
 +  systemctl found in PATH?  Yes
 +systemd installation/​uninstallation supported
 +
 +Installation parameters:
 +  User  : nvidia-persistenced
 +  Group : nvidia-persistenced
 +  systemd service installation path : /​etc/​systemd/​system
 +
 +User '​nvidia-persistenced'​ already exists, skipping useradd...
 +User '​nvidia-persistenced'​ is in primary group '​nvidia-persistenced'​.
 +Stopping nvidia-persistenced.service... done.
 +Installing sample systemd service nvidia-persistenced.service... done.
 +Enabling nvidia-persistenced.service... done.
 +Starting nvidia-persistenced.service... done.
 +</​code>​
 +
 +== Check ==
 +
 +<​code>​
 +sudo service nvidia-persistenced status
 +</​code>​
 +
 +== Troubleshooting ==
 +
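 +If the service does not start or persistence mode stays off, add "--persistence-mode --verbose" to the line in the service file where the daemon is started, then restart the service and check its log output.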
 +
ace-gpu-1_installation_log.1487180478.txt.gz · Last modified: 2017/02/15 17:41 by csteel