ace-gpu-1_installation_log

====== Differences ====== This shows you the differences between two versions of the page.

ace-gpu-1_installation_log [2017/02/16 15:07]
csteel
ace-gpu-1_installation_log [2018/04/04 12:58]
xteng [Confirm GPU]
Line 4: Line 4:
  
 ===== Base =====
 +
 
 ===== NVIDIA Driver =====
Line 29: Line 30:
  
 <code>
-01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b00 (rev a1)
+01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 01:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +03:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +
 </code>
  
Line 129: Line 135:
 <code>
 </code>
 +
 +=== GPU Accounting Setup and Configuration ===
 +
 +The NVIDIA persistence daemon (nvidia-persistenced) must be configured for the init system of the target OS. On Ubuntu 16.04 that is systemd.
 +
 +== Confirm Driver Version ==
 +
 <code>
 +nvidia-smi
 +</​code>​
 +
 +Output example
 +
 +<​code>​
 +Wed Apr 26 14:20:40 2017       
 ++-----------------------------------------------------------------------------+
 +| NVIDIA-SMI 375.39 ​                ​Driver Version: 375.39 ​                   |
 +|-------------------------------+----------------------+----------------------+
 +| GPU  Name        Persistence-M| Bus-Id ​       Disp.A | Volatile Uncorr. ECC |
 +| Fan  Temp  Perf  Pwr:​Usage/​Cap| ​        ​Memory-Usage | GPU-Util ​ Compute M. |
 +|===============================+======================+======================|
 +|   ​0 ​ TITAN X (Pascal) ​   Off  | 0000:​01:​00.0 ​     On |                  N/A |
 +| 41%   ​68C ​   P2    95W / 250W |   ​1902MiB / 12186MiB |     ​96% ​     Default |
 ++-------------------------------+----------------------+----------------------+
 +                                                                               
 ++-----------------------------------------------------------------------------+
 +| Processes: ​                                                      GPU Memory |
 +|  GPU       ​PID ​ Type  Process name                               ​Usage ​     |
 +|=============================================================================|
 +|    0      1977    G   /​usr/​lib/​xorg/​Xorg ​                             60MiB |
 +|    0     ​10406 ​   C   /​data1/​data/​kwagstyl/​anaconda2/​bin/​python ​     189MiB |
 +|    0     ​10660 ​   C   /​data1/​data/​kwagstyl/​anaconda2/​bin/​python ​    ​1495MiB |
 +|    0     ​16118 ​   C   ​...freesurfer_LBL/​bin/​mris_fix_topology_cuda ​  ​153MiB |
 ++-----------------------------------------------------------------------------+
 +</​code>​
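The driver version can also be read out of the banner by a script, which is handy because the same version string is reused in the download step. A minimal sketch, assuming the banner layout shown in the output example above; the function name driver_version_from_banner is made up for illustration:

```shell
# Hypothetical helper: read nvidia-smi banner text on stdin and print
# the first "Driver Version" value found (e.g. 375.39).
driver_version_from_banner() {
  sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}
```

On a machine with the driver installed it would be used as: nvidia-smi | driver_version_from_banner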
 +
 +== Download and uncompress ==
 +
 +Download the nvidia-persistenced release that matches your driver version (375.39 in the output example above).
 +
 +<​code>​
 +mkdir -p ~/src/ubuntu/16.04/nvidia
 +cd ~/src/ubuntu/16.04/nvidia
 +wget ftp://download.nvidia.com/XFree86/nvidia-persistenced/nvidia-persistenced-375.39.tar.bz2
 +tar xvjf nvidia-persistenced-375.39.tar.bz2
 +</​code>​
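Because the tarball name embeds the driver version, the URL can be derived from it instead of being typed by hand. A minimal sketch that follows the URL pattern of the wget command above; the function name persistenced_tarball_url is hypothetical, and whether a tarball exists for any particular version is not checked:

```shell
# Hypothetical helper: print the tarball URL for a given driver version,
# following the URL pattern used in the wget command above.
persistenced_tarball_url() {
  version="$1"
  printf 'ftp://download.nvidia.com/XFree86/nvidia-persistenced/nvidia-persistenced-%s.tar.bz2\n' "$version"
}
```

For example: wget "$(persistenced_tarball_url 375.39)"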
 +
 +== Edit the nvidia-persistenced.conf.template ==
 +
 += Confirm creation of the nvidia-persistenced user = 
 +
 +<​code>​
 +grep nvidia /etc/passwd
 +</​code>​
 +
 +Output example:
 +
 +<​code>​
 +nvidia-persistenced:​x:​126:​132:​NVIDIA Persistence Daemon,,,:/:/​sbin/​nologin
 +</​code>​
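The same check can be scripted so that later steps only run when the account is present. A minimal sketch, assuming the standard id utility is available; the function name user_exists is made up for illustration:

```shell
# Hypothetical helper: succeed if the named account exists.
# "id -u" exits non-zero for unknown users.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}
```

For example: user_exists nvidia-persistenced && echo "user present"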
 +
 +== Edit the systemd template ==
 +
 +<​code>​
 +cd nvidia-persistenced-375.39/init/systemd
 +nano nvidia-persistenced.service.template
 +</​code>​
 +
 +Replace __USER__ with the name of the persistence daemon's user, nvidia-persistenced.
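The placeholder can also be substituted without an editor. A minimal sketch, assuming the template contains the literal text __USER__ as described above and that the user name contains no slash; the function name fill_user_placeholder is hypothetical:

```shell
# Hypothetical helper: print FILE with every __USER__ placeholder
# replaced by USER. The file itself is left unmodified.
fill_user_placeholder() {
  file="$1"
  user="$2"
  sed "s/__USER__/${user}/g" "$file"
}
```

For example: fill_user_placeholder nvidia-persistenced.service.template nvidia-persistenced
Printing to standard output instead of editing in place lets the result be reviewed before it replaces the template.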
 +
 +== Run the installer ==
 +
 +<​code>​
 +cd ~/src/ubuntu/16.04/nvidia/nvidia-persistenced-375.39/init
 +sudo ./install.sh
 +</​code>​
 +
 += Output example =
 +
 +<​code>​
 +Checking for common requirements...
 +  sed found in PATH?  Yes
 +  useradd found in PATH?  Yes
 +  userdel found in PATH?  Yes
 +  id found in PATH?  Yes
 +Common installation/​uninstallation supported
 +
 +Creating sample System V script... done.
 +Creating sample systemd service file... done.
 +Creating sample Upstart service file... done.
 +
 +Checking for systemd requirements...
 +  /​usr/​lib/​systemd/​system directory exists? ​ No
 +  /​etc/​systemd/​system directory exists? ​ Yes
 +  systemctl found in PATH?  Yes
 +systemd installation/​uninstallation supported
 +
 +Installation parameters:
 +  User  : nvidia-persistenced
 +  Group : nvidia-persistenced
 +  systemd service installation path : /​etc/​systemd/​system
 +
 +User '​nvidia-persistenced'​ already exists, skipping useradd...
 +User '​nvidia-persistenced'​ is in primary group '​nvidia-persistenced'​.
 +Stopping nvidia-persistenced.service... done.
 +Installing sample systemd service nvidia-persistenced.service... done.
 +Enabling nvidia-persistenced.service... done.
 +Starting nvidia-persistenced.service... done.
 +</​code>​
 +
 +== Check ==
 +
 +<​code>​
 +sudo service nvidia-persistenced status
 +</​code>​
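Besides the service status, the Persistence-M column of nvidia-smi should switch from Off to On once the daemon is running. A minimal sketch for checking this in a script, assuming nvidia-smi prints one line per GPU containing Enabled or Disabled when queried for persistence_mode; the function name all_persistence_enabled is hypothetical:

```shell
# Hypothetical helper: read one persistence-mode value per line on stdin
# and succeed only if there is at least one line and all say "Enabled".
all_persistence_enabled() {
  awk 'BEGIN { n = 0; bad = 0 }
       { n++; if ($1 != "Enabled") bad = 1 }
       END { exit (n > 0 && bad == 0) ? 0 : 1 }'
}
```

For example: nvidia-smi --query-gpu=persistence_mode --format=csv,noheader | all_persistence_enabled && echo "persistence mode is on for all GPUs"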
 +
 +== Troubleshooting ==
 +
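 +If the daemon does not behave as expected, add "--persistence-mode --verbose" to the ExecStart line of the service file, i.e. the line where the daemon is started, to get more detailed log output.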
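The change to the start line can be previewed before the installed unit is touched. A minimal sketch, assuming the service file starts the daemon on a line beginning with ExecStart= (the usual systemd convention) and that the flags contain no | or & characters; the function name append_execstart_flags is hypothetical:

```shell
# Hypothetical helper: print UNIT_FILE with the given flags appended to
# every line that starts with "ExecStart=". The file is not modified.
append_execstart_flags() {
  unit_file="$1"
  flags="$2"
  sed "/^ExecStart=/ s|\$| ${flags}|" "$unit_file"
}
```

For example: append_execstart_flags /etc/systemd/system/nvidia-persistenced.service "--persistence-mode --verbose"
After the installed service file has actually been changed, run sudo systemctl daemon-reload and then sudo systemctl restart nvidia-persistenced so that systemd picks up the new start line.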
  
ace-gpu-1_installation_log.txt · Last modified: 2018/04/04 13:00 by xteng