
====== ace-gpu-1 installation log ======

Log of platform setup and configuration.

===== Base =====

===== NVIDIA Driver =====

This step is probably not required, because a driver is included with CUDA.

<code>
chmod 770 NVIDIA-Linux-x86_64-375.26.run
/etc/init.d/lightdm stop
./NVIDIA-Linux-x86_64-375.26.run
reboot
</code>

===== CUDA =====

Installation of CUDA from the Debian package.

==== Confirm GPU ====

<code>
lspci | grep -i nvidia
</code>

Output example:

<code>
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
</code>

==== gcc version ====

<code>
gcc --version
</code>

Output example:

<code>
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
</code>

==== Ensure kernel headers are installed ====

<code>
sudo apt-get install linux-headers-$(uname -r)
</code>

==== Download CUDA Toolkit ====

<code>
wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
</code>

==== Confirm checksum ====

Download the checksum list:

<code>
wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/docs/sidebar/md5sum-txt
</code>

Compute the checksum of the downloaded package and compare it with the matching entry in the list:

<code>
md5sum cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
cat md5sum-txt | grep cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64
</code>

==== Install ====

<code>
# the download is saved with a "-deb" suffix; rename it to ".deb" before installing
mv cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
</code>

Reboot the system.
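The checksum confirmation step above compares two values by eye. It could be scripted along the following lines. This is only a sketch: ''verify_md5'' is a hypothetical helper name, and it assumes each line of ''md5sum-txt'' begins with the hash. It matches on the hash rather than the file name, because the download is saved with a ''-deb'' suffix rather than ''.deb''.

```shell
# verify_md5 FILE LIST
# Hypothetical helper: succeeds (exit 0) only if the MD5 hash of FILE
# appears at the start of some line in LIST.
verify_md5() {
    file=$1
    list=$2

    # Refuse to continue if either input is missing or unreadable.
    [ -r "$file" ] || return 2
    [ -r "$list" ] || return 2

    # First field of md5sum output is the hash.
    actual=$(md5sum "$file" | awk '{print $1}')
    [ -n "$actual" ] || return 2

    # Case-insensitive match of the hash at the beginning of a line.
    grep -qi "^$actual" "$list"
}

# Example call (file names taken from the steps above):
# verify_md5 cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb md5sum-txt \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```

A non-zero exit status means either a mismatch (1) or an unreadable input (2), so the function can gate the install step in a script.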
==== Environment Setup ====

=== Add CUDA bin path ===

<code>
export PATH=/usr/local/cuda-8.0/bin:${PATH}
echo $PATH
</code>

=== Ensure LD_LIBRARY_PATH is set ===

Ensure LD_LIBRARY_PATH includes `/usr/local/cuda-8.0/lib64`:

<code>
echo $LD_LIBRARY_PATH
</code>

If it is not set, set it using:

<code>
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64
</code>

Otherwise, prepend the CUDA directory to the existing value:

<code>
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
</code>

==== /etc/skel ====

Configure `/etc/skel` so that new users get the proper environment configuration.

<code>
nano /etc/skel/.profile
</code>

Content example:

<code>
# set user PATH to include /usr/local/cuda-8.0/bin
if [ -d "/usr/local/cuda-8.0/bin" ]; then
    PATH="/usr/local/cuda-8.0/bin:$PATH"
fi

# set user LD_LIBRARY_PATH to include /usr/local/cuda-8.0/lib64
# (the ${VAR:+...} form avoids a trailing ":" when the variable was empty)
if [ -d "/usr/local/cuda-8.0/lib64" ]; then
    LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
fi
</code>

=== GPU Accounting Setup and Configuration ===

The NVIDIA persistence daemon (nvidia-persistenced) needs to be configured for the target OS's startup system. On Ubuntu 16.04 this is systemd.

== Confirm Driver Version ==

<code>
nvidia-smi
</code>

Output example:

<code>
Wed Apr 26 14:20:40 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 0000:01:00.0      On |                  N/A |
| 41%   68C    P2    95W / 250W |   1902MiB / 12186MiB |     96%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1977    G   /usr/lib/xorg/Xorg                              60MiB |
|    0     10406    C   /data1/data/kwagstyl/anaconda2/bin/python      189MiB |
|    0     10660    C   /data1/data/kwagstyl/anaconda2/bin/python     1495MiB |
|    0     16118    C   ...freesurfer_LBL/bin/mris_fix_topology_cuda   153MiB |
+-----------------------------------------------------------------------------+
</code>

== Download and uncompress ==

Download the nvidia-persistenced version that matches your driver version (375.39 in the output example above).

<code>
mkdir -p ~/src/ubuntu/16.04/nvidia
cd ~/src/ubuntu/16.04/nvidia
wget ftp://download.nvidia.com/XFree86/nvidia-persistenced/nvidia-persistenced-375.39.tar.bz2
tar xvjf nvidia-persistenced-375.39.tar.bz2
</code>

== Edit the nvidia-persistenced.conf.template ==

= Confirm creation of the nvidia-persistenced user =

<code>
sudo cat /etc/passwd | grep nvidia
</code>

Output example:

<code>
nvidia-persistenced:x:126:132:NVIDIA Persistence Daemon,,,:/:/sbin/nologin
</code>

== Edit the systemd template ==

<code>
cd nvidia-persistenced-375.39/init/systemd
nano nvidia-persistenced.service.template
</code>

Replace `__USER__` with the name of the persistence daemon user, `nvidia-persistenced`.

== Run the installer ==

<code>
cd ~/src/ubuntu/16.04/nvidia/nvidia-persistenced-375.39/init
sudo ./install.sh
</code>

= Output example =

<code>
Checking for common requirements...
  sed found in PATH? Yes
  useradd found in PATH? Yes
  userdel found in PATH? Yes
  id found in PATH? Yes
Common installation/uninstallation supported

Creating sample System V script... done.
Creating sample systemd service file... done.
Creating sample Upstart service file... done.

Checking for systemd requirements...
  /usr/lib/systemd/system directory exists? No
  /etc/systemd/system directory exists? Yes
  systemctl found in PATH? Yes
systemd installation/uninstallation supported

Installation parameters:
  User : nvidia-persistenced
  Group : nvidia-persistenced
  systemd service installation path : /etc/systemd/system

User 'nvidia-persistenced' already exists, skipping useradd...
User 'nvidia-persistenced' is in primary group 'nvidia-persistenced'.

Stopping nvidia-persistenced.service... done.
Installing sample systemd service nvidia-persistenced.service... done.
Enabling nvidia-persistenced.service... done.
Starting nvidia-persistenced.service... done.
</code>

== Check ==

<code>
sudo service nvidia-persistenced status
</code>

== Troubleshooting ==

If the service does not start or persistence mode stays off, add `--persistence-mode --verbose` to the line in the service file where the daemon is started (the `ExecStart` line of the systemd unit), then restart the service.
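In addition to the service status check above, the persistence mode of each GPU can be read back from nvidia-smi. The following is a sketch under assumptions, not part of the original setup: ''check_persistence'' is a hypothetical helper name, and it assumes nvidia-smi's CSV query output has the form ''index, name, Enabled|Disabled'', one GPU per line.

```shell
# check_persistence
# Hypothetical helper: reads "index, name, persistence_mode" CSV lines on stdin
# and prints one line for every GPU whose persistence mode is not "Enabled".
# Exits 0 if all GPUs are enabled, 1 otherwise.
check_persistence() {
    awk -F', *' '
        BEGIN { bad = 0 }
        $3 != "Enabled" {
            printf "GPU %s (%s): persistence mode is %s\n", $1, $2, $3
            bad = 1
        }
        END { exit bad }
    '
}

# Example call on the GPU host:
# nvidia-smi --query-gpu=index,name,persistence_mode --format=csv,noheader | check_persistence
```

Reading from stdin keeps the helper independent of the host, so it can be tried on saved output before being used on the GPU machine.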

ace-gpu-1_installation_log.txt · Last modified: 2018/04/04 13:00 by xteng