Installing the NVIDIA GPU Driver and CUDA on CentOS 7
A consolidated set of notes on installing the NVIDIA GPU driver and CUDA on CentOS 7.
Hardware configuration
- Host: CentOS 7.7, 64-bit
- GPU: GeForce GTX 1050
- kernel-release: 3.10.0-1127.el7.x86_64
- Driver: Linux x64 (AMD64/EM64T) Display Driver Version 440.100
- CUDA: Version 10.2
Results:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   44C    P8    N/A /  N/A |      0MiB /  2002MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
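Preparation
Install dependencies
Three dependencies are required: gcc, kernel-devel, and dkms. Note that the kernel-devel version must match the version of the running kernel; otherwise later steps fail because files cannot be found.
Check the running kernel version:
$ sudo uname -r
3.10.0-1127.el7.x86_64
List the versions available for installation:
[root@host8 ~]# yum list | grep kernel-devel
kernel-devel.x86_64 3.10.0-957.1.3.el7 updates
After installing the kernel-devel rpm package, install the remaining dependencies:
yum -y install gcc dkms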
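The version requirement above can be checked with a small script before going further. The following is a minimal sketch; the helper name has_matching_kernel_devel is hypothetical, and it assumes the package list has the form printed by rpm -q kernel-devel (one kernel-devel-<release> name per line).

```shell
# Succeeds when a kernel-devel package matching the given kernel release
# appears in the supplied package list (one package name per line).
# has_matching_kernel_devel is a hypothetical helper name.
has_matching_kernel_devel() {
    local release="$1" packages="$2"
    # -x matches the whole line, -F treats the name as a literal string.
    printf '%s\n' "$packages" | grep -qxF "kernel-devel-${release}"
}

# Usage on the target machine (not executed here):
#   has_matching_kernel_devel "$(uname -r)" "$(rpm -q kernel-devel)" \
#       || echo "kernel-devel does not match $(uname -r)"
```

With the versions shown in these notes, the check would fail for the 3.10.0-957.1.3.el7 package from the repository and pass only for a package built for 3.10.0-1127.el7.x86_64.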
Determine the driver version
- First, import the public key
$ sudo rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
- Install ELRepo
$ sudo rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
- Install nvidia-detect
$ sudo yum install nvidia-detect
- Run nvidia-detect
$ sudo nvidia-detect -v
Probing for supported NVIDIA devices...
[10de:1c8d] NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile]
This device requires the current 440.64 NVIDIA driver kmod-nvidia
[8086:591b] Intel Corporation HD Graphics 630
An Intel display controller was also detected
The nvidia-detect output shows the GPU model and the required driver version, 440.64. Download the closest available version from the NVIDIA website; here that is Version 440.100.
PS: The command
$ sudo yum search kmod-nvidia
lists the drivers available in the repository, but downloading from the official website is the better option, since every installer version is available there.
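When this step is scripted, the version number can be pulled out of the nvidia-detect output automatically. The sketch below assumes the output uses exactly the wording shown in these notes ("requires the current ... NVIDIA driver"); other wordings, for example for legacy cards, are not handled. The function name required_driver_version is hypothetical.

```shell
# Prints the driver version named in nvidia-detect output read from stdin.
# Only handles the "requires the current <version> NVIDIA driver" wording.
# required_driver_version is a hypothetical helper name.
required_driver_version() {
    sed -n 's/.*requires the current \([0-9][0-9.]*\) NVIDIA driver.*/\1/p'
}

# Usage on the target machine (not executed here):
#   sudo nvidia-detect -v | required_driver_version
```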
Install the driver
Most users of NVIDIA graphics cards prefer to use the drivers provided by NVIDIA. These more fully support the capabilities of the card when compared to the nouveau driver that is included with the distribution. These are the steps to install the NVIDIA driver and disable the nouveau driver.
Prepare your machine
yum -y update
yum -y groupinstall "GNOME Desktop" "Development Tools"
yum -y install kernel-devel
Download the latest NVIDIA driver for Unix. ==> http://www.nvidia.com/object/unix.html ==> Latest Long Lived Branch version
Note: If you are using a recently released, top-end GTX or Tesla card, you may get better support from the Latest Short Lived Branch instead.
In order to have the NVIDIA drivers rebuilt automatically with future kernel updates you can also install the EPEL repository and the DKMS package. This is optional.
yum -y install epel-release
yum -y install dkms
Reboot your machine to make sure you are running the newest kernel.
Edit /etc/default/grub. Append the following to “GRUB_CMDLINE_LINUX”:
rd.driver.blacklist=nouveau nouveau.modeset=0
Generate a new grub configuration to include the above changes.
grub2-mkconfig -o /boot/grub2/grub.cfg
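The edit to /etc/default/grub can also be scripted so that it is safe to run more than once. This is only a sketch: the function name add_nouveau_blacklist is hypothetical, and it assumes GRUB_CMDLINE_LINUX is defined on a single line with a double-quoted value. Keeping a backup copy of the file before running it is advisable.

```shell
# Appends the nouveau blacklist arguments to GRUB_CMDLINE_LINUX in the given
# file, unless they are already present. Assumes a single-line, double-quoted
# value. add_nouveau_blacklist is a hypothetical helper name.
add_nouveau_blacklist() {
    local file="$1"
    local args='rd.driver.blacklist=nouveau nouveau.modeset=0'
    local tmp

    # Nothing to do when the arguments were added by an earlier run.
    if grep -q '^GRUB_CMDLINE_LINUX=.*rd\.driver\.blacklist=nouveau' "$file"; then
        return 0
    fi

    tmp=$(mktemp) || return 1
    # Insert the arguments just before the closing quote of the value, then
    # write the result back in place so ownership and permissions are kept.
    sed "s/^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"/\1 ${args}\"/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    local status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage as root (not executed here), followed by the grub2-mkconfig step:
#   add_nouveau_blacklist /etc/default/grub
```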
Edit/create /etc/modprobe.d/blacklist.conf and append:
blacklist nouveau
Back up your old initramfs and then build a new one:
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img
dracut /boot/initramfs-$(uname -r).img $(uname -r)
Reboot your machine.
If your machine doesn’t boot to a login prompt, disconnect your monitor from the graphics card and plug it directly into the onboard VGA port. Alternatively, SSH directly into the machine.
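Once the machine is back up, it is worth confirming that the nouveau module is really gone before starting the NVIDIA installer. A minimal sketch, assuming the standard lsmod layout where the first column is the module name; the function name nouveau_is_unloaded is hypothetical.

```shell
# Reads lsmod output on stdin and succeeds only when no module named
# "nouveau" is listed. nouveau_is_unloaded is a hypothetical helper name.
nouveau_is_unloaded() {
    # awk exits 0 when the module is found; the leading ! inverts that.
    ! awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on the target machine (not executed here):
#   lsmod | nouveau_is_unloaded || echo "nouveau is still loaded"
```

Matching on the first column avoids false positives from other modules that merely list nouveau in their "Used by" column.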
The NVIDIA installer will not run while X is running so switch to text mode:
systemctl isolate multi-user.target
Run the NVIDIA driver installer and enter yes to all options.
sh NVIDIA-Linux-x86_64-*.run
Reboot your machine
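After this reboot, the installed driver version can be compared against the one that was downloaded. The sketch below assumes the version string comes from nvidia-smi --query-gpu=driver_version --format=csv,noheader, which prints one line per GPU; only the first line is compared. The function name driver_version_matches is hypothetical.

```shell
# Succeeds when the first line of the reported version text equals the
# expected driver version. driver_version_matches is a hypothetical name.
#   $1: expected version, for example 440.100
#   $2: output of the nvidia-smi query shown in the usage comment
driver_version_matches() {
    local expected="$1" reported
    # Keep the first line only and strip surrounding whitespace.
    reported=$(printf '%s\n' "$2" | head -n 1 | tr -d '[:space:]')
    [ "$reported" = "$expected" ]
}

# Usage on the target machine (not executed here):
#   driver_version_matches 440.100 \
#       "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader)" \
#       || echo "unexpected driver version"
```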
Fix: nvidia-settings fails to launch under the X server.
$ nvidia-settings
ERROR: Unable to load info from any available system
Solution:
- Create /etc/X11/xorg.conf.d/11-nvidia.conf with contents
Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration" "true"
    Option "PrimaryGPU" "true"
EndSection
- and for GDM/GNOME, create two files named optimus.desktop, one in /etc/xdg/autostart/ and one in /usr/share/gdm/greeter/autostart/, containing
[Desktop Entry]
Type=Application
Name=Optimus
Exec=sh -c "xrandr --setprovideroutputsource modesetting NVIDIA-0; xrandr --auto"
NoDisplay=true
X-GNOME-Autostart-Phase=DisplayServer
Install CUDA
Download the latest CUDA Toolkit (runfile installer option, not the rpm download) ==> https://developer.nvidia.com/cuda-downloads ==> Linux ==> x86_64 ==> RHEL/CentOS ==> 7 ==> runfile (local)
Run the CUDA installer.
sh cuda_*.run
Say no to installing the NVIDIA driver. The standalone driver you already installed is typically newer than what is packaged with CUDA. Use the default option for all other choices.
To add CUDA to your environment add the following files.
Create /etc/profile.d/cuda.sh
PATH=$PATH:/usr/local/cuda/bin
export PATH
Create /etc/profile.d/cuda.csh
set path = ( $path /usr/local/cuda/bin )
Create /etc/ld.so.conf.d/cuda.conf
/usr/local/cuda/lib64
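The files above only take effect in a new login shell, and a new file under /etc/ld.so.conf.d/ is picked up only after ldconfig rebuilds the linker cache. The following sketch checks whether a directory is on a colon-separated search path; the function name path_contains is hypothetical, and the commands in the usage comment assume the default /usr/local/cuda install location used above.

```shell
# Succeeds when the colon-separated path list in $1 contains the directory $2
# as a complete entry. path_contains is a hypothetical helper name.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the target machine (not executed here):
#   sudo ldconfig                          # rebuild the linker cache
#   ldconfig -p | grep libcudart           # the CUDA runtime should be listed
#   path_contains "$PATH" /usr/local/cuda/bin && nvcc --version
```

Wrapping both strings in colons makes the match exact, so /usr/local/cuda/bin64 would not be mistaken for /usr/local/cuda/bin.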
Install cuDNN
Download the matching packages from the official website, then install them in the following order:
- runtime library:
$ sudo rpm -i libcudnn7-7.6.5.33-1.cuda10.2.x86_64.rpm
- developer library:
$ sudo rpm -i libcudnn7-devel-7.6.5.33-1.cuda10.2.x86_64.rpm
- doc and code samples:
$ sudo rpm -i libcudnn7-doc-7.6.5.33-1.cuda10.2.x86_64.rpm
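To confirm which cuDNN version ended up installed, the version macros in the cuDNN header can be read back. This is a sketch under assumptions: cuDNN 7 defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL in cudnn.h, but the header location /usr/include/cudnn.h in the usage comment is an assumption about where the devel rpm places it, and the function name cudnn_version is hypothetical.

```shell
# Prints MAJOR.MINOR.PATCHLEVEL as defined in the given cuDNN header file.
# Prints nothing when CUDNN_MAJOR is not defined in the file.
# cudnn_version is a hypothetical helper name.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Usage on the target machine (header path is an assumption):
#   cudnn_version /usr/include/cudnn.h
```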