Posts tagged: passthrough

NVIDIA GPU ‘passthrough’ to lxc containers on Proxmox 6 for NVENC in Plex

By , 2020-12-06 20:19

I’ve found multiple guides on how to enable NVIDIA GPU access from lxc containers, however I had to combine the information from multiple sources to get a fully working setup. Here are the steps that worked for me.

  1. Install dkms on your Proxmox host to ensure the nvidia driver can be auto-updated with new kernel versions.
    # apt install dkms
  2. Head over to https://github.com/keylase/nvidia-patch and get the latest supported Nvidia binary driver version listed there.
  3. Download the nvidia-patch repo
    git clone https://github.com/keylase/nvidia-patch.git
  4. Install the driver from step 2 on the host.
    For example, ./NVIDIA-Linux-x86_64-455.45.01.run
  5. Run the nvidia-patch/patch.sh script on the host.
  6. Install the same driver in each container that needs access to the Nvidia GPU, but without the kernel module.
    ./NVIDIA-Linux-x86_64-455.45.01.run --no-kernel-module
  7. Run the nvidia-patch/patch.sh script on the lxc container.
  8. On the host, create a script to initialize the nvidia-uvm devices. Normally these are created on the fly when a program such as ffmpeg calls upon the GPU, but since we need to pass the device nodes through to the containers, they must exist before the containers are started.

    I saved the following script as /usr/local/bin/nvidia-uvm-init. Make sure to chmod +x !
#!/bin/bash
## Script to initialize nvidia device nodes.
## https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-verifications
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`
  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done
  mknod -m 666 /dev/nvidiactl c 195 255
else
  exit 1
fi
/sbin/modprobe nvidia-uvm
if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
  mknod -m 666 /dev/nvidia-uvm c $D 0
  mknod -m 666 /dev/nvidia-uvm-tools c $D 0
else
  exit 1
fi

Next, we create the following two systemd service files to start this script, and the nvidia-persistenced:

/usr/local/lib/systemd/system/nvidia-uvm-init.service

# nvidia-uvm-init.service
# loads nvidia-uvm module and creates /dev/nvidia-uvm device nodes
[Unit]
Description=Runs /usr/local/bin/nvidia-uvm-init
[Service]
ExecStart=/usr/local/bin/nvidia-uvm-init
[Install]
WantedBy=multi-user.target

/usr/local/lib/systemd/system/nvidia-persistenced.service

# NVIDIA Persistence Daemon Init Script
#
# Copyright (c) 2013 NVIDIA Corporation
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
#
# This is a sample systemd service file, designed to show how the NVIDIA
# Persistence Daemon can be started.
#
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target
[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user nvidia-persistenced
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
[Install]
WantedBy=multi-user.target

Next, symlink the two service definition files into /etc/systemd/system

# cd /etc/systemd/system
# ln -s /usr/local/lib/systemd/system/nvidia-uvm-init.service
# ln -s /usr/local/lib/systemd/system/nvidia-persistenced.service

and load the services

# systemctl daemon-reload
# systemctl start nvidia-uvm-init.service
# systemctl start nvidia-persistenced.service

Now you should see all the nvidia device nodes have been created
# ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 Dec 6 18:07 /dev/nvidia0
crw-rw-rw- 1 root root 195, 1 Dec 6 18:10 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Dec 6 18:07 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Dec 6 18:12 /dev/nvidia-modeset
crw-rw-rw- 1 root root 511, 0 Dec 6 19:00 /dev/nvidia-uvm
crw-rw-rw- 1 root root 511, 0 Dec 6 19:00 /dev/nvidia-uvm-tools


/dev/nvidia-caps:
total 0
cr-------- 1 root root 236, 1 Dec 6 18:07 nvidia-cap1
cr--r--r-- 1 root root 236, 2 Dec 6 18:07 nvidia-cap2

Check the dri devices as well
# ls -l /dev/dri*
total 0
drwxr-xr-x 2 root root 100 Dec 6 17:00 by-path
crw-rw---- 1 root video 226, 0 Dec 6 17:00 card0
crw-rw---- 1 root video 226, 1 Dec 6 17:00 card1
crw-rw---- 1 root render 226, 128 Dec 6 17:00 renderD128

Take note of the first number of each device after the group name. In the listings above I have 195, 511, 236 and 226.

Now we need to edit the lxc container configuration file to pass through the devices. Shut down your container, then edit the config file – example /etc/pve/lxc/117.conf. The relevant lines are below the swap: 8192 line

arch: amd64
cores: 12
features: mount=cifs
hostname: plex
memory: 8192
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.1.1,hwaddr=4A:50:52:00:00:00,ip=192.168.1.122/24,type=veth
onboot: 1
ostype: debian
rootfs: local-lvm:vm-117-disk-0,size=250G,acl=1
startup: order=99
swap: 8192
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 226:* rwm
lxc.cgroup.devices.allow: c 236:* rwm
lxc.cgroup.devices.allow: c 511:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=dir
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

Now, start your container back up. You should be able to use NVENC features. You can test by using ffmpeg:
$ ffmpeg -i dQw4w9WgXcQ.mp4 -c:v h264_nvenc -c:a copy /tmp/rickroll.mp4

You should now have working GPU transcode in your lxc container!

If you get the following error, recheck and make sure you have set the correct numeric values for lxc.cgroup.devices.allow and restart your container.

[h264_nvenc @ 0x559f2a536b40] Cannot init CUDA
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width
or height
Conversion failed!

Another way to tell the values are incorrect is having blank (———) permission lines for the nvidia device nodes. You will get this inside any containers that are started before the nvidia devices are initialized by the nvidia-uvm-init script on the host.

$ ls -l /dev/nvidia*
---------- 1 root root        0 Dec  6 18:04 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Dec  6 19:02 /dev/nvidiactl
---------- 1 root root        0 Dec  6 18:04 /dev/nvidia-modeset
crw-rw-rw- 1 root root 511,   0 Dec  6 19:02 /dev/nvidia-uvm
crw-rw-rw- 1 root root 511,   1 Dec  6 19:02 /dev/nvidia-uvm-tools

Sometimes, after the host has been up for a long time, the /dev/nvidia-uvm or other device nodes may disappear. In this case, simply run the nvidia-uvm-init script, perhaps schedule it to run as a cron job.

Custom theme by me. Based on Panorama by Themocracy