nvidia-container-cli: nvml error: driver not loaded
If you get this error when trying to use NVIDIA CUDA from Docker, one possible cause is a kernel update after which the DKMS driver module was not rebuilt for the new kernel.
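Before reinstalling, it may be worth confirming that the module really is missing for the running kernel. The following is a minimal sketch, assuming the driver was installed through DKMS and that `dkms status` prints one line per module and kernel, ending in `installed` when the build succeeded. `nvidia_dkms_ok` is a hypothetical helper name, not part of any NVIDIA or DKMS tool.

```shell
#!/bin/sh
# Sketch: does DKMS report an nvidia module installed for a given kernel?
# nvidia_dkms_ok is a hypothetical helper (assumption, not an official tool).
#   $1    : kernel release to look for, e.g. the output of `uname -r`
#   stdin : the output of `dkms status`
nvidia_dkms_ok() {
    grep -E '^nvidia' | grep -F -- "$1" | grep -q 'installed'
}

# Only run the live check when dkms is available on this machine.
if command -v dkms >/dev/null 2>&1; then
    kernel="$(uname -r)"
    if dkms status | nvidia_dkms_ok "$kernel"; then
        echo "DKMS reports an nvidia module installed for $kernel"
    else
        echo "DKMS reports no nvidia module installed for $kernel"
    fi
fi
```

If the second message is printed, the module was most likely not rebuilt after the kernel update, which matches the cause described above.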
You can reinstall the driver with:
dpkg -l 'nvidia-driver-*' | grep '^ii' # show which driver version is installed
sudo apt install --reinstall nvidia-driver-XXX # replace XXX with your version
sudo reboot
Test on the host and then from a container; both should print a table like the one below:
nvidia-smi
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
Mon Aug 21 17:52:23 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 960         Off | 00000000:01:00.0 Off |                  N/A |
|  0%   48C    P2              27W / 160W |    796MiB /  4096MiB |      6%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2921      C   frigate.detector.tensorrt                   176MiB |
|    0   N/A  N/A      2941      C   ffmpeg                                       93MiB |
|    0   N/A  N/A      2947      C   ffmpeg                                       63MiB |
|    0   N/A  N/A      2960      C   ffmpeg                                      153MiB |
|    0   N/A  N/A      2962      C   ffmpeg                                       63MiB |
|    0   N/A  N/A      2970      C   ffmpeg                                      118MiB |
|    0   N/A  N/A      2979      C   ffmpeg                                      118MiB |
+---------------------------------------------------------------------------------------+
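If nvidia-smi still fails after the reboot, checking whether the kernel module is loaded at all can narrow things down. This is a minimal sketch, assuming the module is listed under the name `nvidia` in `/proc/modules`. `nvidia_module_loaded` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: is the nvidia kernel module currently loaded?
# nvidia_module_loaded is a hypothetical helper (assumption, not an official tool).
#   $1 : optional path to a /proc/modules-style file (defaults to /proc/modules)
nvidia_module_loaded() {
    # Each line of /proc/modules starts with the module name followed by a space.
    grep -q '^nvidia ' "${1:-/proc/modules}" 2>/dev/null
}

if nvidia_module_loaded; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is not loaded; try: sudo modprobe nvidia"
fi
```

If `modprobe` cannot find the module either, the driver was probably not built for the running kernel, and the reinstall steps above still apply.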