NVIDIA GPU support for Docker on Windows
https://blog.csdn.net/chenxizhan1995/article/details/117855448
Important enough to say three times: Windows must be upgraded to build 21390 or later. Since Windows 11 has now been released, simply upgrading to Windows 11 is enough.
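A quick way to check the build requirement from a script is sketched below. It is only a minimal sketch: the helper names `parse_build` and `build_is_sufficient` are made up for illustration, and it assumes that `platform.version()` on Windows returns a string of the form `10.0.<build>`.

```python
import platform

MIN_BUILD = 21390  # minimum build quoted in this note


def parse_build(version: str) -> int:
    """Return the build number from a version string such as '10.0.22000'."""
    return int(version.split(".")[2])


def build_is_sufficient(version: str, minimum: int = MIN_BUILD) -> bool:
    """True if the build in `version` is at least `minimum`."""
    return parse_build(version) >= minimum


if __name__ == "__main__":
    if platform.system() == "Windows":
        v = platform.version()  # assumed to look like '10.0.22000'
        print(v, "OK" if build_is_sufficient(v) else "needs upgrade")
    else:
        print("Not running on Windows; nothing to check.")
```

The comparison is kept separate from the `platform` call so it can be tried with any version string, for example `build_is_sufficient("10.0.22000")`.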
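Also, running docker run --gpus all -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.03-tf2-py3
prints the warning below saying that no GPU was found. According to the article at https://qiita.com/ksasaki/items/ee864abd74f95fea1efa, however, this is a false alarm: the GPU is in fact recognized, so the message can be ignored.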
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
At least when I run docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
the RTX 3060 Laptop GPU is detected correctly. Judging by the benchmark, it is about twice as fast as a 1060.
Run "nbody -benchmark [-numbodies=
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=
-device=
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Ampere" with compute capability 8.6
> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3060 Laptop GPU]
30720 bodies, total time for 10 iterations: 26.040 ms
= 362.407 billion interactions per second
= 7248.132 single-precision GFLOP/s at 20 flops per interaction
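
The last three lines of the output can be cross-checked with a little arithmetic. The sketch below pulls the device name and timing out of the text and recomputes the throughput. It assumes each iteration evaluates bodies × bodies interactions; that is my assumption, not something the log states, but it reproduces the printed figures. The function names are made up for illustration.

```python
import re

# Lines copied from the benchmark output above.
SAMPLE = """\
> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3060 Laptop GPU]
30720 bodies, total time for 10 iterations: 26.040 ms
= 362.407 billion interactions per second
= 7248.132 single-precision GFLOP/s at 20 flops per interaction
"""


def parse_nbody(text):
    """Extract device name, body count, iteration count and total time."""
    device = re.search(r"CUDA device: \[(.+?)\]", text)
    run = re.search(
        r"(\d+) bodies, total time for (\d+) iterations: ([\d.]+) ms", text
    )
    if run is None:
        raise ValueError("no benchmark line found")
    return {
        "device": device.group(1) if device else None,
        "bodies": int(run.group(1)),
        "iterations": int(run.group(2)),
        "total_ms": float(run.group(3)),
    }


def recompute(bodies, iterations, total_ms, flops_per_interaction=20):
    """Return (billion interactions/s, GFLOP/s), assuming bodies**2 per iteration."""
    interactions_per_s = bodies * bodies * iterations / (total_ms / 1000.0)
    gflops = interactions_per_s * flops_per_interaction / 1e9
    return interactions_per_s / 1e9, gflops


info = parse_nbody(SAMPLE)
billions, gflops = recompute(info["bodies"], info["iterations"], info["total_ms"])
print(info["device"], billions, gflops)
```

The recomputed values match the printed 362.407 and 7248.132 to within about 0.01%. The small gap is consistent with the reported time being rounded to 26.040 ms.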