NVIDIA GPU support for Docker on Windows

https://blog.csdn.net/chenxizhan1995/article/details/117855448

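Important enough to say three times: you need Windows build 21390 or later, build 21390 or later, build 21390 or later. (Now that Windows 11 has been released, upgrading to Windows 11 is enough.)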
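
The build requirement can be checked with a small script. This is a minimal sketch, not an official check. It assumes that `platform.version()` returns a string of the form `10.0.<build>` on Windows, and the helper names `build_number` and `build_is_sufficient` are made up for illustration.

```python
import platform

# Threshold quoted in this note; treat it as the note's claim, not an official value.
MIN_BUILD = 21390


def build_number(version: str) -> int:
    """Extract the build from a version string such as '10.0.22000'."""
    return int(version.split(".")[2])


def build_is_sufficient(version: str, minimum: int = MIN_BUILD) -> bool:
    """Return True when the build in `version` is at least `minimum`."""
    return build_number(version) >= minimum


if __name__ == "__main__":
    if platform.system() == "Windows":
        current = platform.version()
        verdict = "OK" if build_is_sufficient(current) else "too old"
        print(f"Windows version {current}: {verdict}")
    else:
        print("Not running on Windows; nothing to check.")
```

Run it in a Windows terminal (not inside WSL or a container), since it inspects the host it runs on.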

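Also, running docker run --gpus all -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.03-tf2-py3

prints the warning below, which claims that no GPU driver was found. According to an article at https://qiita.com/ksasaki/items/ee864abd74f95fea1efa this is a false alarm and the GPU is in fact recognized, so the message can be ignored.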

WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.

At least when I run docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
the RTX 3060 laptop GPU is recognized correctly. Judging by the benchmark, it is about twice as fast as the 1060.
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Ampere" with compute capability 8.6

> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3060 Laptop GPU]
30720 bodies, total time for 10 iterations: 26.040 ms
= 362.407 billion interactions per second
= 7248.132 single-precision GFLOP/s at 20 flops per interaction
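
The last three lines of the output can be recomputed from the body count, iteration count, and elapsed time. The sketch below pulls the GPU name and timing out of the pasted output and redoes the arithmetic. It assumes an all-pairs simulation, so each iteration evaluates bodies × bodies interactions. The function names `parse_nbody` and `throughput` are invented for this example and are not part of the CUDA sample.

```python
import re

# Two lines copied from the benchmark output above.
SAMPLE = """\
> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3060 Laptop GPU]
30720 bodies, total time for 10 iterations: 26.040 ms
"""


def parse_nbody(output):
    """Return (gpu_name, bodies, iterations, milliseconds), or None if not found."""
    device = re.search(r"CUDA device: \[(.+?)\]", output)
    timing = re.search(
        r"(\d+) bodies, total time for (\d+) iterations: ([\d.]+) ms", output
    )
    if device is None or timing is None:
        return None
    return (
        device.group(1),
        int(timing.group(1)),
        int(timing.group(2)),
        float(timing.group(3)),
    )


def throughput(bodies, iterations, millis, flops_per_interaction=20):
    """Return (billion interactions per second, GFLOP/s).

    Assumes every body interacts with every body once per iteration.
    """
    interactions_per_s = bodies * bodies * iterations / (millis / 1000.0)
    gflops = interactions_per_s * flops_per_interaction / 1e9
    return interactions_per_s / 1e9, gflops


if __name__ == "__main__":
    name, bodies, iterations, millis = parse_nbody(SAMPLE)
    billions, gflops = throughput(bodies, iterations, millis)
    print(f"{name}: {billions:.1f} billion interactions/s, {gflops:.0f} GFLOP/s")
```

The recomputed values differ from the printed ones only in the last digits, because the elapsed time in the log is rounded to three decimal places. If `parse_nbody` returns `None` for a container's output, the sample did not report a CUDA device.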