Hi all,
I am wondering if anyone has experience running CryoSPARC on virtual GPUs provided through MIG (Multi-Instance GPU).
So far I have not been able to get CryoSPARC to communicate with the MIG instances.
Best,
Tarek
Hi @tarek
Have you assigned a unique ID to each MIG device using its UUID, and then added the environment variables?
Cheers,
qitsweauca
Yes.
user@gpu:~$ nvidia-smi -L
GPU 0: A100-PCIE-40GB (UUID: GPU-90134985-26d8-39db-81bd-61f9a31864fe)
MIG 2g.10gb Device 0: (UUID: MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/3/0)
MIG 2g.10gb Device 1: (UUID: MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/4/0)
MIG 2g.10gb Device 2: (UUID: MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/5/0)
GPU 1: A100-PCIE-40GB (UUID: GPU-0d407aac-ca34-f203-ab5b-b57b784d5074)
And here is what cryosparcw gpulist reports with all three MIG UUIDs in CUDA_VISIBLE_DEVICES:
user@gpu:~$ CUDA_VISIBLE_DEVICES=MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/3/0,MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/4/0,MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/5/0 /srv/public/cryosparc/cryosparc_worker/bin/cryosparcw gpulist
Detected 1 CUDA devices.
id pci-bus name
---------------------------------------------------------------
0 0000:01:00.0 A100-PCIE-40GB MIG 2g.10gb
---------------------------------------------------------------
Am I missing something?
I just found out that this is a current limitation of CUDA 11: a CUDA application can only enumerate a single MIG instance. It seems we have to wait for future CUDA releases…
### [CUDA Device Enumeration](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/#cuda-visible-devices)
MIG supports running CUDA applications by specifying the CUDA device on which the application should be run. With CUDA 11, only enumeration of a single MIG instance is supported.
CUDA applications treat a CI and its parent GI as a single CUDA device. CUDA is limited to use a single CI and will pick the first one available if several of them are visible. To summarize, there are two constraints:
1. CUDA can only enumerate a single compute instance
2. CUDA will not enumerate non-MIG GPU if any compute instance is enumerated on any other GPU
Note that these constraints may be relaxed in future NVIDIA driver releases for MIG.
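Given the single-instance constraint quoted above, one possible workaround is to start a separate process per MIG instance, each with CUDA_VISIBLE_DEVICES set to exactly one MIG UUID. Below is a minimal Python sketch of that idea. It assumes the `nvidia-smi -L` output format shown earlier in this thread; the helper names (`mig_uuids`, `env_for_instance`) are hypothetical, and whether CryoSPARC can register workers this way is not confirmed here.

```python
import re

# Sample listing copied from the `nvidia-smi -L` output earlier in this thread.
SAMPLE_LISTING = """\
GPU 0: A100-PCIE-40GB (UUID: GPU-90134985-26d8-39db-81bd-61f9a31864fe)
MIG 2g.10gb Device 0: (UUID: MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/3/0)
MIG 2g.10gb Device 1: (UUID: MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/4/0)
MIG 2g.10gb Device 2: (UUID: MIG-GPU-90134985-26d8-39db-81bd-61f9a31864fe/5/0)
GPU 1: A100-PCIE-40GB (UUID: GPU-0d407aac-ca34-f203-ab5b-b57b784d5074)
"""

# Assumption: MIG lines look like "MIG <profile> Device <n>: (UUID: MIG-...)".
MIG_LINE = re.compile(r"^\s*MIG\s+\S+\s+Device\s+\d+:\s+\(UUID:\s+(MIG-[^)\s]+)\)")


def mig_uuids(listing):
    """Return the MIG instance UUIDs found in an `nvidia-smi -L` listing."""
    uuids = []
    for line in listing.splitlines():
        match = MIG_LINE.match(line)
        if match:
            uuids.append(match.group(1))
    return uuids


def env_for_instance(uuid, base_env):
    """Copy base_env and expose exactly one MIG instance to CUDA."""
    env = dict(base_env)
    env["CUDA_VISIBLE_DEVICES"] = uuid
    return env


# One environment per MIG instance; each would be used for its own process.
envs = [env_for_instance(u, {}) for u in mig_uuids(SAMPLE_LISTING)]
```

Each resulting dictionary could be merged with a copy of `os.environ` and passed as the `env` argument of `subprocess.Popen`, so that every process sees only its own instance. This does not lift the CUDA limitation; it only avoids listing several instances in one process.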
Has anyone gotten MIG to work with CUDA 12?
All the best
Michael