Allocating multiple GPUs for a job

#1

Hello all,

We have encountered an issue where only one GPU (index 0) is allocated to jobs in cryoSPARC. This is not ideal for Heterogeneous Refinement, which is very memory dependent. We have two identical GPUs (0 and 1), and both are confirmed to be enabled, following the commands we found on this page.

So, is there any way of allocating both cards to a single job, given that both are identical, recognized and enabled?

Thank you very much in advance.


#2

Did you define CUDA_VISIBLE_DEVICES? The default cluster submission script (cluster_script.sh) contains the following setup code:

available_devs=""
for devidx in $(seq 0 15);
do
    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then
        if [[ -z "$available_devs" ]] ; then
            available_devs=$devidx
        else
            available_devs=$available_devs,$devidx
        fi
    fi
done
export CUDA_VISIBLE_DEVICES=$available_devs
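
To see what that loop ends up exporting on your machine, something like the following might help. It is only a rough sketch: `count_visible_devices` is a hypothetical helper I made up for illustration, not part of cryoSPARC, and it simply counts the entries in a comma-separated device list.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cryoSPARC): count the entries in a
# comma-separated, CUDA_VISIBLE_DEVICES-style string.
count_visible_devices() {
    local devs="$1"
    # An empty string lists no devices at all.
    if [[ -z "$devs" ]]; then
        echo 0
        return
    fi
    local -a parts
    # Split on commas into an array and print its length.
    IFS=',' read -r -a parts <<< "$devs"
    echo "${#parts[@]}"
}

# Report the value as seen by the current shell. An unset variable is
# shown as <unset>; in that case CUDA does not restrict the devices.
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES-<unset>}"
echo "devices listed: $(count_visible_devices "${CUDA_VISIBLE_DEVICES-}")"
```

If this reports a single entry while both cards are idle, the variable is probably being set somewhere else before the job starts.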