Enable multiple GPUs for a single job on a workstation

Hi,

On my workstation:
cryoSPARC v2.12.4
OS: CentOS 7.5

I set the env variable:

[cryosparc_user@dubochet ~]$ export -p
declare -x CUDA_VISIBLE_DEVICES="0,1"

I then checked the GPUs configured for cryoSPARC to make sure:

In [1]: {t['hostname']:t['resource_slots']['GPU'] for t in cli.get_scheduler_targets()}
Out[1]: {u'dubochet': [0, 1]}

However, the Queue dialog does not display a list of GPUs to select from. It only says “GPUs: 2”.

As a result, I cannot specify multiple GPUs for a job, and the job runs only on the default GPU 0. Could anyone show me how to allocate multiple GPUs to a single job (Refinement)? Can multiple GPUs pool their memory to handle a large box size (similar to RELION)?
Thanks.
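
The two checks in the post above (the CUDA_VISIBLE_DEVICES value and the GPU slots reported by cli.get_scheduler_targets()) can be compared in a few lines. This is only a sketch in Python 3: sample_targets is a hypothetical, trimmed stand-in for the real return value, the helper names are made up for illustration, and only the two keys shown in the post ('hostname' and 'resource_slots') are assumed. It compares counts rather than IDs, because CUDA renumbers the visible devices from 0 inside a process.

```python
def parse_visible_devices(value):
    """Parse a CUDA_VISIBLE_DEVICES string such as "0,1" into [0, 1].

    Assumption: the variable holds integer indices only (no GPU UUIDs).
    """
    return [int(tok) for tok in value.split(",") if tok.strip()]


def gpu_slots_by_host(targets):
    """Map each hostname to its GPU slot list, as in the In [1] snippet."""
    return {t["hostname"]: t["resource_slots"]["GPU"] for t in targets}


def count_mismatches(targets, visible_devices):
    """Return {hostname: (n_slots, n_visible)} for hosts whose counts differ."""
    n_visible = len(parse_visible_devices(visible_devices))
    return {
        host: (len(slots), n_visible)
        for host, slots in gpu_slots_by_host(targets).items()
        if len(slots) != n_visible
    }


# Hypothetical, trimmed stand-in for cli.get_scheduler_targets(); real
# entries carry more fields than the two used here.
sample_targets = [{"hostname": "dubochet", "resource_slots": {"GPU": [0, 1]}}]

print(gpu_slots_by_host(sample_targets))
print(count_mismatches(sample_targets, "0,1"))
```

An empty result from count_mismatches means the scheduler target and the environment agree on how many GPUs exist, which matches the situation described here: the GPUs are registered, and the problem lies elsewhere.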

Hi @mizz_em,

Homogeneous Refinement uses only one GPU, so you cannot allocate more than one GPU to this job.

On the other hand, it looks like your “Run on specific GPU” page is missing your GPU information. Can you go to a terminal and run cryosparcm cli "get_gpu_info()" && cryosparcm log command_core and paste the output?

Hi, I have a similar problem so I’m jumping in here. I see the same issue: I can’t schedule to a GPU even though cryoSPARC seems to see all 4 of our GPUs. We get the same dialog, with 4 GPUs recognized on our workstation but no checkboxes, and the “You need to select…” error message.

cryosparcm cli "get_gpu_info()" returns “None” so I assume that’s the start of my problem?

Thanks for the help!
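
The check described above, where cryosparcm cli "get_gpu_info()" prints None, can be scripted so that an empty answer is flagged automatically. The following is a rough Python 3 sketch, not part of cryoSPARC: the helper names are hypothetical, and the only thing taken from the thread is the command itself and the fact that a failing call printed None. The format of a successful reply is not shown in the thread, so the code treats any other non-empty text as "something came back".

```python
import shutil
import subprocess


def query_gpu_info(executable="cryosparcm"):
    """Return the raw text printed by `<executable> cli "get_gpu_info()"`.

    Returns None when the executable is not on PATH, so the caller can
    tell "could not ask" apart from "asked and got an empty answer".
    """
    if shutil.which(executable) is None:
        return None
    proc = subprocess.run(
        [executable, "cli", "get_gpu_info()"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


def gpu_info_missing(raw):
    """True if the reply is absent, blank, or the literal text None."""
    return raw is None or raw.strip() in ("", "None")


if __name__ == "__main__":
    reply = query_gpu_info()
    if gpu_info_missing(reply):
        print("No GPU information returned")
    else:
        print(reply)
```

The second command from the thread, cryosparcm log command_core, is left out on purpose: it may keep following the log instead of exiting, which would block a simple subprocess.run call.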

Hi @fredrward,

Can you send screenshots of the page that doesn’t show checkboxes? Please note you’ll still be able to run GPU jobs by queuing to lanes instead. Can you also run cryosparcm cli "get_gpu_info()" && cryosparcm log command_core and send the output?

Thanks for getting back to us, sorry for the delay.

Here’s the screen showing 4 GPUs but no way to select them:

Here are the outputs you requested. Thank you!
cryosparcm cli "get_gpu_info()"
None
cryosparcm log command_core

[POST-RESPONSE-THREAD ERROR  2020-01-20 15:07:49.103059  at  get_gpu_info_run ]
-----------------------------------------------------
Traceback (most recent call last):
  File "cryosparc2_command/command_core/__init__.py", line 145, in run
    self.target(*self.args)
  File "cryosparc2_command/command_core/__init__.py", line 872, in get_gpu_info_run
    value = subprocess.check_output(full_command, stderr=subprocess.STDOUT)
  File "/EM/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/subprocess.py", line 216, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/EM/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/EM/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
-----------------------------------------------------
**custom thread exception hook caught something
**** handle exception rc
Traceback (most recent call last):
  File "cryosparc2_compute/jobs/runcommon.py", line 1490, in run_with_except_hook
    run_old(*args, **kw)
  File "cryosparc2_command/command_core/__init__.py", line 145, in run
    self.target(*self.args)
  File "cryosparc2_command/command_core/__init__.py", line 872, in get_gpu_info_run
    value = subprocess.check_output(full_command, stderr=subprocess.STDOUT)
  File "/EM/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/subprocess.py", line 216, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/EM/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/EM/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
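
The OSError: [Errno 2] in the traceback above is raised inside subprocess.check_output, which is what Python reports when the program it was asked to launch cannot be found (a missing working directory can produce the same error). The log does not show which command was being launched, so the sketch below uses a made-up executable name purely to reproduce the failure mode. It is written for Python 3, where the same condition surfaces as FileNotFoundError, a subclass of OSError; the helper name is hypothetical.

```python
import errno
import shutil
import subprocess
import sys


def check_output_or_explain(argv):
    """Run argv and return (output, None), or (None, reason) if it cannot start."""
    try:
        out = subprocess.check_output(argv, stderr=subprocess.STDOUT)
        return out.decode("utf-8", "replace"), None
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            # Same errno as "[Errno 2] No such file or directory" in the log.
            location = shutil.which(argv[0])
            return None, "executable not found: %s (PATH lookup gave %r)" % (
                argv[0],
                location,
            )
        raise


if __name__ == "__main__":
    # A command that exists: the running Python interpreter itself.
    print(check_output_or_explain([sys.executable, "-c", "print('ok')"]))
    # A deliberately non-existent, hypothetical executable name.
    print(check_output_or_explain(["no-such-gpu-query-tool"]))
```

Checking shutil.which for the expected tool from the same user account and environment that runs the service is a quick way to confirm or rule out this kind of failure.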

Hi @fredrward,

cryoSPARC v2.13 includes some fixes to the get_gpu_info() function. Can you update and let us know if the problem persists?

Hi – thanks for the quick update. Things seem to be working as expected now!
