Cannot Use Queue Modal to Queue to a specific GPU

closed

#1

I get the same error when using multiple GPUs. Jobs finish when I use only one GPU.
I also can’t select a specific GPU.


[v2.12.2] Patch Motion/Patch CTF job error at the last two images
#2

Hi @FengjiangLiu,

It seems like you can’t select a specific GPU because the GPU information is not populated in the database. The function that populates this information runs automatically when you start cryoSPARC. If the function fails for some reason, it fails silently. You can re-run the function yourself and inspect the error log (by monitoring cryosparcm log command_core).

In a shell, run: cryosparcm cli "get_gpu_info()" && cryosparcm log command_core

You might see a traceback. If you know what the problem is, go ahead and fix it; otherwise post it here and I can suggest some next steps.
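The log can be long, so it may help to isolate the most recent traceback before posting it. The following is only a sketch: last_traceback is a hypothetical helper (not part of cryoSPARC), and it assumes the log text contains standard Python tracebacks and has been saved or piped from wherever you captured it.

```shell
# Hypothetical helper (not a cryoSPARC command): read log text on stdin and
# print everything from the LAST Python traceback header to the end of input.
last_traceback() {
    awk '
        # Each new traceback header resets the buffer, so only the last one survives.
        /Traceback \(most recent call last\):/ { buf = ""; keep = 1 }
        keep { buf = buf $0 "\n" }
        END { printf "%s", buf }
    '
}

# Example usage, assuming the log output was saved to a file named
# command_core.log (the file name is an assumption for illustration):
#   last_traceback < command_core.log
```

If the input contains no traceback at all, the helper prints nothing.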


#3

I ran the command you gave me, but I don’t know how to fix the error. Could you give me some advice?


#4

Hi @FengjiangLiu,

Looks like some sort of SSH error. From that machine, if you run ssh spuser@spgpu, do you get a Host Verification request? Or any other type of error?
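One way to check this without being prompted is to run ssh in batch mode, which fails instead of asking for a password or a host-key confirmation. This is a minimal sketch: check_passwordless_ssh is a hypothetical helper name, and spuser@spgpu is just the target from this thread.

```shell
# Hypothetical helper: report whether non-interactive SSH to a target works.
# BatchMode=yes disables password and confirmation prompts, so ssh exits
# with an error instead of waiting for input.
check_passwordless_ssh() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null; then
        echo "ok: non-interactive SSH to $1 works"
    else
        echo "fail: non-interactive SSH to $1 did not succeed"
        return 1
    fi
}

# Example usage:
#   check_passwordless_ssh spuser@spgpu
```

A failure here does not say why the connection failed (missing key, unknown host key, or an unreachable host); running the plain ssh command interactively shows the actual prompt or error.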


#5

Hi, @sarulthasan


Same result as before.

I noticed that when I upgraded cryoSPARC, a lot of older software was upgraded too. Is it possible that something changed in the GPU setup or in GPU detection?


#6

Hi @sarulthasan

I have the same problem on my ‘standalone’ machine. Running ssh user@hostname leads to the host verification request.


#7

Hi @FengjiangLiu,

You need password-less SSH access to this machine for the function to work properly.

Ensure that SSH keys are set up for the cryosparc_user account to SSH between the master node and the worker node without a password. From https://cryosparc.com/docs/reference/install/#remote-access:

Set up SSH keys for password-less access (only if you currently need to enter your password each time you ssh into the compute node).

  1. If you do not already have SSH keys generated on your local machine, use ssh-keygen to do so. Open a terminal prompt, and enter:

    ssh-keygen -t rsa -N "" -f $HOME/.ssh/id_rsa
    

    Note: this will create an RSA key-pair with no passphrase in the default location.

  2. Copy the RSA public key to the remote compute node for password-less login:

    ssh-copy-id remote_username@remote_hostname
    

    Note: remote_username and remote_hostname are your username and the hostname that you use to SSH into your compute node. This step will ask for your password.
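The two steps above can be sketched as a small script that avoids overwriting an existing key. This is an illustration only: ensure_ssh_key is a hypothetical helper, and remote_username@remote_hostname is the same placeholder used in the steps above.

```shell
# Hypothetical helper: create an RSA key pair with no passphrase only if the
# key file does not exist yet. Defaults to the standard location.
ensure_ssh_key() {
    key_path="${1:-$HOME/.ssh/id_rsa}"
    if [ -f "$key_path" ]; then
        echo "exists: $key_path"
    else
        ssh-keygen -t rsa -N "" -f "$key_path" >/dev/null && echo "created: $key_path"
    fi
}

# Example usage (run as the cryosparc_user account on the master node;
# ssh-copy-id will ask for the remote password once):
#   ensure_ssh_key && ssh-copy-id remote_username@remote_hostname
```

After this, ssh remote_username@remote_hostname should log in without asking for a password.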


#8

Hi @hxn,

Thanks for reporting this. We’ll add a fix to ensure SSH is not used on standalone instances.


#9

Hi @sarulthasan

I can select a specific GPU now. Thanks a lot.

But there is still a problem when using multiple GPUs in Patch Motion Correction.

And when I run cryosparcm cli "get_gpu_info()" && cryosparcm log command_core, the output looks like this.


#10

Hi @FengjiangLiu,

Glad you were able to get the GPU queuing to work. The other error was caused by a bug in cryoSPARC v2.12.0 and v2.12.2. A patch (v2.12.4) has been released that fixes this issue, as well as a few others. To update, run cryosparcm update.
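To decide whether an instance already includes this patch, the installed version can be compared against v2.12.4. The sketch below is an assumption-laden illustration: version_lt and has_fix are hypothetical helpers, the version string is passed in by hand rather than read from cryoSPARC, and sort -V (version sort) is assumed to be available, as it is with GNU coreutils.

```shell
# Hypothetical helper: succeed when version $1 sorts strictly before $2.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: report whether a version is at or above v2.12.4,
# the patch release named in this thread. A leading "v" is stripped.
has_fix() {
    v="${1#v}"
    if version_lt "$v" "2.12.4"; then
        echo "predates v2.12.4: run cryosparcm update"
    else
        echo "includes the v2.12.4 fix"
    fi
}

# Example usage:
#   has_fix v2.12.2
```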


#11

Thank you very much! I’m going to try!