Job failed, how to debug?

Hi cryoSPARC developers,

Thanks for rolling out v2! I'm looking forward to trying it out. However, my first impression is that the UI is not as intuitive as v1.

Right now, I am stuck at step 2 of the tutorial dataset. The full-frame motion correction job failed to run.

Command '['ssh', u'whuang@taylorGPU1.case.edu', 'nohup', u'/Data_HDD2/Programs/packages/cryosparc_v2/cryosparc2_worker/bin/cryosparcw run --project P2 --job J4 --master_hostname taylorGPU1.case.edu --master_command_core_port 39002 > /Data_SSD1/scratch/tutorials/J4/job.log 2>&1 & ']' returned non-zero exit status 255

Could you help me figure out what the problem is? Thanks! Please let me know if you need further information from me.

Thanks,
Wei

It turns out that I need to set up password-less ssh access even on a local workstation. Is this the right solution?

Hi Wei,
I am getting the same error. Did this fix the problem and if so what steps did you take to set up password-less access on your workstation?
Thanks,
Taylor

Hi Wei,

We’re glad you’re excited about the new release!

The installation procedure assumes your default shell is bash. Running a job may fail if it is not. Also, setting up ssh keys is a requirement between master and worker nodes.

You can find steps to do this in the installation docs, found here (step 5).
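For anyone landing here with the same exit status 255, here is a minimal sketch of the usual OpenSSH key setup. It is not taken from the installation docs; the function name `setup_passwordless_ssh` and the `id_ed25519` key path are assumptions for illustration, and it presumes the standard `ssh-keygen` and `ssh-copy-id` tools are installed. You can check your login shell first with `echo "$SHELL"`.

```shell
# Hypothetical helper: set up key-based ssh login to a target (USER@HOST).
# Usage: setup_passwordless_ssh USER@HOST
setup_passwordless_ssh() {
    target="$1"
    key="$HOME/.ssh/id_ed25519"   # assumed key location, adjust as needed

    # Make sure the .ssh directory exists with strict permissions.
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"

    # Generate a key pair with an empty passphrase only if none exists yet.
    [ -f "$key" ] || ssh-keygen -t ed25519 -N "" -f "$key"

    # Install the public key on the target (asks for the password one last time).
    ssh-copy-id -i "$key.pub" "$target"

    # Verify: BatchMode makes ssh fail instead of prompting for a password.
    ssh -o BatchMode=yes "$target" true
}

# Example, using the user and host from the error message above:
# setup_passwordless_ssh whuang@taylorGPU1.case.edu
```

If the final check returns without asking for a password, the master should be able to launch jobs on that worker over ssh, including when master and worker are the same machine.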

Thanks,
Stephan

Hi Stephan,

After using it for one day, I really like this new release. The interactive tuning for particle trimming is very convenient! Before this, I had to use awk to filter the star file in relion and go through many iterations. Now it is much more straightforward!

Also, thanks for the information on the installation! This problem has been fixed.

Thanks,
Wei

Hi Stephan,
I ran into a related issue when going through the tutorial. I did set up passwordless ssh, but the workstations are not using the standard ssh port (22). Is there a way to define which port cryoSPARC v2 should use to log into the nodes? The actual error is:

Launching job on lane default target epeius.qb3.berkeley.edu …
License is valid.
Running job on remote worker node hostname epeius.qb3.berkeley.edu
Failed to launch! 255
ssh: connect to host epeius.qb3.berkeley.edu port 22: Connection refused

Best,
Simon

Hi Simon,

I believe setting up your ssh config file is your best option. Take a look at this article I found for an easy walkthrough. This article might also be helpful.

Best,
Stephan

EDIT: We have now included instructions on how to do this in the installation docs, found at the end of this section.
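As a rough sketch of the ssh config approach: a `Host` entry with a `Port` line in `~/.ssh/config` makes every plain `ssh <host>` call use that port, which is why it works without passing any port to cryoSPARC. The helper name `add_ssh_port` is hypothetical, and 2222 is only a placeholder since the real port is not given in this thread.

```shell
# Hypothetical helper: append a Host entry with a custom port to an ssh config file.
# Usage: add_ssh_port HOST PORT [CONFIG_FILE]
add_ssh_port() {
    host="$1"
    port="$2"
    config="${3:-$HOME/.ssh/config}"   # defaults to the per-user ssh config

    mkdir -p "$(dirname "$config")"

    # Writes an entry of the form:
    #   Host <host>
    #       Port <port>
    printf 'Host %s\n    Port %s\n' "$host" "$port" >> "$config"

    # ssh ignores config files that are writable by other users.
    chmod 600 "$config"
}

# Example (replace 2222 with the port your sshd actually listens on):
# add_ssh_port epeius.qb3.berkeley.edu 2222
```

After that, `ssh epeius.qb3.berkeley.edu` should connect on the configured port with no extra flags. The entry must be in the config of the user that runs the cryoSPARC master.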

Hi Stephan,
Thanks a lot for that! It worked. Just for your information: I also tried specifying the '--sshstr' option with the correct port when connecting the worker to the master node, but this didn't solve the issue. Specifying the port in the ssh config file did.

Thanks,
Simon