Extensive workflow - Import movies fails

Dear all,

I just tried to run the Extensive Workflow as suggested on:


Installed cryoSPARC version is 2.14.2. I did not change any of the default settings. The worker and master are on the same workstation. Ubuntu OS.

I started the workflow and it fails after a couple of seconds at the first job, Import movies:
The error is:

    [CPU: 195.2 MB]  Importing movies from /bulk0/data/EMPIAR/10025/data/empiar_10025_subset/*.tif
    [CPU: 195.2 MB]  Traceback (most recent call last):
    File "cryosparc2_master/cryosparc2_compute/run.py", line 82, in cryosparc2_compute.run.main
    File "cryosparc2_compute/jobs/imports/run.py", line 463, in run_import_movies_or_micrographs
    assert len(all_abs_paths) > 0, "No files match!"
    AssertionError: No files match!

It should download some data automatically but that does not seem to happen.
Any ideas what could be the issue?

If you require any additional logs or something I am happy to provide those.

Thank you!
Matic

Hey @eMKiso,

I believe you can get the job to download the movies automatically by deleting the text in the fields for both paths. Can you let me know if that works?

Edit: The automatic download will be available within a few days in the upcoming version of cryoSPARC, my apologies for the confusion.

In the meantime, you can download and extract the test data manually as per Step 3 here:

Then be sure to specify the movies and gain reference paths in the Extensive Workflow job description.
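
For example, roughly along these lines (the archive name and the /bulk0/... destination below are just placeholders; use whatever Step 3 gives you and wherever you keep your raw data):

    # unpack the downloaded test archive into a data directory of your choice
    mkdir -p /bulk0/data/EMPIAR/10025/data
    tar -xf empiar_10025_subset.tar -C /bulk0/data/EMPIAR/10025/data
    # the movies path in the workflow description is then the .tif glob:
    #   /bulk0/data/EMPIAR/10025/data/empiar_10025_subset/*.tif
    # and the gain reference path points at the gain .mrc shipped with the data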

Hi,

Yeah, the empty paths didn't work; there was a different error.

I actually downloaded the data an hour ago and ran the workflow with the downloaded files. I was not sure exactly which file to download, so I downloaded ‘ftp://ftp.ebi.ac.uk/empiar/world_availability/10025/data/14sep05c_aligned_196/14sep05c_c_00003gr_00014sq_00002hl_00005es_st.mrc’ since I could not find any .tif files as suggested by the default path.
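
Roughly like this, in case it matters:

    # fetch a single aligned stack from the EMPIAR-10025 FTP area
    wget ftp://ftp.ebi.ac.uk/empiar/world_availability/10025/data/14sep05c_aligned_196/14sep05c_c_00003gr_00014sq_00002hl_00005es_st.mrc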

3 jobs finish successfully:

  1. Import
  2. Full-frame motion (M)
  3. CTF Estimation (CTFFIND4)

At the fourth job (Manually Curate Exposures) the workflow fails. The interactive job starts, but at this point the ‘master job / workflow’ fails with this error:

    [CPU: 94.9 MB]  Traceback (most recent call last):
    File "cryosparc2_master/cryosparc2_compute/run.py", line 82, in cryosparc2_compute.run.main
    File "cryosparc2_compute/jobs/workflows/buildrun_bench.py", line 174, in run_extensive_workflow
    assert counts['total'] == 20
    AssertionError

It fails before I get the chance to even load/open the interactive job.
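
In hindsight, a quick sanity check would have been to count the movies in the import directory, since the assertion above suggests the workflow expects exactly 20 exposures (the path below is just the default from the job description):

    # count the imported movies; the workflow apparently expects exactly 20
    ls /bulk0/data/EMPIAR/10025/data/empiar_10025_subset/*.tif | wc -l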

I am now downloading the archive with the cryosparcm downloadtest command and I’ll try with that.
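
That is, roughly the following, run on the master node (the archive name below is a guess; I'll use whatever the command actually fetches):

    # download the bundled test dataset via the master's CLI
    cryosparcm downloadtest
    # extract the archive and point the import path at the resulting *.tif files
    tar -xf empiar_10025_subset.tar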

OK, I tried now with the data that was downloaded by cryoSPARC and it went successfully past ‘Manually Curate Exposures’. So the error above was caused by the data present not being what the workflow expected…

It seems to be running fine now. I will report back if there are any additional errors.
If not, this is it; thank you @stephan very much for the quick information!

Best,
Matic

Hey @eMKiso,

Thanks for updating! Yes, the Extensive Workflow job only works with the movies from the cryosparcm downloadtest dataset (which are a 20-movie subset of EMPIAR-10025 converted to .tif files).

Hi all,
So, I am back to this after an upgrade to 2.15.
The Extensive Workflow now works great on a single workstation. Good work!

Now we are facing a different problem when running Extensive Workflow for T20s (BENCH) (BETA) on a cluster.
It works fine until the ‘Extract From Micrographs’ step, where it fails. It just ‘hangs’: no errors, and it stays like this for hours. The log from the interface is here:

Previous steps where GPUs are used seem to work fine.
If I clone the ‘Extract From Micrographs’ job that is created during the Extensive Workflow and run it with the number of GPUs set to ‘0’, it finishes without any errors.

The GPUs on the cluster are Nvidia Tesla V100, Drivers 435.21, CUDA 10.1

Any ideas?

Hi @eMKiso, can you share the output of the internal job log for the Extract job? You can get this from the command line with this command:

cryosparcm joblog PXX JXX

Substitute PXX and JXX with the project and job numbers for the Extract job. Paste the full output here. You can press Control + C to stop the joblog.

Hi,

sure, below you can find the log. It is from the same job as the output in the previous post. I killed the job manually after some time.

[cryosparc@rm]$ cryosparcm joblog P21 J49

========= CRYOSPARCW =======  2020-10-07 10:22:33.070194  =========
Project P21 Job J49
Master rm Port 39002
===========================================================================
========= monitor process now starting main process
MAINPROCESS PID 23756
========= monitor process now waiting for main process
MAIN PID 23756
extract.run cryosparc2_compute.jobs.jobregister
***************************************************************
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
(... the heartbeat line repeats for the duration of the hang ...)
^C

Hi @eMKiso, thanks for sending that. Based on this, it definitely looks like the worker is getting stuck when it attempts to access the GPU. Could you send the job submission script in the Extract Job directory? It should be in the project directory at PXX/JXX/queue_sub_script.sh

Can you also send a screenshot of the cryoSPARC web interface with the full workspace containing the hanging Extract job open?
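
It may also be worth confirming that a job on the same partition can actually see the GPU at all; something along these lines should do it (the partition and gres flags are just examples, match them to your cluster's setup):

    # submit a trivial GPU job to the partition the Extract job runs on and
    # check that the device shows up
    srun -p <partition> --gres=gpu:1 nvidia-smi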

Hi @nfrasser, sorry for the late reply.
Here is the queue_sub_script.sh:

#!/bin/bash
#SBATCH --job-name cryosparc_P21_J49
#SBATCH -c 2
#SBATCH --gres=gpu:1
#SBATCH -p grid
#SBATCH --reservation=KI
#SBATCH -t 1-00
#SBATCH --mem=8192
mkdir -p /data1/cryosparc/cryosparc_cache
export MKL_DEBUG_CPU_TYPE=5
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=/bin:/usr/bin:/usr/local/bin:/usr/local/cuda/bin:$PATH
singularity exec --nv /ceph/sys/singularity/cryosparc_worker.sif /opt/cryosparc2_worker/bin/cryosparcw run --project P21 --job J49 --master_hostname rm --master_command_core_port 39002 > /ceph/grid/data/cs_extensive_workflow/P21/J49/job.log 2>&1

I hope it helps!