Hey guys,
We have been seeing some inconsistent behaviour with cryoSPARC in cluster mode. When SLURM submits the job, it sometimes works and sometimes fails (see error output below). When we run the same job on a specific lane (without SLURM), it completes with no errors. Even more puzzling is the error output itself: it complains about EER fractions even though EER isn’t the relevant data type here (EER was never used for this dataset).
A few details:
Latest version of cryoSPARC (4.6.2).
No glaring errors in the output of cryosparcm log command_core or cryosparcm log command_rtp.

Error output from a failed SLURM run:
Traceback (most recent call last):
File "/mnt/jobfs/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2304, in run_with_except_hook
run_old(*args, **kw)
File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 136, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 137, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
File "cryosparc_master/cryosparc_compute/jobs/class2D/newrun.py", line 620, in cryosparc_master.cryosparc_compute.jobs.class2D.newrun.class2D_engine_run.work
File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 566, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.read_image_data
File "/mnt/jobfs/cryosparc/cryosparc_worker/cryosparc_compute/ioengine/cmdbuf.py", line 87, in wait
raise IOError('\n\n'.join(errs))
OSError: I/O error, mrc_readmic (1) line 914: Invalid argument
The requested frame/particle cannot be accessed. The file may be corrupt, or there may be a mismatch between the file and its associated metadata (i.e. cryosparc .cs file).
I/O request details:
filename: /mnt/jobfs/ssd/instance_m3q000.massive.org.au:39001/links/P32-J253-1742217419/d6345b320d56a00ab9278ffb9825346a39389d46.mrcs
data type: 0x10
frames: [211:212]
eer upsample factor: 2
eer number of fractions: 40
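
In case it helps with diagnosis, here is a minimal sketch of a check on the cached stack named in the error (assuming a standard little-endian MRC2014 header and that the cache path is still readable outside the job):

import struct

# Path copied from the I/O error above (adjust if the cache has since been cleared).
stack_path = "/mnt/jobfs/ssd/instance_m3q000.massive.org.au:39001/links/P32-J253-1742217419/d6345b320d56a00ab9278ffb9825346a39389d46.mrcs"

with open(stack_path, "rb") as f:
    header = f.read(1024)  # the standard MRC2014 header is 1024 bytes

# Header words 1-4: NX, NY, NZ, MODE. For a particle stack, NZ is the
# number of images in the stack (assumes little-endian byte order).
nx, ny, nz, mode = struct.unpack("<4i", header[:16])
print(f"box {nx} x {ny}, images in stack: {nz}, mode: {mode}")

# The failing request was frames [211:212], i.e. zero-based image 211,
# so the stack needs at least 212 images for that read to succeed.
if nz < 212:
    print("stack is shorter than the requested index -> stale/truncated cache copy or .cs metadata mismatch?")

If NZ comes back smaller than 212 only for the cached copy (and not for the original stack in the project directory), that would point at the SSD cache rather than the .cs metadata.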