No jobs can continue after the node restart

PeterXTH · September 23, 2020, 9:00pm

Hi all,
Does anyone know how to deal with the following error? Since the node crashed, I cannot get any job to run. All of them, whether new jobs or failed jobs that I cleared and re-ran, show the same error message.
Thanks,
Tinghai

Error message:
Traceback (most recent call last):
  File "cryosparc2_master/cryosparc2_compute/run.py", line 82, in cryosparc2_compute.run.main
  File "cryosparc2_compute/jobs/curate_exposures/run.py", line 135, in run
    rc.log('Loaded info for %d exposures' % (len(micrographs_dset)))
TypeError: object of type 'NoneType' has no len()

nfrasser · October 13, 2020, 7:03pm

Hi Tinghai, were you able to figure this out? This is happening because the “exposures” Input Group is not available for the “Curate Exposures” job. This could happen in one of the following cases:

- The “exposures” input group was not specified when you created the job in the Job Builder
- The parent Import or Motion Correction job that provided the exposures was cleared, deleted or became corrupted during the crash
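
To illustrate why a missing input group produces this particular message, here is a minimal sketch in plain Python. It is not cryoSPARC's actual code: `load_input_group`, `count_exposures` and the `connected_inputs` dictionary are hypothetical names made up for this example. It assumes only that a loader returns None when the requested group is not connected, which is consistent with the traceback above.

```python
def load_input_group(connected_inputs, name):
    """Hypothetical loader: returns None when the group is not connected."""
    return connected_inputs.get(name)


def count_exposures(connected_inputs):
    """Return the number of exposures, failing early with a readable message."""
    micrographs_dset = load_input_group(connected_inputs, "exposures")
    if micrographs_dset is None:
        # Guard that turns the cryptic TypeError into an actionable error.
        raise ValueError(
            'The "exposures" input group is missing; '
            "reconnect it to a completed parent job."
        )
    return len(micrographs_dset)


# Without the guard, calling len() on the missing group reproduces the error.
try:
    len(load_input_group({}, "exposures"))
except TypeError as err:
    print(err)  # object of type 'NoneType' has no len()
```

In other words, the TypeError is only a symptom: the job received no exposures at all, so the fix is to restore the input connection rather than to change anything in the job itself.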

Try the following:

1. If the parent job is still available, try to re-run it or re-import the exposures.
2. Clear the failing “Curate Exposures” job and enter the “Job Builder” mode
3. Remove the existing “exposures” input group (if any)
4. Re-connect the “exposures” from the job in Step 1

Let me know if you run into any trouble with this,

Nick

PeterXTH · October 14, 2020, 2:36am

I can only re-run the jobs from the very beginning, starting with importing the exposures. I am not sure whether the crash corrupted the parent job, since we have identified a memory malfunction. We are trying to replace the faulty memory.
Thanks,
Tinghai