How to manually mark job as complete when the GUI won't let you

mg643 · May 24, 2021, 1:07pm

Bit of an interesting one here! I’ve been running Patch CTF estimation on 6200 micrographs, and the network at our university keeps crashing. For jobs like 2D classification this is usually no issue, as the job keeps going despite being reported as failed. For CTF and a few others like Motion Correction, however, the entire job crashes when the error ‘Job is unresponsive - no heartbeat received in 30 seconds’ pops up! (It’d be really nice if you could fix this so the job carries on when the connection resumes.)

For CTF, what I’ve been doing is marking the job as complete when the failure occurs, and it was working fine, as can be seen in the screenshot (you can’t run the next job until you mark the current failed one as complete). However, I checked cryoSPARC today and 3 of the 5 CTF jobs are now marked as failed again, despite me manually marking them as complete yesterday so I could carry on. So my question is: how can I set them back to complete? I can’t continue CTF estimation on the remaining exposures until the failed jobs are seen as complete, because cryoSPARC won’t recognise their outputs (instead it just queues the next job, as it has done for J130). Downloading the incomplete exposures is pointless too, as inputs for CTF need to come from another job (as far as I’m aware?).

At the moment I’m just carrying on and running a Filament Tracer on the first 2 CTF jobs, which haven’t been changed to failed, but I really would like to be able to use all of my micrographs and not just a subset.

stephan · May 25, 2021, 3:04pm

Hi @mg643,

You can try setting the CRYOSPARC_HEARTBEAT_SECONDS parameter higher than the default of 60 seconds (maybe 180 or 300), so the server waits longer for the connection to resume before marking the job as failed.
You can do this by adding the line:
export CRYOSPARC_HEARTBEAT_SECONDS=300
to the end of cryosparc_master/config.sh, then restarting cryoSPARC.
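
As a rough sketch of the steps above, the edit could be scripted like this. The helper name `set_heartbeat` is hypothetical, and the path in the usage comment assumes cryosparc_master sits in your home directory, so adjust it to your install:

```shell
# Hypothetical helper: set CRYOSPARC_HEARTBEAT_SECONDS in a config.sh file.
# Usage: set_heartbeat <path to config.sh> <seconds>
set_heartbeat() {
    config_file="$1"
    seconds="$2"
    # Keep a backup copy before touching the file.
    cp "$config_file" "$config_file.bak"
    # Drop any earlier setting so the file ends up with a single value.
    grep -v '^export CRYOSPARC_HEARTBEAT_SECONDS=' "$config_file.bak" > "$config_file" || true
    # Append the new value at the end of the file.
    echo "export CRYOSPARC_HEARTBEAT_SECONDS=$seconds" >> "$config_file"
}

# Example (assumed install location), followed by a restart:
#   set_heartbeat "$HOME/cryosparc_master/config.sh" 300
#   cryosparcm restart
```

The new value only takes effect after the restart.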

Also, to force a job to start without waiting for previous jobs to complete, you can run the command line function:
cryosparcm cli "enqueue_job(project_uid='P3', job_uid='J130', lane='<cryosparc_lane_name>', no_check_inputs_ready=True)"
This method isn’t reliable, though: if the previous job is actually still running, there’s a chance that it will overwrite some of the files that the current job is trying to read, causing it to fail.
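
If you need to do this for several jobs, a small wrapper that only prints the command can help you check the IDs before anything is queued. This is just a sketch: `build_force_enqueue` is a hypothetical helper, and the lane name `default` in the usage comment is a placeholder for your own lane.

```shell
# Hypothetical helper: print (not run) the forced-enqueue command
# for a given project UID, job UID, and lane name.
build_force_enqueue() {
    project="$1"
    job="$2"
    lane="$3"
    printf "cryosparcm cli \"enqueue_job(project_uid='%s', job_uid='%s', lane='%s', no_check_inputs_ready=True)\"\n" \
        "$project" "$job" "$lane"
}

# Example: review the printed command, then paste it into the shell.
#   build_force_enqueue P3 J130 default
```

Printing rather than executing leaves you a chance to confirm that the upstream job has really stopped before the downstream one starts reading its files.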