Patch motion correction

I am running a Patch motion correction job and it seems to be stalling. I imported 5059 movies, and the top line of the overview file says that 5050 of 5059 movies have been processed; below that, 7 more are shown as processed. However, no new files have been output for well over 12 hours. The log file shows "sending heartbeat" at this point.

In the output box at the top right, micrographs shows Count: 0. Is there a way that I can stop the job and just take the files that have been processed to this point and move them into the Patch CTF estimation job?

Also, the job still seems to be running. The motioncorrected directory looks like it has all of the .tif files converted to aligned .mrc and .npy files, but there are no .cs outputs yet, 12 hours after the last processed file.

Hi @George,
This issue happens due to a silent failure in any one of the worker GPUs in your patch motion job. In versions prior to v2.12, that error was not caught and so the other GPUs would continue processing but the job would stall at the end waiting for the dead worker to complete.
This condition has been fixed in v2.12.4 (though some users have reported a different issue that we cannot yet reproduce… under investigation)
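
To illustrate the stall described above, here is a minimal sketch using plain Python threads. It is not cryoSPARC's actual code: `worker`, `run_job`, `bad_item` and `report_errors` are hypothetical names made up for this example. One worker hits a simulated failure; if it exits without reporting, the remaining workers finish everything else, but the main loop keeps waiting for a result that never arrives. A timeout stands in for the 12-hour wait.

```python
import queue
import threading


def worker(tasks, results, bad_item, report_errors):
    """Pull items until a None sentinel arrives; bad_item simulates a GPU failure."""
    while True:
        item = tasks.get()
        if item is None:
            return
        try:
            if item == bad_item:
                raise RuntimeError("simulated GPU failure")
            results.put((item, "ok"))
        except RuntimeError as exc:
            if report_errors:
                # Failure is reported, so the main loop can account for the item.
                results.put((item, "failed: %s" % exc))
                continue
            # Silent death: the worker exits and nothing is ever reported for item.
            return


def run_job(items, n_workers=2, bad_item=None, report_errors=True, timeout=0.5):
    """Return (collected_results, stalled). The timeout stands in for a long stall."""
    tasks, results = queue.Queue(), queue.Queue()
    for item in items:
        tasks.put(item)
    for _ in range(n_workers):
        tasks.put(None)  # one sentinel per worker
    for _ in range(n_workers):
        threading.Thread(
            target=worker,
            args=(tasks, results, bad_item, report_errors),
            daemon=True,
        ).start()

    collected = []
    for _ in items:
        try:
            collected.append(results.get(timeout=timeout))
        except queue.Empty:
            return collected, True  # still waiting on the dead worker
    return collected, False
```

Calling `run_job(list(range(10)), bad_item=3, report_errors=False)` collects 9 results and then reports a stall, which mirrors the "5050 of 5059" pattern; with `report_errors=True` all 10 items are accounted for and the job can finish.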

Thanks, I will try again with the newest version.

I have installed the new version and now it runs fine. Thanks.

I am running on v2.15.1-live_privatebeta and have run into the same problem. 5315 of 5316 micrographs processed and then the job stalls.

Hi
I am facing an issue with Patch MotCorr. It stops in the middle of the run and shows the following error. Could you please help me out?

Traceback (most recent call last):
  File "cryosparc2_compute/jobs/runcommon.py", line 1490, in run_with_except_hook
    run_old(*args, **kw)
  File "/mnt/raid0/bfl-group/Saif/cryosparc3/cryosparc2_worker/deps/anaconda/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "cryosparc2_compute/jobs/pipeline.py", line 153, in thread_work
    work = processor.process(item)
  File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 118, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 123, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc2_compute/blobio/mrc.py", line 114, in read_mrc
    data = read_mrc_data(file_obj, header, start_page, end_page, out)
  File "cryosparc2_compute/blobio/mrc.py", line 77, in read_mrc_data
    data = n.fromfile(file_obj, dtype=dtype, count= num_pages * ny * nx).reshape(num_pages, ny, nx)
ValueError: total size of new array must be unchanged

Outputting partial results now…

Traceback (most recent call last):
  File "cryosparc2_worker/cryosparc2_compute/run.py", line 78, in cryosparc2_compute.run.main
  File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 312, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi
AssertionError: Child process with PID 15784 has terminated unexpectedly!

Many thanks in advance
Saif

Hi @Saif,
Which version of cryoSPARC are you running?

Hi, thanks for your response.

It is v2.12.4.

Saif
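
A note on the ValueError in the first traceback: NumPy's fromfile returns however many items it could actually read, so if the file holds fewer than num_pages * ny * nx values, the reshape that follows fails with exactly this message. One possible cause (not confirmed in this thread) is an MRC file that is truncated or was still being written. The sketch below is a rough way to check a file's size against its own header. It assumes a little-endian MRC2014-style header (1024 bytes, with nx, ny, nz and mode in the first four 32-bit words and the extended header size at byte offset 92); `check_mrc_size` is a hypothetical helper written for this example, not part of cryoSPARC.

```python
import os
import struct

# Bytes per pixel for common MRC modes (assumption: only these modes are handled).
MODE_BYTES = {0: 1, 1: 2, 2: 4, 6: 2, 12: 2}


def check_mrc_size(path):
    """Return (ok, expected_bytes, actual_bytes) for an MRC file.

    ok is False when the file is smaller than its header says it should be.
    """
    actual = os.path.getsize(path)
    with open(path, "rb") as f:
        header = f.read(1024)
    if len(header) < 1024:
        # Even the fixed-size header is incomplete.
        return False, 1024, actual

    nx, ny, nz, mode = struct.unpack("<4i", header[:16])
    nsymbt = struct.unpack("<i", header[92:96])[0]  # extended header size in bytes
    if mode not in MODE_BYTES:
        raise ValueError("unhandled MRC mode: %d" % mode)

    expected = 1024 + nsymbt + nx * ny * nz * MODE_BYTES[mode]
    return actual >= expected, expected, actual
```

Running this over the input files (for example, `check_mrc_size("movie_0001.mrc")`, where the file name is only a placeholder) and looking for any result with ok equal to False may help narrow down whether one particular file is the trigger.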