Hi everyone,
I am processing a big dataset (33k movies) collected on a Falcon 4i. When I run Patch Motion Correction, I repeatedly hit the same problem: the job either fails half-way, or completes with only half of the micrographs processed.
I have tried restarting cryosparcm and updating to the latest version, but the problem persists.
The jobs are running on a cluster.
I would appreciate any advice on this matter.
This is the error message reported for the failed micrographs:
Error occurred while processing J53/imported/013886693208130898513_FoilHole_30446358_Data_29377140_29377142_20230430_104130_EER.eer
Traceback (most recent call last):
File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 60, in exec
return self.process(item)
File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 324, in cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
AssertionError: Job is not in running state - worker thread with PID 3463378 terminating self.
Marking J53/imported/013886693208130898513_FoilHole_30446358_Data_29377140_29377142_20230430_104130_EER.eer as incomplete and continuing…
And here are some messages from the job log:
================= CRYOSPARCW ======= 2023-05-25 03:01:24.188944 =========
Project P6 Job J73
Master puhti-login12.bullx Port 40402
===========================================================================
========= monitor process now starting main process at 2023-05-25 03:01:24.188997
MAINPROCESS PID 3463303
========= monitor process now waiting for main process
MAIN PID 3463303
motioncorrection.run_patch cryosparc_compute.jobs.jobregister
========= sending heartbeat at 2023-05-25 03:02:15.437499
Running job on hostname %s 2006450-gpu 2d
Allocated Resources : {'fixed': {'SSD': False}, 'hostname': '2006450-gpu 2d', 'lane': '2006450-gpu 2d', 'lane_type': 'cluster', 'license': True, 'licenses_acquired': 1, 'slots': {'CPU': [0, 1, 2, 3, 4, 5], 'GPU': [0], 'RAM': [0, 1]}, 'target': {'cache_path': '/run/nvme/job_15912870/data', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': ['command'], 'custom_vars': {}, 'desc': None, 'hostname': '2006450-gpu 2d', 'lane': '2006450-gpu 2d', 'name': '2006450-gpu 2d', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n \n#SBATCH --account=Project_2006450\n#SBATCH --job-name=cryosparc_{{ project_uid }}{{ job_uid }}\n#SBATCH --time=2-00:00:00\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:v100:{{ num_gpu }},nvme:3600\n#SBATCH -p gpu\n#SBATCH --mem=0\n#SBATCH -o {{ job_dir_abs }}/cryosparc{{ project_uid }}{{ job_uid }}.out\n#SBATCH -e {{ job_dir_abs }}/cryosparc{{ project_uid }}_{{ job_uid }}.err\n \necho Local scratch directory path is: $LOCAL_SCRATCH\n \nexport CUDA_VISIBLE_DEVICES=0,1,2,3\nexport CRYOSPARC_SSD_PATH=$LOCAL_SCRATCH\n \n{{ run_cmd }}\n', 'send_cmd_tpl': '{{ command }}', 'title': '2006450-gpu 2d', 'tpl_vars': ['job_dir_abs', 'num_cpu', 'run_cmd', 'job_uid', 'project_uid', 'cluster_job_id', 'num_gpu', 'command'], 'type': 'cluster', 'worker_bin_path': '/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_worker/bin/cryosparcw'}}
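In case it is easier to read, this is the cluster script template from the allocation dump above (the script_tpl value, with the escaped newlines expanded; content unchanged):

```shell
#!/usr/bin/env bash

#SBATCH --account=Project_2006450
#SBATCH --job-name=cryosparc_{{ project_uid }}{{ job_uid }}
#SBATCH --time=2-00:00:00
#SBATCH -n {{ num_cpu }}
#SBATCH --gres=gpu:v100:{{ num_gpu }},nvme:3600
#SBATCH -p gpu
#SBATCH --mem=0
#SBATCH -o {{ job_dir_abs }}/cryosparc{{ project_uid }}{{ job_uid }}.out
#SBATCH -e {{ job_dir_abs }}/cryosparc{{ project_uid }}_{{ job_uid }}.err

echo Local scratch directory path is: $LOCAL_SCRATCH

export CUDA_VISIBLE_DEVICES=0,1,2,3
export CRYOSPARC_SSD_PATH=$LOCAL_SCRATCH

{{ run_cmd }}
```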
ElectronCountedFramesDecompressor: reading using TIFF-EER mode.
TIFFReadDirectory: Unknown field with tag 65002 (0xfdea) encountered
========= sending heartbeat at 2023-05-25 12:05:05.792681
Unknown field with tag 65002 (0xfdea) encountered
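To rule out corrupt or truncated movie files on my side, I was thinking of a quick header check along these lines. This is just a sketch using only the Python standard library: EER files are TIFF/BigTIFF containers, so a bad magic number would point at a damaged file rather than a cryoSPARC problem. The directory J53/imported is taken from the error message above, and tiff_kind is simply a name I made up:

```python
import struct
from pathlib import Path

def tiff_kind(path):
    """Return 'tiff', 'bigtiff', or None based on the file's magic bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if len(head) < 4 or head[:2] not in (b"II", b"MM"):
        return None
    # Bytes 2-3 hold the TIFF version: 42 = classic TIFF, 43 = BigTIFF.
    fmt = "<H" if head[:2] == b"II" else ">H"
    version = struct.unpack(fmt, head[2:4])[0]
    return {42: "tiff", 43: "bigtiff"}.get(version)

# Report any imported EER file whose container header does not parse.
for eer in sorted(Path("J53/imported").glob("*.eer")):
    kind = tiff_kind(eer)
    if kind is None:
        print(f"{eer}: bad or missing TIFF header")
```

Note that this only validates the container header, not the per-frame data, but it is cheap enough to run on all 33k movies.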