Reference based motion correction error ====== Job process terminated abnormally

Hello @wtempel ,

Following up on our discussion about the Reference-based motion correction error I am having for one of my projects - I have run the test using Extensive validation with the T20s subset, as you suggested.
Interestingly, it finished OK with 4 GPUs and the default memory settings. Shall I send you its report?
I am still not sure why my actual project keeps failing. Can we somehow derive the reason by comparing the reports of the completed (T20s test) and failed (actual) reference-based motion correction jobs?

Thank you.

Kind regards,
Dmitry

Please can you inspect the command core log for errors related to this event:

cryosparcm log command_core
cryosparcm filterlog command_core -l ERROR

and let us know what you find.
The timestamps may help you narrow down the search (see guide).
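
A minimal sketch of narrowing the log to a time window, assuming the log lines start with the "YYYY-MM-DD HH:MM:SS,mmm" prefix shown later in this thread. The function name log_window is made up for illustration, and the log path in the usage comment depends on the installation. Lines without a timestamp prefix (for example traceback continuation lines) are dropped by this filter, so treat it as a first pass only.

```shell
# Print log lines whose timestamp falls between start and end.
# Timestamps in this format sort lexicographically, so awk's plain
# string comparison is enough; no date parsing is needed.
log_window() {
    # $1 = log file, $2 = start timestamp, $3 = end timestamp
    awk -v start="$2" -v end="$3" '
        { ts = $1 " " $2 }          # date field plus time field
        ts >= start && ts <= end    # default action: print the line
    ' "$1"
}

# Hypothetical usage (adjust the path to your installation):
# log_window /path/to/cryosparc_master/run/command_core.log \
#     "2024-01-30 18:30:00" "2024-01-30 18:40:00"
```

Because the comparison is textual, an end value without milliseconds excludes lines logged within that exact second; pick an end time slightly after the event.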

The command cryosparcm filterlog command_core -l ERROR shows nothing.

The output of cryosparcm log command_core is below:

cryosparc_user@cryoem1:~$ cryosparcm filterlog command_core -l ERROR

cryosparc_user@cryoem1:~$ cryosparcm log command_core
2024-01-30 16:31:29,032 dump_job_database INFO | Updating job manifest…
2024-01-30 16:31:29,033 dump_job_database INFO | Done. Updated in 0.00s
2024-01-30 16:31:29,033 dump_job_database INFO | Exported P6 J43 in 0.02s
2024-01-30 16:31:29,033 run INFO | Completed task in 0.022545814514160156 seconds
2024-01-30 16:31:29,033 run INFO | Received task layout_tree with 4 args and 0 kwargs
2024-01-30 16:31:29,054 run INFO | Completed task in 0.020574331283569336 seconds
2024-01-30 16:31:29,054 run INFO | Received task layout_tree with 4 args and 0 kwargs
2024-01-30 16:31:29,074 run INFO | Completed task in 0.019555091857910156 seconds
2024-01-30 17:12:09,751 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-30 18:12:10,511 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-30 18:35:20,282 dump_job_database INFO | Request to export P6 J44
2024-01-30 18:35:20,282 dump_job_database INFO | Exporting job to /data/dmitry/cryosparc/CS-test-motion-cor/J44
2024-01-30 18:35:20,283 dump_job_database INFO | Exporting all of job’s images in the database to /data/dmitry/cryosparc/CS-test-motion-cor/J44/gridfs_data…
2024-01-30 18:35:20,434 dump_job_database INFO | Writing 212 database images to /data/dmitry/cryosparc/CS-test-motion-cor/J44/gridfs_data/gridfsdata_0
2024-01-30 18:35:20,434 dump_job_database INFO | Done. Exported 212 images in 0.15s
2024-01-30 18:35:20,434 dump_job_database INFO | Exporting all job’s streamlog events…
2024-01-30 18:35:20,438 dump_job_database INFO | Done. Exported 1 files in 0.00s
2024-01-30 18:35:20,438 dump_job_database INFO | Exporting job metafile…
2024-01-30 18:35:20,439 dump_job_database INFO | Creating .csg file for particles_0
2024-01-30 18:35:20,441 dump_job_database INFO | Creating .csg file for hyperparameters
2024-01-30 18:35:20,443 dump_job_database INFO | Done. Exported in 0.01s
2024-01-30 18:35:20,443 dump_job_database INFO | Updating job manifest…
2024-01-30 18:35:20,444 dump_job_database INFO | Done. Updated in 0.00s
2024-01-30 18:35:20,444 dump_job_database INFO | Exported P6 J44 in 0.16s
2024-01-30 18:35:20,455 set_job_status INFO | Status changed for P6.J44 from running to completed
2024-01-30 18:35:20,457 app_stats_refresh INFO | Calling app stats refresh url http://cryoem1.itqb.unl.pt:39000/api/actions/stats/refresh_job for project_uid P6, workspace_uid None, job_uid J44 with body {'projectUid': 'P6', 'jobUid': 'J44'}
2024-01-30 18:35:20,460 app_stats_refresh INFO | code 200, text {"success":true}
2024-01-30 19:12:11,440 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-30 20:12:12,426 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-30 21:12:12,732 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-30 22:12:13,439 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-30 23:12:13,662 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 00:12:14,310 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 01:12:15,110 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 02:12:15,706 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 03:12:16,547 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 04:12:17,365 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 05:12:18,146 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 06:12:18,997 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 07:12:19,296 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 08:12:20,134 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 09:12:20,834 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 10:12:21,510 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 11:12:22,098 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 12:12:22,755 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 13:12:23,455 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 14:12:24,245 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
2024-01-31 15:12:25,246 background_worker INFO | License does not have telemetry enabled; will re-check license in 1 hour.
Waiting for data… (interrupt to abort)

Shall I restart the failing Reference based motion correction job while this command is running: cryosparcm log command_core?

thank you

Even though the “Unable to get log” issue may be (loosely) related to the problems you experienced with reference based motion correction, the issues should probably be investigated separately. The command

cryosparcm log command_core

initially only shows the tail of the log. If you press an interrupt key sequence, like Ctrl+C, you can navigate the log (how?).
Log rotation may have occurred after you observed Unable to get log. You may be able to identify the relevant log file by running the command

grep -i "unable to get log" /path/to/cryosparc_master/run/command_core.log*

Regarding "initially only shows the tail of the log. If you press an interrupt key sequence, like Ctrl+C, you can navigate the log (how?)":

The log seems to be very long: after pressing Ctrl+C I can scroll up with the mouse wheel, but I never reach the beginning of the document.

As for Unable to get log: this file no longer seems to be there.

Given that cryosparc_master/ is your working directory, omitting the leading / from the command may help:

grep -i "unable to get log" run/command_core.log*

(explanation)

hello @wtempel ,

I tried the command

grep -i "unable to get log" run/command_core.log*

but it shows nothing.

Just to be sure - I have copied this file and sent you the link for it directly.

Please let me know what you think.

Kind regards,
Dmitry

Dear @wtempel and colleagues,

After inspecting the command_core.log file, I did not find the “unable to get log” error.
Maybe it was a temporary issue, since it was later possible to download the file.

My issue with the Reference based motion correction error for my dataset remains, but its behaviour changed slightly after I changed some input parameters.

So, I used the and with the default settings and 3 GPU. It shows me the regular error - ====== Job process terminated abnormally.

Now it died at checkpoint 15 rather than 19. This time, however, I set Hyperparameter search thoroughness to Fast instead of Extensive.

Does anyone have any clue?
Shall I try to run this protocol just with 1 movie?

Kind regards,
Dmitry

[Edited: corrected command typo rail [sic] → tail]
@Dmitry (With apologies in case you already mentioned those items in this long thread)
Please can you describe

  • movie raw pixel size
  • movie frame dimensions
  • in case of EER data
    • number of fractions (during import)
    • upsampling factor (during import)
  • box size
  • average number of particles per exposure
  • output of the command (with correct project and job UIDs)
    cryosparcm joblog PX JY | grep -v "Unknown field with tag 65002" | tail -n 50
    
hello @wtempel

Please find the values below.

Please can you describe
  • movie raw pixel size: 0.72948 Å/pix
  • movie frame dimensions: 4096x4096
  • in case of EER data
    • number of fractions (during import): I was using the default, 40
    • upsampling factor (during import): 1
  • box size: 384
  • average number of particles per exposure: 150
  • output of the command (with correct project and job UIDs)
    cryosparcm joblog PX JY | grep -v “Unknown field with tag 65002” | rail -n 50

cryosparc_user@cryoem1:~/cryosparc/cryosparc_master/bin$ cryosparcm joblog P2 J161 | grep -v “Unknown field with tag 65002” | rail -n 50

Command ‘rail’ not found, did you mean:
command ‘sail’ from deb bsdgames (2.17-29)
command ‘rmail’ from deb exim4-daemon-heavy (4.95-4ubuntu2.4)
command ‘rmail’ from deb exim4-daemon-light (4.95-4ubuntu2.4)
command ‘rmail’ from deb postfix (3.6.4-1ubuntu1.1)
command ‘rmail’ from deb courier-mta (1.0.16-3build3)
command ‘rmail’ from deb masqmail (0.3.4-1build1)
command ‘rmail’ from deb rmail (8.15.2-22ubuntu3)
command ‘rtail’ from deb ruby-file-tail (1.2.0-1)
command ‘tail’ from deb coreutils (8.32-4.1ubuntu1)
command ‘rails’ from deb ruby-railties (2:6.1.4.1+dfsg-8ubuntu2)
command ‘rain’ from deb bsdgames (2.17-29)
command ‘mail’ from deb mailutils (1:3.14-1)
Try: apt install

cryosparc_user@cryoem1:~/cryosparc/cryosparc_worker/bin$ cryosparcm joblog P2 J161 | grep -v “Unknown field with tag 65002” | rail -n 50
Command ‘rail’ not found, did you mean:
command ‘mail’ from deb mailutils (1:3.14-1)
command ‘sail’ from deb bsdgames (2.17-29)
command ‘tail’ from deb coreutils (8.32-4.1ubuntu1)
command ‘rmail’ from deb exim4-daemon-heavy (4.95-4ubuntu2.4)
command ‘rmail’ from deb exim4-daemon-light (4.95-4ubuntu2.4)
command ‘rmail’ from deb postfix (3.6.4-1ubuntu1.1)
command ‘rmail’ from deb courier-mta (1.0.16-3build3)
command ‘rmail’ from deb masqmail (0.3.4-1build1)
command ‘rmail’ from deb rmail (8.15.2-22ubuntu3)
command ‘rtail’ from deb ruby-file-tail (1.2.0-1)
command ‘rails’ from deb ruby-railties (2:6.1.4.1+dfsg-8ubuntu2)
command ‘rain’ from deb bsdgames (2.17-29)
Try: apt install

Info from one movie

Sincerely,
Dmitry

I apologize for including a typo in the command; it should have used tail. Please can you try:

cryosparcm joblog P2 J161 | grep -v "Unknown field with tag 65002" | tail -n 50

[Edited: corrected command quoting]

thank you @wtempel for the correction.

Do I understand correctly that I have to execute this command in the
cryosparc/cryosparc_worker/bin folder?

regards,
Dmitry

This particular command does not need to be executed inside a specific working directory, as long as cryosparcm is in your $PATH. If cryosparcm is not in your $PATH, you may run, based on an earlier screenshot:

~/cryosparc/cryosparc_master/bin/cryosparcm joblog P2 J161 | grep -v "Unknown field with tag 65002" | tail -n 50

Note that an earlier version of the command, in addition to the aforementioned typo, included incorrect quote characters.
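
To illustrate why the quote characters matter, here is a small sketch. The helper count_args is made up for this example and only reports how many arguments the shell hands to a command. Straight double quotes are shell syntax and group the phrase into one argument; typographic (curly) quotes are ordinary characters to the shell, so the phrase is split at every space and grep would receive a different pattern plus several stray file names.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() {
    echo "$#"
}

# Straight quotes: the whole phrase is a single argument (prints 1).
count_args "Unknown field with tag 65002"

# Curly quotes: not shell syntax, so the phrase splits on spaces (prints 5).
count_args “Unknown field with tag 65002”
```

Commands copied from a web page or word processor are a common source of curly quotes, so retyping the quotes in the terminal is a quick way to rule this out.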

ok.

So here is the output of the command above:

cryosparc_user@cryoem1:~/cryosparc$ ~/cryosparc/cryosparc_master/bin/cryosparcm joblog P2 J161 | grep -v "Unknown field with tag 65002" | tail -n 50
grep: (standard input): binary file matches
27.455161189 — TOTAL —

refmotion worker 3 (NVIDIA RTX 4500 Ada Generation)
Min BFGS iterations: 0
Max BFGS iterations: 47
AVG TIME (s) SECTION
0.000000000 Cache IO
0.065241929 Optimize trajectories
0.002401450 Compute cross-validation
0.000132663 Save trajectory
0.067776042 — TOTAL (Cros-Val) —

ElectronCountedFramesDecompressor: reading using TIFF-EER mode.

refmotion worker 0 (NVIDIA RTX 4500 Ada Generation)
scale (alpha): 10.362117
noise model (sigma2): 41.347626
TIME (s) SECTION
0.000037261 sanity
20.308419294 read movie
0.029121457 get gain, defects
0.164850908 read bg
0.014893193 read rigid
0.583974543 prep_movie
0.931565666 extract from frames
0.013961920 extract from refs
0.000151276 adj
0.000000020 bfactor
0.327736996 rigid motion correct
0.002255391 get noise, scale
22.376967924 — TOTAL —

refmotion worker 4 (NVIDIA RTX 4500 Ada Generation)
Min BFGS iterations: 0
Max BFGS iterations: 72
AVG TIME (s) SECTION
0.000000000 Cache IO
0.236995906 Optimize trajectories
0.008456548 Compute cross-validation
0.000220522 Save trajectory
0.245672977 — TOTAL (Cros-Val) —

ElectronCountedFramesDecompressor: reading using TIFF-EER mode.

/home/cryosparc_user/cryosparc/cryosparc_worker/cryosparc_compute/jobs/motioncorrection/mic_utils.py:95: NumbaDeprecationWarning: The ‘nopython’ keyword argument was not supplied to the ‘numba.jit’ decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See Deprecation Notices — Numba 0+untagged.2291.gc6da269.dirty documentation for details.
@jit(nogil=True)
/home/cryosparc_user/cryosparc/cryosparc_worker/cryosparc_compute/micrographs.py:563: NumbaDeprecationWarning: The ‘nopython’ keyword argument was not supplied to the ‘numba.jit’ decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See Deprecation Notices — Numba 0+untagged.2291.gc6da269.dirty documentation for details.
def contrast_normalization(arr_bin, tile_size = 128):
cryosparc_user@cryoem1:~/cryosparc$

Kind regards,
Dmitry
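
A note on the "grep: (standard input): binary file matches" message in the output above: GNU grep prints it when it detects non-text bytes in its input, and by default it may suppress output from that point on, so the 50 lines shown are not necessarily the true end of the job log. A minimal sketch of a workaround, assuming GNU grep, is to add the -a (--text) option so the input is always treated as text. The function name filter_joblog is made up for illustration.

```shell
# Drop the TIFF tag warnings while forcing grep to treat input as text.
# -a (--text) is a GNU grep option; without it grep may stop printing
# once it meets binary data and only report "binary file matches".
filter_joblog() {
    grep -a -v "Unknown field with tag 65002"
}

# Usage with the project and job UIDs from this thread:
# cryosparcm joblog P2 J161 | filter_joblog | tail -n 50
```

If the output then ends with different lines than before, those later lines are the ones worth inspecting for the cause of the abnormal termination.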

hello @wtempel and all,

I keep thinking about my issue.

Could the problem come from running reference-based motion correction on particles that originate from different datasets (please see the scheme below)? If so, how do you overcome it?
a) Run motion correction separately for each dataset.
b) Extract the corresponding part of the particles from each dataset separately.
c) Combine/align and rerun Reference-based motion correction.

scheme of current data analysis

Thank you.

Sincerely,
Dmitry

Hi, I have exactly the same issue at my site.
I have had successful runs of reference based motion correction before, but for some datasets, even with exactly the same settings, the job keeps getting stuck.

Hello @jybjybjjyb ,
I am now testing the steps I described above.

I will report later whether my idea worked out :slight_smile:

Kind regards,
Dmitry

Hi @Dmitry, sorry you’re still having trouble with this dataset. When you get to checkpoint #19, is that the start of FCC computation in your case? Or is the hyperparameter search still ongoing at that point?

Hello Harris, @hsnyder ,

I believe that is the one below.
The process dies at roughly 20-30%.
I will send you the report of this process in a direct message.

Thank you.

Sincerely,
Dmitry

@jybjybjjyb, an update: my idea of processing each dataset separately with reference based motion correction and then combining the results did not work.

Dear all,

I was also wondering whether there is a tool to check/verify the EER movies,
just to be sure that no corrupted movie is causing this behaviour.

Kind regards,
Dmitry
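
As a very rough first pass on the question above, the sketch below lists EER files that are empty or do not start with a TIFF signature. It rests on the assumption that EER movies are little-endian TIFF or BigTIFF containers, which the "reading using TIFF-EER mode" line in the job log suggests. It is not a dedicated validation tool: damage inside the compressed frame data would go unnoticed. The function names are made up for illustration.

```shell
# Return 0 if the file starts with a little-endian TIFF or BigTIFF signature.
has_tiff_magic() {
    # First four bytes as hex: 49492a00 = classic TIFF, 49492b00 = BigTIFF.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "49492a00" ] || [ "$magic" = "49492b00" ]
}

# Print every *.eer file in directory $1 that is empty or lacks the signature.
list_suspect_eer() {
    for f in "$1"/*.eer; do
        [ -e "$f" ] || continue      # the directory holds no .eer files
        if [ ! -s "$f" ] || ! has_tiff_magic "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Hypothetical usage (the directory is a placeholder):
# list_suspect_eer /path/to/movies
```

Comparing file sizes across the dataset (for example with ls -lS) may additionally reveal movies that are much smaller than the rest, which can hint at truncated transfers.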