Reference motion correction memory issue

Hello, we are using reference-based motion correction (RBMC) and have run into a strange error. The final particle motion correction step keeps failing, and the error message is as shown:

We then used a binary search to locate the single movie that caused this error. We re-ran patch motion correction and used only the particles from this movie for RBMC, and it still failed:

image
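
For anyone repeating the search, the bisection described above can be sketched roughly as follows. This is only an illustration: `job_fails` is a hypothetical callback standing in for "run RBMC on this subset of movies and report whether it failed", and the sketch assumes that exactly one movie triggers the failure and that the failure is reproducible.

```python
from typing import Callable, Sequence


def find_failing_movie(
    movies: Sequence[str],
    job_fails: Callable[[Sequence[str]], bool],
) -> str:
    """Bisect `movies` down to the single entry that makes `job_fails` return True.

    `job_fails` is a hypothetical stand-in for running the job on a subset.
    Assumes one culprit and a reproducible failure.
    """
    if not job_fails(movies):
        raise ValueError("the full set does not fail; nothing to bisect")

    lo, hi = 0, len(movies)  # invariant: the culprit is in movies[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if job_fails(movies[lo:mid]):
            hi = mid  # culprit is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return movies[lo]
```

Each round halves the candidate set, so a few thousand movies need roughly a dozen runs. If the failure depends on the whole dataset rather than on one file, the single-culprit assumption no longer holds and the result is not meaningful.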

I was wondering if there is anything we can do to resolve this error. The architecture of the node is as follows:
image


image

However, we don’t have sudo access on this cluster node.

Thanks a lot.

Hello, an update to this issue:

We excluded the corresponding movie and retrained the hyperparameters, and found that the failure moved to a different movie. We are currently running the binary search again to locate this new movie.

Thanks a lot.

Hi @nym2834610, what format are your movies in (TIFF? EER?)? How many frames, and what resolution? Something is causing reference motion to need an enormous amount of memory - much more than it probably should need. Is it possible that the offending files are corrupted on disk?

Thanks a lot for replying.

The movies are in TIFF format, with 40 frames and ~2.6 Å resolution, as shown below.
image

The movie looks fine when opened in IMOD. Since we have located this movie, and since changing the hyperparameters (with the previous movie already excluded) caused a different movie to fail, would it be better for us to send you the movie, maps, and hyperparameters so you can diagnose the issue?

It may be - and thank you for offering. As one more sanity check, what happens if you run tiffinfo [filename] on the offending file, via the command line?
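
If tiffinfo (part of libtiff) is not available on the node, a rough stand-in that needs only the Python standard library is sketched below. It is not a replacement for tiffinfo: it only walks the chain of image directories (IFDs) in a classic TIFF or BigTIFF file and counts them, on the assumption that each movie frame is stored as one top-level IFD. The function name `count_tiff_frames` is made up for this example. A count other than the expected 40, or an error about a truncated file, would hint at a damaged file.

```python
import struct


def count_tiff_frames(path: str) -> int:
    """Count top-level IFDs in a classic TIFF or BigTIFF file (hypothetical helper)."""
    with open(path, "rb") as f:
        header = f.read(16)
        if len(header) < 8:
            raise ValueError("file too short to be a TIFF")

        # Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
        if header[:2] == b"II":
            endian = "<"
        elif header[:2] == b"MM":
            endian = ">"
        else:
            raise ValueError("not a TIFF: bad byte-order mark")

        (magic,) = struct.unpack(endian + "H", header[2:4])
        if magic == 42:  # classic TIFF: 2-byte counts, 12-byte entries, 4-byte offsets
            count_fmt, entry_size, off_fmt = "H", 12, "I"
            (offset,) = struct.unpack(endian + "I", header[4:8])
        elif magic == 43:  # BigTIFF: 8-byte counts, 20-byte entries, 8-byte offsets
            if len(header) < 16:
                raise ValueError("file too short to be a BigTIFF")
            count_fmt, entry_size, off_fmt = "Q", 20, "Q"
            (offset,) = struct.unpack(endian + "Q", header[8:16])
        else:
            raise ValueError(f"not a TIFF: unexpected magic number {magic}")

        count_size = struct.calcsize(endian + count_fmt)
        off_size = struct.calcsize(endian + off_fmt)

        frames = 0
        seen = set()
        while offset != 0:  # an offset of 0 marks the end of the IFD chain
            if offset in seen:
                raise ValueError("IFD chain loops back on itself")
            seen.add(offset)

            f.seek(offset)
            raw = f.read(count_size)
            if len(raw) < count_size:
                raise ValueError("truncated file: IFD offset past end of data")
            (n_entries,) = struct.unpack(endian + count_fmt, raw)

            # Skip the entries; the next-IFD offset follows them.
            f.seek(offset + count_size + n_entries * entry_size)
            raw = f.read(off_size)
            if len(raw) < off_size:
                raise ValueError("truncated file: IFD is cut off")
            (offset,) = struct.unpack(endian + off_fmt, raw)
            frames += 1
    return frames
```

Comparing `count_tiff_frames(...)` on the offending movie against a movie that processes normally is a cheap first check. It does not validate the pixel data itself, so a clean result here does not rule out corruption.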