Optimised per-particle scaling from previous job causes crash

rbs_sci · December 5, 2023, 9:09am

Hi CryoSPARC team,

Another one! Maybe new…?

Once I’ve got good per-particle scales in a dataset, do they need to be reset and re-calculated in each refinement?

I’ve got a helical dataset (the one I will show is a TMV sanity check, not the real data) where I can hit physical Nyquist (<1.9 Å), so that’s not the issue. The issue is that I already had good per-particle scales before Global CTF and Local CTF estimation, so I switched off reset/re-calculate, and the first iteration of helical refinement always crashes with a size zero error. It doesn’t matter whether I use the output particles from the previous refinement, the particles from Global CTF or the particles from Local CTF; the crash is the same.

As long as I either (a) reset the per-particle scale or (b) reset and re-calculate per-particle scaling, the helical refinement succeeds without issue. The TMV sanity check also hits physical Nyquist, which is nice, and using 8K sampling from the EER data gives the expected FSC.

Why does taking per-particle scales from a previous job and not resetting them cause this error?

Thanks in advance.

edit: Resetting but not re-calculating per-particle scaling results in a slight loss of reported resolution (not statistically significant) in the final reconstruction compared to both reset and re-calculate.

mmclean · December 11, 2023, 6:39pm

Hi @rbs_sci,

This is quite strange; I don’t think we’ve seen it before! Based on where the error is happening, could you post the plots from the 0th iteration of the failing job? Particularly the real space density plots of both the volume and the mask would be informative.

Could you also let us know what input was used for the mask, as well as which parameters were set to non-default values?

Best,
Michael

rbs_sci · December 12, 2023, 12:44am

Hi @mmclean,

Initial slices:

[CPU: 3.64 GB Avail: 156.06 GB] -- THR 0 BATCH 500 NUM 3269 TOTAL 14.826919 ELAPSED 69.869445 --
[CPU: 6.50 GB Avail: 151.99 GB] Processed 25074.000 images in 72.021s.
[CPU: 7.00 GB Avail: 150.77 GB] Computing FSCs...
[CPU: 7.00 GB Avail: 150.82 GB] Using full box size 400, downsampled box size 200, with low memory mode disabled.
[CPU: 7.00 GB Avail: 150.85 GB] Computing FFTs on GPU.
[CPU: 6.94 GB Avail: 155.40 GB] Done in 2.734s
[CPU: 6.94 GB Avail: 155.39 GB] Will do local processing this iteration.
[CPU: 6.94 GB Avail: 155.38 GB] Using Filter Radius 197.500 (1.863A) | Previous: 18.400 (20.000A)
[CPU: 9.49 GB Avail: 136.30 GB] Non-uniform regularization with compute option: GPU
[CPU: 9.49 GB Avail: 136.33 GB] Running local cross validation for A ...
[CPU: 14.88 GB Avail: 122.16 GB] Local cross validation A done in 35.441s

Output from iteration 000 (input to iteration 1):

It’s immediately obvious that it’s gone nuts (even before the plots appear) because the first fit runs to almost-Nyquist (“Using filter radius…”) when on a “good” run it will fit to ~2.4A.

I’m allowing it to recalculate a mask on the fly (no static mask attached).

Custom parameters were:
Helical twist estimate (degrees): 22.034 (calculated from previous run)
Helical rise estimate (Angstrom): 1.384 (calculated from previous run)
Do symmetry alignment: OFF (using already aligned map)
Resolution to begin local searches of helical symmetry: 0 (disable symmetry searching)
Fix search grid: OFF (do not want)
Use non-uniform refinement: YES (want NU Refine)
Number of extra passes: 2
Reset input per-particle scale: OFF (previously calculated)

Jobs with identical settings except for the per-particle scale complete and give excellent results.

Thanks for your help.

mmclean · December 12, 2023, 7:31pm

Thanks @rbs_sci for the information. Based on this, my suspicion is that there are invalid values (perhaps zeros or NaNs) in the per-particle scale from the upstream refinements. If you have CryoSPARC Tools installed, would you be able to verify this for me by running a simple script?

import numpy as n
from cryosparc.tools import CryoSPARC

cs = CryoSPARC(host="<hostname here>", base_port=40000)
assert cs.test_connection()

project_number = "PXXX" # project number here
job_number = "JYYY" # job number here (upstream of the helical refinement)

project = cs.find_project(project_number)
job = cs.find_job(project_number, job_number)
particles = job.load_output("particles") # you may need to modify the name of the output argument here

print(f"Min alpha: {n.amin(particles['alignments3D/alpha'])}")
print(f"Max alpha: {n.amax(particles['alignments3D/alpha'])}")
print(f"Any zeroes? {n.any(n.isclose(0.0, particles['alignments3D/alpha']))}")
print(f"Any NaNs? {n.any(n.isnan(particles['alignments3D/alpha']))}")

Best,
Michael

rbs_sci · December 13, 2023, 1:37am

Will do, @mmclean, thanks.

edit:
Output from failing job:
Min alpha: 0.0
Max alpha: 0.0
Any zeroes? True
Any NaNs? False

The identical job which has per-particle scaling reset reports:
Min alpha: 1.0
Max alpha: 1.0
Any zeroes? False
Any NaNs? False

The identical job which has per-particle scaling reset and recalculated reports:
Min alpha: 0.035231757909059525
Max alpha: 1.3187696933746338
Any zeroes? False
Any NaNs? False

A second job which has per-particle scaling reset and recalculated reports:
Min alpha: 0.036313191056251526
Max alpha: 1.322471261024475
Any zeroes? False
Any NaNs? False

And the originating job (where scales should be taken from) reports:
Min alpha: 0.05159826949238777
Max alpha: 1.310579538345337
Any zeroes? False
Any NaNs? False

Min alpha seems extreme? I don’t remember any of the micrographs looking particularly high or low intensity, but I’ll run through them again to check. Everything is 16-bit MRC. Can CryoSPARC (or CryoSPARC Tools) dump out, say, the 1% of particles at each extreme for a sanity check?
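
As a rough sketch of the kind of sanity check meant here (not an existing CryoSPARC or CryoSPARC Tools feature), the extremes could be pulled out with plain NumPy. The helper name `extreme_indices` and the synthetic `alpha` array are placeholders; with real data, `alpha` would be `particles['alignments3D/alpha']` as loaded in the script above.

```python
import numpy as np


def extreme_indices(alpha, fraction=0.01):
    """Return indices of the lowest and highest `fraction` of scale values."""
    alpha = np.asarray(alpha, dtype=float)
    # Always keep at least one particle at each extreme.
    k = max(1, int(round(fraction * alpha.size)))
    order = np.argsort(alpha)  # ascending order of scale
    return order[:k], order[-k:]


# Synthetic stand-in for per-particle scales (hypothetical data).
rng = np.random.default_rng(0)
alpha = rng.normal(loc=1.0, scale=0.2, size=1000)

low_idx, high_idx = extreme_indices(alpha, fraction=0.01)
print(f"{low_idx.size} lowest scales, max of those: {alpha[low_idx].max():.3f}")
print(f"{high_idx.size} highest scales, min of those: {alpha[high_idx].min():.3f}")
```

The returned indices could then be used to look up the matching particles (or their micrographs) for a visual check.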

mmclean · December 13, 2023, 2:44pm

@rbs_sci Thank you for this.

Just to confirm, this is the job directly upstream of the helical refinement, from which the particles were connected?

This is quite suspicious. Since the job failed after iteration 0, there was at least one iteration complete, so I think this might be our issue. Somehow the scales are reset to zero by the job, and are not preserved from the upstream job. We’ll try to reproduce this.
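
One way to test whether scales survive from one job to the next, sketched here with plain NumPy on made-up data: match particles between the upstream and downstream outputs by uid and count how many scales differ. `compare_scales` and the example arrays are hypothetical, and the sketch assumes both particle sets carry a `uid` field alongside `alignments3D/alpha`.

```python
import numpy as np


def compare_scales(uid_up, alpha_up, uid_down, alpha_down, rtol=1e-5):
    """Match particles by uid and return (n_shared, n_changed)."""
    uid_up = np.asarray(uid_up)
    uid_down = np.asarray(uid_down)
    alpha_up = np.asarray(alpha_up, dtype=float)
    alpha_down = np.asarray(alpha_down, dtype=float)
    # Indices of the uids present in both jobs, in matching order.
    shared, i_up, i_down = np.intersect1d(uid_up, uid_down, return_indices=True)
    changed = ~np.isclose(alpha_up[i_up], alpha_down[i_down], rtol=rtol)
    return shared.size, int(changed.sum())


# Hypothetical example: upstream scales look sane, downstream ones are all zero.
uid_up = np.array([11, 12, 13, 14])
alpha_up = np.array([0.9, 1.1, 1.0, 1.2])
uid_down = np.array([14, 12, 11])
alpha_down = np.zeros(3)

n_shared, n_changed = compare_scales(uid_up, alpha_up, uid_down, alpha_down)
print(f"{n_changed} of {n_shared} shared particles have a different scale")
```

If the scales were preserved, the changed count would be zero; a count equal to the number of shared particles would point to the downstream job overwriting them.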

Michael

rbs_sci · December 13, 2023, 2:51pm

Correct.

Thanks for looking at this.

mmclean · December 13, 2023, 6:35pm

@rbs_sci

Thanks for finding this. We’ve recreated this internally, and it is indeed a bug particular to helical refinement, specifically in the case where “Re-estimate greyscale level of input reference” is False. We’ve noted a fix for a future release.

In the meantime, the most straightforward way to work around this is to enable “Re-estimate greyscale level of input reference”. Note that this will keep the scales intact, and will apply a single multiplicative factor on the initial volume to match the greyscale range of the particles. This only affects iteration 0’s results (in future iterations, the reference is created directly from the particles, so it naturally has the correct greyscale).
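
To illustrate what a single multiplicative factor means, here is a minimal sketch that fits one least-squares scalar between a reference and a target volume. This is an assumption for illustration only, not CryoSPARC's actual greyscale estimation; `greyscale_factor` and the synthetic volumes are made up.

```python
import numpy as np


def greyscale_factor(reference, target):
    """Least-squares scalar s minimising ||target - s * reference||^2."""
    r = np.ravel(reference).astype(float)
    t = np.ravel(target).astype(float)
    return float(np.dot(r, t) / np.dot(r, r))


# Synthetic volumes: the same density on two different greyscales.
rng = np.random.default_rng(1)
reference = rng.normal(size=(8, 8, 8))
target = 2.5 * reference

s = greyscale_factor(reference, target)
rescaled = s * reference  # one factor applied to the whole volume
print(f"estimated factor: {s:.3f}")
```

The point of the sketch is that only the volume is multiplied by `s`; nothing per-particle is touched.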

Best,
Michael

rbs_sci · December 14, 2023, 12:08am

Thanks for confirming, @mmclean, and for a nice easy workaround as well.

mmclean · May 7, 2024, 7:35pm

Dear @rbs_sci,

Thanks for the help in debugging this. We have fixed this issue as of CryoSPARC v4.5, released today; now scales are not reset to 0 if initial greyscale estimation is disabled.

Best,
Michael