There seems to be a mistaken assumption that helical symmetry can be determined if one can generate an ab initio asymmetric reconstruction and then search this for the correct helical symmetry. Please disabuse me of this notion if I am wrong. The problem is that the correct or incorrect helical symmetry has already been locked in when one generates the asymmetric reconstruction, and the asymmetric reconstruction is not unique.

That is, just as one can take an ensemble of images and generate different 2D averages from this same ensemble (including one that will look like Albert Einstein), one can generate different 3D volumes from this ensemble with the assignment of different Euler angles and translational parameters to each image. This appears to happen in a stochastic (or non-deterministic) way in cryoSPARC, perhaps involving a random number generator at some stage. Thus, we were very impressed when cryoSPARC determined the correct helical symmetry from a data set of helical filaments. But rerunning everything with the same parameters, the correct “solution” was not even in the top 20 found. This is obviously because the correct symmetry had been locked into the first asymmetric reconstruction, while an incorrect symmetry had been locked into the second volume.

The reason that there is no simple solution to this problem is not a failure of current methods; it is simply a mathematical ambiguity such that there are multiple solutions that are indistinguishable (in terms of residuals, etc.) at some finite resolution. However, only one of these solutions will show correct features at higher resolution, and that is what makes them distinguishable. It is like finding A and B when all one knows is that A+B=17. There is no unique solution.
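To make the finite-resolution part of this concrete, here is a small toy calculation (purely illustrative, with arbitrary parameter values, not real data): it places one point scatterer per subunit for two different candidate (twist, rise) symmetries, renders both on a grid, and compares them after low-pass filtering. Heavily filtered (low-resolution) versions of the two models typically correlate strongly, while lightly filtered (higher-resolution) versions do not.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def helical_point_model(twist_deg, rise, n_sub=40, radius=30.0, box=96, apix=2.0):
    """Place one point scatterer per subunit on a helix and rasterize it
    as delta peaks on a cubic grid (toy model, not a real density map)."""
    vol = np.zeros((box, box, box), dtype=np.float32)
    center = box // 2
    for k in range(-n_sub // 2, n_sub // 2):
        phi = np.deg2rad(k * twist_deg)
        x = center + (radius / apix) * np.cos(phi)
        y = center + (radius / apix) * np.sin(phi)
        z = center + k * rise / apix
        ix, iy, iz = int(round(x)), int(round(y)), int(round(z))
        if 0 <= ix < box and 0 <= iy < box and 0 <= iz < box:
            vol[ix, iy, iz] += 1.0
    return vol

def correlation(a, b):
    """Pearson correlation between two maps."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Two candidate symmetries (arbitrary toy values: same rise, different twist).
vol_a = helical_point_model(twist_deg=65.0, rise=4.75)
vol_b = helical_point_model(twist_deg=49.0, rise=4.75)

# Compare after heavy low-pass filtering (low resolution) and light filtering.
for sigma_px, label in [(6.0, "low resolution"), (1.0, "higher resolution")]:
    cc = correlation(gaussian_filter(vol_a, sigma_px),
                     gaussian_filter(vol_b, sigma_px))
    print(f"{label:>17s} (sigma = {sigma_px} px): cc = {cc:.3f}")
```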
Hello @egelman,
Thank you for the post! cryoSPARC’s ab-initio and refinement algorithms are definitely susceptible to this fundamental limitation when reconstructing helical lattices. Fully a priori helical reconstruction without symmetry knowledge is experimental: empirically it can work in some cases (as you mentioned), but it can’t be guaranteed to work in general, and both ab-initio reconstruction and refinement can converge to incorrect solutions. In the EMPIAR-10031 case study, we presented asymmetric ab-initio / refinement followed by symmetry determination as an example of a workflow that we’ve seen work on some datasets, but it does not work for all datasets, for exactly the reason you have pointed out. Because of this, structures should always be inspected visually before any further processing or analysis is done. And of course there are tools in other software packages that users have found useful for large-scale symmetry determination, e.g. the SPRING package’s SegClassReconstruct program (based on determining symmetry from a single 2D class).
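For illustration only, a conceptual real-space version of the symmetry determination step might look like the sketch below: score how well an asymmetric map matches itself after applying each candidate helical operator. This is a rough sketch, not cryoSPARC’s actual symmetry search implementation; in practice masking, axis alignment, and the degeneracies discussed above all matter, and several candidates can score equally well at limited resolution.

```python
import numpy as np
from scipy.ndimage import rotate, shift

def symmetry_score(vol, twist_deg, rise_angstrom, apix):
    """Correlate a map with itself after applying one helical operator
    (rotate by twist_deg about z, translate by rise_angstrom along z).
    Assumes the helical axis lies along the last array axis and is centered."""
    sym = rotate(vol, twist_deg, axes=(0, 1), reshape=False, order=1)
    sym = shift(sym, (0, 0, rise_angstrom / apix), order=1)
    a = vol - vol.mean()
    b = sym - sym.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def scan_symmetries(vol, twists, rises, apix):
    """Brute-force grid search over candidate (twist, rise) pairs,
    returned best-first."""
    results = [(symmetry_score(vol, t, r, apix), t, r)
               for t in twists for r in rises]
    return sorted(results, reverse=True)

# Example usage with a map loaded elsewhere (e.g. via mrcfile.open(...).data):
# top = scan_symmetries(vol, twists=np.arange(-180.0, 180.0, 1.0),
#                       rises=np.arange(2.0, 10.0, 0.25), apix=1.06)
# for cc, twist, rise in top[:5]:
#     print(f"cc={cc:.3f}  twist={twist:+.1f} deg  rise={rise:.2f} A")
```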
The run-to-run variability you mention is behaviour we’ve seen too – ab-initio and refinement both use random seeds (which can be specified as parameters to each job) to govern density initialization and batch selection, so fixing the seeds controls all the random choices that the algorithm makes. In practice, however, accumulated floating-point non-determinism on the GPU can still cause some run-to-run variability even when the same seed is specified.
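For anyone curious, here is a minimal NumPy illustration (not cryoSPARC code) of the two effects: a fixed seed makes the pseudo-random choices reproducible, while a change in floating-point accumulation order, as happens with parallel reductions on a GPU, can still change a result in the last bits.

```python
import numpy as np

# 1) A fixed seed makes pseudo-random draws reproducible.
draws_a = np.random.default_rng(seed=42).standard_normal(5)
draws_b = np.random.default_rng(seed=42).standard_normal(5)
print("identical draws with the same seed:", np.array_equal(draws_a, draws_b))

# 2) Floating-point addition is not associative, so the same numbers summed
#    in a different order (e.g. a different parallel reduction on a GPU)
#    typically give a slightly different answer even with identical inputs.
x = np.random.default_rng(seed=0).standard_normal(1_000_000).astype(np.float32)
serial = np.float32(0.0)
for chunk in np.split(x, 1000):          # one accumulation order
    serial += chunk.sum(dtype=np.float32)
pairwise = x.sum(dtype=np.float32)       # NumPy's pairwise summation order
print("same data, two accumulation orders:", serial, pairwise,
      "difference:", float(serial - pairwise))
```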
Our helical processing documentation has notes on the above issues, along with some parameter suggestions for either job that tend to help (e.g. in ab-initio, forcing the use of high-resolution information early on sometimes produces better results). Most of these are highlighted in the guide page titled Helical Symmetry in cryoSPARC, but they are mainly suggestions that we’ve seen help in our limited experimentation so far.
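For scripted processing, the rough cryosparc-tools sketch below shows where such parameters (e.g. a fixed random seed, or the ab-initio resolution settings) would be supplied. The parameter keys shown are illustrative placeholders rather than guaranteed names, so please check the actual names in the job builder or the guide before using them.

```python
# Rough sketch using cryosparc-tools; the parameter keys below are illustrative
# placeholders, not guaranteed names -- check them in the job builder first.
from cryosparc.tools import CryoSPARC

cs = CryoSPARC(license="xxx", host="localhost", base_port=39000,
               email="user@example.com", password="xxx")
project = cs.find_project("P1")

job = project.create_job(
    "W1", "homo_abinit",                  # ab-initio reconstruction job type
    connections={"particles": ("J10", "particles_selected")},
    params={
        "abinit_seed": 1234,      # placeholder key: fix the random seed
        "abinit_max_res": 9.0,    # placeholder key: allow higher-res info early
    },
)
job.queue("default")
```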
Best,
Michael