Limiting the number of particles per 2D class

Hi @marino-j,

For very large datasets like this, we usually recommend doing a multi-stage process:

  1. Do ab-initio runs with 1, 3, and 6 classes simultaneously, but set the Num particles to use parameter to 100,000 so that each job does not see the entire dataset. Generally you don't need all 1M particles to find the heterogeneous classes. For ab-initio reconstruction, default parameters should be okay, but if you are working with a very small protein, increase the Maximum resolution and Initial resolution parameters to higher resolutions (i.e., smaller values in Å).
  2. From the ab-initio runs, select the run that gave the best spread of different conformations (at low resolution), then connect all 1M particles, along with the volumes from that ab-initio run (say, the 6-class run), to a heterogeneous refinement job. This job will process the 1M particles much faster than ab-initio reconstruction, and will be able to resolve 6 (or more, if you chose more classes) different conformations.
  3. Take the best classes from heterogeneous refinement that contain the same complex in different conformations (excluding the junk classes) and combine all of their particles together as the input to a consensus homogeneous refinement. This gives a single refinement in which every particle's orientation is aligned against a single volume.
  4. Take the particles output by that refinement and use them to run 3D variability analysis. This will resolve both continuous and discrete conformational changes. You can then use 3D variability display to separate clusters, or to create intermediate reconstructions along a flexible dimension.
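The subsetting and pooling logic behind steps 1 and 3 can be sketched in plain NumPy. This is only an illustration of the idea: the arrays below are hypothetical stand-ins for particle metadata, and in CryoSPARC itself you would do this through the Num particles to use parameter and by selecting class outputs in the GUI, not by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the full particle stack: one index per particle.
n_total = 1_000_000
all_indices = np.arange(n_total)

# Step 1 idea: draw a random 100k subset so ab-initio never sees the
# full dataset (equivalent in spirit to setting "Num particles to use").
subset = rng.choice(all_indices, size=100_000, replace=False)

# Step 3 idea: after heterogeneous refinement, keep only the non-junk
# classes and pool their particles for a consensus refinement.
# Class assignments here are simulated; in practice they come from the
# heterogeneous refinement job's per-class particle outputs.
class_of = rng.integers(0, 6, size=n_total)  # 6 classes, as in the example
good_classes = [0, 2, 3]                     # hypothetical non-junk classes
pooled = all_indices[np.isin(class_of, good_classes)]
```

The key property is that the subset is drawn without replacement (no particle is duplicated), and the pooled set contains every particle assigned to any of the kept classes.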