3D classification without alignment

marino-j · July 23, 2021, 7:03am

@apunjani @olibclarke I see more and more cases where reviewers ask us to try resolving the flexibility of a region with approaches other than cryoSPARC, e.g. 3D classification without alignment in Relion. Is there any rationale behind this idea, and is there any reason to believe that such an approach will perform better than local refinement, particle subtraction, 3D variability analysis, etc. in cryoSPARC? Is there any algorithm and set of parameters in cryoSPARC that would do the same as 3D classification without alignment in Relion? That would be so much easier than exporting .cs files and importing them into Relion, with all the trouble that approach involves.

I would very much appreciate it if you could give me your opinion on this!

Many thanks !
Jacopo

olibclarke · July 23, 2021, 2:54pm

Hi Jacopo,

Yes, masked classification without alignments can resolve subtle, low-occupancy states that sometimes cannot be picked up by 3DVA. It is a complementary tool that currently has no direct equivalent in cryoSPARC. In my opinion, the more diverse the tools available in different software packages, and the easier it is to move between them, the better off the entire cryoEM community will be. I would encourage you to take advantage of all the tools that are out there; it will definitely serve you well when dealing with challenging datasets.

Cheers
Oli

tarek · July 23, 2021, 9:50pm

In general I second Oli; however, lately I find 3DVA extremely useful in dissecting heterogeneous datasets. You have to play a little with the masks, but I don’t see that brute-force 3D classification without masking is superior.
Moreover, you might miss subtle but important differences hidden behind large-scale rearrangements, e.g. subunit rotation in the case of ribosomes.
Heterogeneous refinement in cryoSPARC often does not give the desired precision or focus; I believe this is the concern of your reviewers.

Cheers,
Tarek

marino-j · July 27, 2021, 2:28pm

@olibclarke @tarek Thank you for your suggestions. Indeed, if moving from one platform to another were easier, the whole thing would be much less painful.
I doubt, however, that a density that is not present in the ab initio reconstruction will appear during 3D classification in Relion. What do you think? Many thanks

tarek · July 27, 2021, 2:57pm

I disagree; densities that are absent from the ab initio map do show up after classification in many studies. It’s a matter of statistics due to averaging.
If only a few particles share a certain feature, it sometimes becomes visible only after extensive sorting.

olibclarke · July 27, 2021, 3:41pm

Yeah, I agree with Tarek on this - many times, important features not detected in multi-class ab initio can be identified by careful 3D classification.

marino-j · July 27, 2021, 7:24pm

@olibclarke @tarek thank you for your valuable feedback; as soon as I can stop banging my head against converting that particles.cs file to a star file, I’ll give it a try!

user123 · July 29, 2021, 3:57am

Hi Jacopo, it’s definitely worth it. I found 3DVA very informative in deciding where to place a mask for 3D classification without alignments, but cryoSPARC just can’t seem to give good particle sets based on the movements (I may be doing something wrong). 3D classification can find very small movements and give nice discrete particle stacks for further refinement. In my experience, heterogeneous refinement in cryoSPARC is useful for removing bad particles to increase the resolution of a final stack, or for separating moderately dissimilar states. Yet it seems to latch onto the very stable core of the protein and mask out regions with movement, which is often where the action is.

I’ve been getting the best (and fastest) results using a tight box around my particle with binning to 1.25 Å/pixel. It usually takes a few iterations to get a feel for the right settings, so getting it to run fast helps a lot. I’ll test the T parameter at 80, 200, or 400 (without alignments), with 8 up to 20 classes. My usual workflow is to extract in cryoSPARC with Fourier cropping, run ab initio and refinement, convert with csparc2star, import it all into Relion with a custom mask (or start with the cryoSPARC refinement mask), and spend a few days (or weeks) running 3D classification. It reveals things that cryoSPARC just cannot find (yet). See another endorsement here.

If you need any help with pyem, feel free to shoot me a DM and I’ll do my best to help.
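The T and class-count scan described above could be scripted roughly as follows. This is only a sketch: the file names, output directory and particle diameter are hypothetical placeholders, and the option list is a minimal subset of `relion_refine` flags (a real job normally also sets threading, pooling, GPU and initial low-pass options, so compare with the command the Relion GUI generates for your own job). The script only builds and prints the commands; it does not run them.

```python
from itertools import product
import shlex

def class3d_command(star, ref, mask, out_dir, k, t, iters=25, diameter=200):
    """Build one relion_refine command for 3D classification without alignment.

    All paths and the particle diameter (in Angstrom) are hypothetical
    placeholders; adjust them to your own project.
    """
    return [
        "relion_refine",
        "--i", star,                      # particles converted from cryoSPARC
        "--ref", ref,                     # consensus map used as the reference
        "--solvent_mask", mask,           # focus mask around the mobile region
        "--o", f"{out_dir}/K{k}_T{t}/run",
        "--K", str(k),                    # number of classes
        "--tau2_fudge", str(t),           # the "T" regularisation parameter
        "--iter", str(iters),
        "--particle_diameter", str(diameter),
        "--skip_align",                   # keep the imported orientations fixed
        "--ctf",
        "--flatten_solvent",
        "--zero_mask",
    ]

# Scan the values mentioned in the post: T of 80, 200, 400 and 8 or 20 classes.
commands = [
    class3d_command("particles.star", "consensus.mrc", "focus_mask.mrc",
                    "Class3D_scan", k, t)
    for t, k in product((80, 200, 400), (8, 20))
]

for cmd in commands:
    print(shlex.join(cmd))
```

Writing each combination to its own output directory makes it easy to compare the resulting classes side by side.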

marino-j · July 29, 2021, 7:36am

@user123 Hi, and thanks a lot for the very kind and detailed message. Much appreciated. I’ll give it a try immediately, now that I have sorted it out with pyem! Cheers

Adrian · July 29, 2021, 12:36pm

Hi, why do you use such a huge T parameter as 200 or 400? Is the mask very small?
Thank you

user123 · July 29, 2021, 1:30pm

It can give good results if the data allows it. Every dataset is different. I’ve found when I have lots of particles (>200k) and high SNR, then high T is no problem and classes come out pretty clean, but lower quality datasets might benefit from much lower T (like 80). I’m also looking for very small changes in a pretty stable protein.

Adrian · July 31, 2021, 2:33am

I thought the usual T parameter for focused classification without alignment was around 20,
up from the normal 4 for classification with alignment.
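To get a feel for why T matters so much, here is a deliberately simplified picture, not Relion's actual code. It assumes a Wiener-style update in which each Fourier component is the data term divided by (weight term + 1/(T·τ²)); relative to the unregularised estimate, the component is then scaled by T·s/(T·s + 1), where s is the spectral SNR of that component. In a real run τ² is re-estimated every iteration, so this is a single-step illustration only.

```python
def data_weight(ssnr, t):
    """Fraction of the unregularised estimate kept for one Fourier component.

    Simplified single-step model (an assumption, see text): ssnr is the
    spectral signal-to-noise ratio of the component, t is the T parameter.
    """
    return t * ssnr / (t * ssnr + 1.0)

# A weak component (SSNR = 0.01), as expected for a small masked region.
for t in (4, 20, 80, 200, 400):
    print(f"T = {t:3d}: weight = {data_weight(0.01, t):.3f}")
```

In this toy model a component with an SSNR of 0.01 is kept at about 4% with T = 4 but at 80% with T = 400, which fits the observation above that very high T is only safe with many particles and high SNR: otherwise the extra weight mostly goes to noise.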

twg · July 31, 2021, 2:11pm

@apunjani @stephan Are there any plans to implement (focused) 3d classification w/o alignment in cryoSPARC? Having more tools for tackling heterogeneity available - in addition to the very useful heterogeneous refinement and 3DVA jobs - will always be useful.

marino-j · July 31, 2021, 3:42pm

Has anyone ever compared the output of cryoDRGN with cryoSPARC 3D variability analysis?

Guillaume · August 3, 2021, 7:36am

I haven’t done a serious comparison, but I did use both 3DVA and cryoDRGN on the same project that showed purely continuous conformational heterogeneity. Here is the preprint, if you want to take a look: https://doi.org/10.1101/2021.06.18.448936
I’m afraid the figures won’t tell you much about the difference between results from the two approaches. But we are currently addressing the reviews and hopefully the final paper will be online sometime this fall. When this happens, you’ll also be able to watch the videos from both 3DVA and cryoDRGN, and they are interesting.

One thing I like a lot about cryoDRGN is the graph traversal: unlike 3DVA, it is not restricted to generating principal component traversals (that often involve some kind of interpolation between observed states), and seems to give more faithful representations of conformational changes.
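To illustrate the general idea (this is a toy sketch, not cryoDRGN's implementation), consider a 2D latent space in which the particles lie along an arc. A straight-line interpolation between the two end states passes through an empty region where no particle was observed, while a shortest path on a nearest-neighbour graph only visits observed points. All names and the toy data below are made up for the example.

```python
import heapq
import numpy as np

def knn_graph(z, k):
    """Symmetric k-nearest-neighbour graph as {node: [(neighbour, distance), ...]}."""
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbour
    graph = {i: [] for i in range(len(z))}
    for i in range(len(z)):
        for j in np.argsort(d[i])[:k]:
            w = float(d[i, j])
            graph[i].append((int(j), w))
            graph[int(j)].append((i, w))  # add the reverse edge as well
    return graph

def shortest_path(graph, start, end):
    """Dijkstra's algorithm; returns the list of node indices or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == end:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if end not in dist:
        return None
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

# Toy latent space: 30 "particles" along a half circle of radius 1.
theta = np.linspace(0.0, np.pi, 30)
z = np.column_stack([np.cos(theta), np.sin(theta)])

path = shortest_path(knn_graph(z, k=2), 0, 29)
line_mid = (z[0] + z[29]) / 2  # midpoint of the straight-line interpolation

print("graph path visits observed points:", path)
print("distance from line midpoint to nearest particle:",
      np.linalg.norm(z - line_mid, axis=1).min())
```

Volumes generated along `path` would each correspond to a latent position that particles actually occupy, whereas the midpoint of the straight line sits a full radius away from every particle in this toy example.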

marino-j · August 4, 2021, 1:32pm

@Guillaume thanks for the input, I’ll wait to see the movies of your paper !

kpahil · November 17, 2021, 11:41pm

Does anyone have any suggestions on how to set up Relion 3D classification without alignments (e.g. how much padding to use in masks, how much to low-pass filter, etc.)? I’d be starting with an already highly curated particle stack. Which variables are most worth changing? The tutorials I’ve read have focused more on 3D classification at early stages of processing; are there any good resources you’d recommend? Thank you!

user123 · November 18, 2021, 2:49am

Hi @kpahil, in my experience it can take iterative testing of different settings to find the optimal classification scheme. It will depend on the degree of conformational change, and I’ve found 3DVA highly informative in showing where to place the mask for Class3D and perhaps how many classes to use (depending on how rare the alternative conformation(s) is/are). I usually stick with a mask generated from a molmap’d pdb at 8-15 Å, choose ini_threshold slightly larger than the protein, extend ~4 Å, width soft edge 8 Å. I’ll usually run Class3D with different T levels and K and play around with these values to see how they affect the output classes. Every dataset is different (number of total particles, ice thickness, number of bad particles) and there’s not really a one-size-fits-all approach other than iterative testing, even for the same protein.
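As a rough sketch of the mask step, assuming it is done with `relion_mask_create` (the name `ini_threshold` matches one of its options, but that is an assumption about the workflow above): the map simulated from the PDB is binarised at `--ini_threshold`, extended, and given a soft edge. To my knowledge the command-line extension and soft-edge values are given in pixels, so the Å values quoted above are converted here using the pixel size. The file names and the threshold value are hypothetical; the threshold has to be chosen by inspecting your own map. The script only builds and prints the command.

```python
import shlex

def angstrom_to_pixels(length_a, pixel_size_a):
    """Convert a length in Angstrom to a whole number of pixels (at least 1)."""
    return max(1, round(length_a / pixel_size_a))

def mask_create_command(map_in, mask_out, ini_threshold, pixel_size_a,
                        extend_a=4.0, soft_edge_a=8.0):
    """Build a relion_mask_create command (assumed tool, see text)."""
    return [
        "relion_mask_create",
        "--i", map_in,                    # e.g. a map simulated from the PDB
        "--o", mask_out,
        "--ini_threshold", str(ini_threshold),
        "--extend_inimask", str(angstrom_to_pixels(extend_a, pixel_size_a)),
        "--width_soft_edge", str(angstrom_to_pixels(soft_edge_a, pixel_size_a)),
    ]

# Hypothetical inputs: 1.25 A/pixel data and a threshold picked by eye.
cmd = mask_create_command("molmap_10A.mrc", "focus_mask.mrc",
                          ini_threshold=0.05, pixel_size_a=1.25)
print(shlex.join(cmd))
```

At 1.25 Å/pixel, extending by ~4 Å and adding an 8 Å soft edge come out as 3 and 6 pixels respectively.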

kpahil · November 18, 2021, 4:37pm

Thank you, that was very helpful!

Lan · January 17, 2022, 9:49pm

Hi USER123
“I usually stick with a mask generated from a molmap’d pdb at 8-15A, choose ini_threshold slightly larger than the protein, extend ~4 A, width soft edge 8A.”
Could you explain this, please? What do you mean by “choose ini_threshold slightly larger than the protein”?
Which program did you use to “extend ~4 A”?
Thanks, Lan