Seed-facilitated 2D classification?

Hello, I’d like to know whether seed-facilitated 2D classification can be carried out in cryoSPARC v4 and, if so, how to do it. Could someone kindly explain?

Thanks for your kind help!

No, it cannot currently.

Hi @suqi,

Are you referring to using predefined templates to sort particles in 2D classification? What are you attempting to achieve with such a process?

Best,
Kye

Seems to be what is described here:

https://rdcu.be/dXU1j

Basically, it combines simulated particles generated from an initial ab initio reconstruction with the original particles in order to improve the stability of 2D classification for heterogeneous samples. Unfortunately, the method is only briefly outlined there, and the reference cited in the text does not seem to be the right one, so the details (the number of simulated particles, exactly how they were generated, and at what step they were discarded) do not seem to be given.

@olibclarke thanks for the ref.

@suqi It might be possible to do something like this using the “Simulate Data” job in CS or the VirtualIce Python package. I would venture to say that for heterogeneous samples it might be easier to sort in 3D using Heterogeneous Refinement, with some maps generated from models as inputs. Then run ab initio and homogeneous/NU refinement if you are worried about an Einstein-from-noise issue.

Best,
Kye

Thanks for your kind reply!

I tested this using the Simulate Data job in CS, simulating particles from a homogeneous reconstruction job run on an ab initio result, and it actually worked quite well. Even simulated particles with resolution limited to 15 Å helped classification quite a lot for a small particle that otherwise gives mixed results in 2D (as judged by 2D classification of the resulting selection after removing the simulated data using particle sets).
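To make the "remove the simulated data" step concrete, here is a minimal sketch of the underlying set operation in plain Python. It assumes you have the particle UIDs as plain lists; the function name and the toy UIDs are hypothetical, and inside CS this would be done with particle sets rather than a script.

```python
def drop_simulated(selected_uids, simulated_uids):
    """Return the selected UIDs that are not simulated seeds, preserving order."""
    seeds = set(simulated_uids)  # set lookup keeps this fast for large stacks
    return [uid for uid in selected_uids if uid not in seeds]


# Toy example: UIDs 900-902 are hypothetical simulated seeds.
selected = [101, 900, 102, 901, 103]   # particles kept after seeded 2D
simulated = [900, 901, 902]            # everything produced by the simulation
real_only = drop_simulated(selected, simulated)
print(real_only)  # [101, 102, 103]
```

The point is only that the output of the seeded 2D selection minus the full simulated set should be what goes forward to any further processing.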

Having said that, I would be cautious with this approach and make sure you know what you are doing. You need to make sure you get rid of the simulated data after the seed-based 2D, and I would not try this using seeds that aren’t generated from the data at hand, for fear of possible template bias. And obviously this will bias the results towards the simulated data, so it is not suitable for separating many species in a heterogeneous mixture, but it is perhaps useful for selecting additional particles matching a rare state identified by prior classification.
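One possible sanity check on that bias, not something from the paper or from CS itself, is to look at how the seeds are distributed over the 2D classes before selecting. The sketch below assumes you can export, per particle, a class index and a flag for whether it is simulated; both inputs and the function name are hypothetical.

```python
from collections import Counter


def seed_fraction_per_class(class_ids, is_simulated):
    """Map each 2D class index to the fraction of its particles that are seeds."""
    total = Counter()
    seeds = Counter()
    for cls, sim in zip(class_ids, is_simulated):
        total[cls] += 1
        if sim:
            seeds[cls] += 1
    return {cls: seeds[cls] / total[cls] for cls in sorted(total)}


# Toy example with three classes.
class_ids = [0, 0, 0, 1, 1, 2, 2, 2, 2]
is_simulated = [True, False, False, False, False, True, True, True, False]
fractions = seed_fraction_per_class(class_ids, is_simulated)
for cls, frac in fractions.items():
    print(f"class {cls}: {frac:.2f} of particles are simulated")
```

A class made up almost entirely of seeds has attracted few real particles, while a class with no seeds was not influenced by them directly. How to act on those numbers is a judgment call for the dataset at hand.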

And I do agree with @kstachowski that for many (most!) cases heterogeneous refinement will probably be more useful, but I wouldn’t entirely discount this approach as one tool in the toolbox for edge cases (small, featureless membrane proteins in particular).
