Reference model based ab-initio reconstruction

momo · May 5, 2021, 5:38pm

Dear cryosparc community:

Is there a way to provide an initial structure (a reference volume) to ab-initio reconstruction? In the ab-initio parameter settings, the description of the number of classes says “Each class will be randomly initialized independently, unless an initial structure was provided, in which case each class will be a random variant of the initial structure”. However, there is no option to input an initial model.

I tried heterogeneous refinement for 3D classification based on an initial model. As input volumes I used multiple copies of the initial model, low-pass filtered to 20 Å, to generate multiple classes of 3D reconstructions. Some classes reconstructed a model close to the input model. I then fed the particles from these classes into a reference-free ab-initio 3D reconstruction job, but it failed to generate the correct model. So I guess we cannot use heterogeneous refinement for 3D classification, because it is very model biased.
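
For reference, the low-pass filtering step can be sketched with plain NumPy. This is only an illustration of the idea, not how cryoSPARC filters volumes internally: it assumes a cubic volume already loaded as an array, the function name `lowpass_volume` and its parameters are made up for this example, and it uses a hard cutoff where real tools typically use a soft edge to reduce ringing.

```python
import numpy as np

def lowpass_volume(vol, pixel_size_A, cutoff_A):
    """Hard low-pass filter of a cubic 3D volume at cutoff_A angstroms.

    Hypothetical helper for illustration; assumes vol has shape (n, n, n).
    """
    n = vol.shape[0]
    # Spatial frequencies along one axis, in 1/angstrom
    freqs = np.fft.fftfreq(n, d=pixel_size_A)
    fz, fy, fx = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    radius = np.sqrt(fx**2 + fy**2 + fz**2)
    # Keep only Fourier components at or below the cutoff frequency
    mask = radius <= 1.0 / cutoff_A
    return np.real(np.fft.ifftn(np.fft.fftn(vol) * mask))
```

Calling `lowpass_volume(vol, pixel_size_A=1.0, cutoff_A=20.0)` would return a copy of `vol` with everything finer than 20 Å removed; the same filtered array can then be saved several times to serve as identical starting references.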

Does anyone have any clue about reference model based 3D classification in cryoSPARC?

Thank you,
MM

olibclarke · May 5, 2021, 5:42pm

There used to be in cryoSPARC v0, IIRC, but it was never ported to v2 or v3.

ArmandoPach · May 5, 2021, 7:24pm

Hi MM,

I was actually in your spot not so long ago.

From my experience (I work on quite heterogeneous samples that have a very stable core and a lot of flexibility in the peripheral regions), ab-initio just gives one good initial model that is pretty accurate in the rigid parts but chops off the flexible parts, no matter how many classes or layers you ask for or run.

I tried the heterogeneous refinement approach and managed to get many different conformations with the flexible parts present. Of course, I did this using an initial model that I knew closely resembled the structures I was going to get.

If you know at least the general structure of your sample, I would recommend going with the heterogeneous refinement approach. It’s pretty good, and in my case it matched previously published results.

Best,
Armando

momo · May 5, 2021, 7:36pm

Hi @olibclarke and @ArmandoPach,

Thanks for your clarification and suggestion! I have a reasonable initial model at ~6.3 Å after multiple rounds of ab-initio reconstruction followed by homogeneous refinement and non-uniform refinement. I ended up with only ~13.5k particles, so I need to pick more good particles to improve the resolution. I suspect some good particles are lost during the 2D classification step. Template-based particle picking using 2D projections of the model didn’t work well for my relatively small particles with C1 symmetry; it picked a lot of junk particles. In some 2D views it is very hard to distinguish the intact protein complex from broken particles. 3D classification would help to select good particles that are in rare 2D views. Is it possible that reference-free ab-initio reconstruction loses some good particles to bad classes? That’s why I wonder whether 3D classification based on an initial model would help maximize the selection of good particles.

The way I checked whether the classification done by heterogeneous refinement was correct was to take the particles from the classes that resembled the initial model in the hetero refine job and input them into an ab-initio 3D reconstruction job. Truly good particles should reconstruct a 3D model even without an initial structure, but that didn’t work in my case: the “good” particles selected by the hetero refine job failed to reconstruct a correct 3D model in the subsequent ab-initio test.

Thanks,
MM

ArmandoPach · May 5, 2021, 9:14pm

Hi MM,

I might die on this hill, but I am a strong advocate of doing the heavy particle cleaning using 3D classification.

What I do is this: after particle cleaning I run a 2D job to remove the very obvious junk. After that I input everything into a heterogeneous refinement with an initial volume that resembles my structure(s) (the same one I will use for the actual 3D classification), asking for 3 classes. I then ask for 3 more classes from each class (using the same initial volume). I grab all the particles from classes that don’t show streaking artifacts in the visual part of the job in CS (of course I also visually inspect them; seeing stretching here is not a bad sign) and then input everything into a 2D job. I then select the good classes (the ones that have observable high-resolution details) and start my 3D classification.
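
The bookkeeping behind the two rounds (3 classes, then 3 sub-classes of each, so 9 sub-classes in total) can be sketched as a simple pooling step. This is a toy illustration in plain Python, not cryoSPARC's API; in cryoSPARC itself this is done by connecting the particle outputs of the chosen classes to the next job. The function name, the `(round1_class, round2_class)` labels and the particle IDs are all hypothetical.

```python
def pool_kept_particles(subclasses, keep):
    """Pool particle IDs from the sub-classes that passed visual inspection.

    subclasses: dict mapping (round1_class, round2_class) -> set of particle IDs
    keep: iterable of (round1_class, round2_class) labels judged free of streaking
    """
    pooled = set()
    for label in keep:
        # Union, so a particle is never counted twice
        pooled |= subclasses[label]
    return pooled
```

The pooled set is what would then go into the follow-up 2D job.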

Truth be told, if it weren’t for CryoSPARC’s speed I wouldn’t be able to do this in a reasonable amount of time (it might not sound like it, but it’s quite fast!). This approach has led me to retrieve WAY more particles than the traditional approach (in one case I got almost 100k more; I had 200k initially), and it gives you a higher chance of snatching those rare views.

Have you taken the outputs of your test and put them into a homogeneous refinement? In reality, once the parameters that permit high resolution come into play, both of your routes might end up the same, or at least quite similar (because you normally low-pass filter your input structures).

With some samples ab-initio might not do as good a job as with others (in my case it only gave me 1 class with cut-off densities), so I prefer the route of inputting an initial volume. As long as your expected structure is similar, or even downstream-related, to your sample (as in my case), I would highly recommend going with the heterogeneous refinement route.

All the best,
Armando

momo · May 10, 2021, 4:35pm

Dear Armando,

I apologize for my delayed response. Thanks so much for your advice! It is very helpful, especially for me as a beginner in cryoEM and cryoSPARC. During the past several months I collected several datasets on a Krios, but I’m still struggling to get a relatively high-resolution structure. Part of the reason is that my intact complex falls apart during freezing, although cross-linking helps to keep more of the complex intact during sample prep. So I have to separate the broken particles from the intact particles, which share similar 2D views from certain angles. I want to extract as many intact/complete particles from these datasets as possible.

Could you please clarify some points that I didn’t fully understand?

“What I do is that after particle cleaning I run a 2D job to clean the very obvious junk”
Does particle cleaning here mean the curation of exposures/micrographs after CTF estimation or particle picking?

“after that I input everything into a heterogeneous refinement with an initial volume that resembles my structure(s) (same as I will use for the actual 3D classification) asking for 3 classes”
Does actual 3D classification here mean 3D classification in other software that allows a reference structure as input?

“I then select the good classes (the ones that have observable high resolution details) and start my 3D classification”
Does 3D classification here mean ab-initio reconstruction in cryoSPARC?

“Have you used those outputs of your test and put them into a homogeneous refinement?”
No, I didn’t do homogeneous refinement after heterogeneous refinement (3D classification with an initial volume low-pass filtered to 20-30 Å). I only did ab-initio after heterogeneous refinement and saw that ab-initio failed to reproduce the initial volume.

One more question about inspecting the output volumes from heterogeneous refinement: do you open the three volumes together with your initial volume in Chimera, align them all to the initial volume to see which one is closest to it, and then select the best one for the next round of heterogeneous refinement, until the differences among the 3 volumes are negligible?
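
As a rough numerical complement to the visual comparison, "closest to the initial volume" could be scored with a real-space correlation. The sketch below assumes all volumes are already aligned and sampled on the same grid (same box and pixel size) and loaded as NumPy arrays; `volume_correlation` and `rank_classes` are hypothetical names for this example, not functions from cryoSPARC or Chimera.

```python
import numpy as np

def volume_correlation(vol_a, vol_b):
    """Pearson correlation between two volumes on the same grid."""
    a = vol_a.ravel() - vol_a.mean()
    b = vol_b.ravel() - vol_b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Guard against a flat (constant) volume
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_classes(reference, class_volumes):
    """Return (class_index, correlation) pairs, best match first."""
    scores = [(i, volume_correlation(reference, v))
              for i, v in enumerate(class_volumes)]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

A single correlation value says nothing about where two maps differ, so it would complement looking at the maps rather than replace it.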

Sincerely,
MM

txgrad3000 · July 1, 2021, 6:36pm

Wow, great comment here. Just wondering, do you ever change the parameters for the heterogeneous refinement?

Thanks,
P

olibclarke · July 1, 2021, 8:48pm

Increasing batch size can be very helpful - 1000 per class may not be enough for very small or heterogeneous particles. I sometimes increase it up to 20k and get improved results.
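
To put those numbers in perspective, here is a back-of-the-envelope sketch of how much of a dataset each iteration would see. It assumes the total batch is simply the number of classes times the batch size per class, capped at the dataset size; that is an assumption made for illustration rather than a statement of how cryoSPARC schedules batches, and the function name is hypothetical.

```python
def particles_per_iteration(n_particles, n_classes, batch_per_class):
    """Particles used per iteration and the fraction of the dataset they cover.

    Assumes total batch = n_classes * batch_per_class, capped at n_particles.
    """
    batch = min(n_classes * batch_per_class, n_particles)
    return batch, batch / n_particles
```

Under that assumption, 3 classes at 1000 per class on a 200k-particle stack would use 3000 particles (1.5%) per iteration, while 20k per class would use 60k (30%).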