Tips for heterogeneous refinement with small, low SNR particles

Cameron · November 12, 2024, 6:32pm

I’ve found a ton of really useful tips on here for processing small particles in 2D and ab-initio jobs (thank you!). I’m working on a case right now where I am trying to use heterogeneous refinement to separate my true particles from junk for a ~75 kDa protein, using a semi-reasonable ab-initio class and three “junk” classes. The protein has fairly strong features in the top view, but not in the side views. I’m worried I am throwing out side views because of this. Any tips on how to set up the heterogeneous refinement to maximize the precision in cases like this?

Das · November 13, 2024, 2:02pm

Hi,
A 75 kDa protein, especially if it has C1 symmetry, will be challenging to process. I would recommend first choosing the right box size for particle extraction. Then use the CryoSPARC 2D classification protocol for small particles, and increase the iterations to 60-80 depending on the number of particles. Make 4-5 ab-initio classes with the resolution set to 8-12 Å. Take your most probable class to NU refinement.
Next with the refined volume as a input as one volume and rest 4 from Ab initio output, carry out a heterogeneous refinement with all particles used in 2D classification (not only those from Select 2D), and repeat until there is no improvement in your main volume. This will recover the other views that we may miss while selecting 2D classes. Then finally do NU refinement, followed by NU refinement with defocus and CTF refinement, and finally sharpen with either CryoSPARC or DeepEMhancer. You can also take the half maps directly from the final refinement to Phenix map sharpening.
Hope this helps.
Best

cbeck · November 13, 2024, 7:25pm

Hi Cameron,

When I do heterogeneous refinement, I sometimes find that the “good” class still looks noisy or has some overfitting artifacts. After each round of heterogeneous refinement, it can help to redo ab-initio to get a better quality map for the next round of classification. When the junk classes start to account for <5% of the particles, I find that it’s usually worth redoing 2D classification with 100-200 classes, 40 O-EM iterations, 5 final iterations, and a batch size of 200. Since most of the junk has been removed, I usually get more diverse classes of the protein, making it a lot easier to identify rare views. Hope this helps!

Best,
cbeck
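
A minimal sketch of the <5% rule of thumb from the post above, assuming you have copied the per-class particle counts out of a heterogeneous refinement job by hand. The function names, the class labels, and the example counts are all hypothetical, and nothing here calls CryoSPARC itself.

```python
def junk_fraction(class_counts, good_classes):
    """Return the fraction of particles assigned to classes not listed as good."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("no particles in any class")
    junk = sum(n for name, n in class_counts.items() if name not in good_classes)
    return junk / total


def ready_for_2d_cleanup(class_counts, good_classes, threshold=0.05):
    """True once the junk classes hold less than `threshold` of all particles."""
    return junk_fraction(class_counts, good_classes) < threshold


# Made-up counts for illustration only (one good class, three junk classes).
counts = {"good": 96_500, "junk_1": 1_500, "junk_2": 1_200, "junk_3": 800}
print(junk_fraction(counts, {"good"}))
print(ready_for_2d_cleanup(counts, {"good"}))
```

The 0.05 default mirrors the <5% figure in the post; it is a rule of thumb, not a hard cutoff, so treat the result as a prompt to look at the classes rather than a decision.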

giax · February 17, 2025, 2:39pm

Hi,

I stumbled on this interesting post with nice tips.

I also have a small particle: 150 kDa, monomeric and flexible.

When you say “Next with the refined volume as a input as one volume and rest 4 from Ab initio output”, I have some difficulty understanding what I should do…

Should I refine each ab-initio volume against all particles and take the best one?
But how do I know which is really the best one? Resolution, number of particles, orientations?
Sorry for the naive questions, I am still a beginner in cryo-EM.

Thanks in advance and best regards,

GIA

Das · February 18, 2025, 4:24am

Hi GIA,

Let’s say I have performed ab-initio with five classes. One of them will be my most probable volume, while the other four will be junk (though not always!). I take the most probable volume to NU/homogeneous refinement (1).

In the next step, I refine each of the junk volumes individually (2-5). Then, I proceed with heterogeneous refinement, where I input the particles and the refined volume from my most probable class (1) along with the four junk volumes (2-5).

This process can be repeated until there is no further improvement in the resolution of your volume (1). Throughout this process, you should observe that your most probable volume becomes cleaner and increases in resolution, while the junk particles gradually decrease over multiple rounds against the featureless volumes (2-5). At that point, you can stop and proceed with refining your most probable volume. Once you refine and do global/local CTF refinement, this becomes your consensus refinement. From here you can proceed with 3D classification/3DVA/3D Flex etc., according to the problem you have.

I hope this clarifies the approach.
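
The "repeat until there is no further improvement" rule above can be sketched as a small helper. This is only an illustration: it assumes you note down the reported resolution of volume (1) after each round by hand, and the function name, the 0.1 Å minimum gain, and the example numbers are all hypothetical choices, not CryoSPARC defaults.

```python
def round_to_stop_at(resolutions_A, min_gain_A=0.1):
    """Return the index of the first round whose gain falls below min_gain_A.

    resolutions_A: resolution of the main volume (in angstroms, lower is
    better) recorded after each heterogeneous refinement round, in order.
    Returns None if every round so far still improved by at least min_gain_A.
    """
    for i in range(1, len(resolutions_A)):
        gain = resolutions_A[i - 1] - resolutions_A[i]
        if gain < min_gain_A:
            return i
    return None  # still improving, so run another round


# Made-up resolutions for illustration only.
history = [7.8, 6.9, 6.4, 6.35]
print(round_to_stop_at(history))
```

Resolution alone can mislead for small particles, so it is worth also checking by eye that the map looks cleaner and that the junk classes are shrinking, as described above.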

giax · February 18, 2025, 10:16am

Hi,

Thanks a lot for the reply.

Yes, it looks much clearer now. So, to summarize:

I perform ab-initio with 5 volumes.
I refine each ab-initio volume individually
I re-input everything into heterogeneous refinement.

The only problem is that in my case my 5 ab-initio classes all look similar… How can I tell which is the most probable/best one?

GIA

Das · February 19, 2025, 3:01am

Hi,
Inspect all the ab-initio volumes, and if they all look similar, then there may be no need for heterogeneous refinement. If you still aren’t sure, you can try stopping the ab-initio job in the first few iterations, marking the job as complete, and then taking the junk volumes into heterogeneous refinement against your best-looking class.
Not all proteins will have heterogeneity.
In that case, proceed to consensus refinement and perform 3D classification.

Cameron · March 1, 2025, 12:36am

Update: In my case the heterogeneous refinements benefitted (a bit) from increasing the batch size to 5000, turning on force hard classification, and using a spherical mask.

wangyan16 · August 13, 2025, 6:23am

Hi Cbeck,

Thanks for sharing. When you redo ab-initio and heterogeneous refinement, do you keep only the “good” class? For example, after the 1st round of heterogeneous refinement, I would use the best-class particles to redo ab-initio, and then perform a 2nd round of heterogeneous refinement with the new ab-initio maps, using the best-class particles from the 1st round. Is this correct?

Thank you!

YW