3D refinement quality worse than ab-initio

Hi all,

I’m struggling with a novel problem (for me) where my protein density map looks nicer in ab-initio compared to any 3D refinements.

My troublesome dataset is of a ring-shaped protein, ~80 kDa in size, with a bound ligand and surface glycans present. I collected roughly 19,000 micrographs on a Titan Krios G3i equipped with a Gatan K3 BioQuantum detector (pixel size 0.51 Å/px, dose 60 e/Å², defocus -0.4 to -1.6 µm). The standard pre-processing pipeline looked fine - denoising showed good particle density, CTF estimation gave no surprises and particle picking found a few hundred particles per micrograph (blob picker and template picker gave similar results of about 7M picked particles).

I extracted 5.6M particles with a 384 px box size, Fourier cropped down to 192 px. I didn’t spend much time on 2D classification as I like to clean up most trash in the following 3D steps, but I could already start to see secondary structure features, which was promising. However, I noticed that even in the best-looking 2D classes the background was quite noisy, showing up as grey streaks, and there seemed to be a load of the classic “honeycomb” overfitting in many classes. Anyway, after some 2D cleaning, ab-initio showed a sensible class with a good particle distribution, which converged after ~300K particles.
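For reference, the sampling implied by these extraction settings can be worked out with a quick sketch. It assumes that 0.51 is the pixel size in Å/px of the extracted, uncropped particles; the variable names are only illustrative.

```python
# Assumption: 0.51 is the pixel size (Å/px) of the uncropped particle images.
raw_pixel_size_A = 0.51
extract_box_px = 384
cropped_box_px = 192

# Fourier cropping keeps the same physical box but with fewer, larger pixels.
binning_factor = extract_box_px / cropped_box_px
effective_pixel_size_A = raw_pixel_size_A * binning_factor

# Nyquist limit: the finest resolution the cropped images can represent.
nyquist_A = 2 * effective_pixel_size_A

# Physical edge length of the box, unchanged by Fourier cropping.
box_extent_A = extract_box_px * raw_pixel_size_A

print(f"binning factor:       {binning_factor:.1f}x")
print(f"effective pixel size: {effective_pixel_size_A:.2f} Å/px")
print(f"Nyquist limit:        {nyquist_A:.2f} Å")
print(f"box extent:           {box_extent_A:.1f} Å")
```

Under that assumption the cropped Nyquist limit is about 2 Å, so the Fourier cropping by itself would not be expected to hold a map at 6-7 Å.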


Next, for heterogeneous refinement, I generated 4 dummy volumes from my excluded 2D classes and kept my single ab-initio class for a 5-class run. The junk fell into the 4 dummy classes while the nicest particles remained in the single nice class. I repeated this 3 times until I was left with ~400K of the nicest particles. This is where things started to go downhill - none of the Het-refinements really improved the quality of the best class, and the resolution seldom tickled 6-7 Å.

Performing NU-refinement on the “good” particles produced similar results, and I believe the corrected GSFSC curve is running into the phase randomisation threshold.

In the Het-refinement steps, I increased batch size per class to 5000 which didn’t make much of a difference. For NU-refinement, I’ve tried to mask with both tight and loose masks while also turning off dynamic masking to no avail. Also increasing the number of extra final passes to 5, ignoring tilt, trefoil and tetra while setting fit trefoil and fit tilt to false made no improvements. I’m running out of ideas of how to troubleshoot further - have I missed something glaring in one of the steps? Nothing in the raw data seems to suggest the data are really this poor in quality. I’m happy to provide any more info if needed and I’d really appreciate any suggestions, thank you!

Kind regards,

Kuba

Hi Kuba,

As you mentioned: “Next, for heterogeneous refinement, I generated 4 dummy volumes from my excluded 2D classes and kept my single ab-initio class for a 5-class run. The junk fell into the 4 dummy classes while the nicest particles remained in the single nice class. I repeated this 3 times until I was left with ~400K of the nicest particles.”

May I ask how you ran the 3 rounds of heterogeneous refinement? Which particle set was used as the input for each round?

Have you tried multi-ab initio?

Cheers,

C

Hey Kuba,

Do you think it’s possible that you’re running too many heterogeneous refinement steps? Based on your “Typical Non-uniform refinement result” figure, your cFAR score is 0.17, which likely suggests a preferred orientation.

I’ve previously been told that if you perform multiple heterogeneous refinement steps sequentially, after removing all junk particles, the algorithm may eventually begin sorting classes based on orientation rather than structural differences; for example, class00 may contain mostly top views while class01 contains mostly side views.
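One way to check for this is to compare the viewing-direction histograms of two classes. The following is a minimal sketch in plain Python, not a cryoSPARC function: it assumes you have already exported per-particle elevation angles (in degrees, -90 to 90) for each class, and that the ring axis lies along z so top views sit near ±90°. The function names and the toy angle lists are hypothetical.

```python
from collections import Counter


def view_fractions(elevations_deg, bin_width=30):
    """Fraction of particles in each elevation bin over -90..90 degrees."""
    n_bins = 180 // bin_width
    # Clamp so that exactly +90 degrees falls into the last bin.
    counts = Counter(
        min(int((e + 90) // bin_width), n_bins - 1) for e in elevations_deg
    )
    total = len(elevations_deg)
    return [counts.get(i, 0) / total for i in range(n_bins)]


def overlap(p, q):
    """Histogram intersection: 1.0 = same view distribution, 0.0 = disjoint."""
    return sum(min(a, b) for a, b in zip(p, q))


# Toy angles for illustration only (not from the dataset in this thread).
class00 = [80, 85, 75, 88]   # mostly top views
class01 = [0, 5, -10, 10]    # mostly side views

score = overlap(view_fractions(class00), view_fractions(class01))
print(f"view overlap between classes: {score:.2f}")
```

An overlap close to 0 between two classes that otherwise look like the same structure would hint that the classification is splitting on orientation rather than on structural differences.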

Did you notice any difference in the quality of the non-uniform refinement result when you performed it on the “best class” from the very first round of heterogeneous refinement?

Best,
Wil

Hi Kuba,

If I had to guess, I think your particle stack is still too contaminated by non-particles. I’ve yet to see a protein that shows even sampling across almost all viewing angles like your viewing distribution shows. Many of your 2D classes look like just noise to me. I’d go back to 2D classification and do more particle curation and see if that helps. You’re right to be wary of classes that show that honeycomb pattern; there’s likely still quite a bit of random noise included in those classes.

Hi Kuba. This looks promising and probably just needs further cleaning to remove junk and non-centred picks.

Your 2D classes appear to have quite a lot of particles per class, 40 to 70 thousand, so I wouldn’t be surprised if there are still bad picks in those classes. I would recommend doing another round or two of 2D classification with “Force max over poses/shifts” turned off, 200 classes, 40 online-EM iterations, 3 full iterations, and 200 for “batchsize per class”. A tighter mask may also help with neighbouring particles.
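As a rough sketch of what these settings imply, assuming (as I understand the online-EM stage) that each iteration draws roughly the number of classes times the batchsize per class. The dictionary keys below are descriptive labels for this example, not cryoSPARC parameter names.

```python
# Suggested 2D classification settings from this post.
# Keys are illustrative labels, NOT actual cryoSPARC parameter identifiers.
settings = {
    "force_max_over_poses_shifts": False,
    "num_classes": 200,
    "online_em_iterations": 40,
    "full_iterations": 3,
    "batchsize_per_class": 200,
}

# Assumption: particles used per online-EM iteration = classes * batchsize per class.
batch_per_iteration = settings["num_classes"] * settings["batchsize_per_class"]

# Total particle draws over the online-EM stage (particles may be drawn repeatedly).
online_particle_draws = batch_per_iteration * settings["online_em_iterations"]

print(f"particles per online-EM iteration: {batch_per_iteration:,}")
print(f"particle draws over online-EM:     {online_particle_draws:,}")
```

The full iterations at the end then pass over the whole particle stack, so the larger batch and extra iterations mainly give each class more data to settle on before that final assignment.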

Good luck!

Hi C,

Yeah good question - so for the iterative het-ref cycles, I would take the particles and volume from the single best class and use this along with another 3 or 4 different junk classes in the next het-ref job.

I did try a 5-class ab-initio, but it really just gave me very similar classes of my ring-like protein. Would you recommend more ab-initio classes? Thanks for the help!

Hi Salmen,

Thanks for the help! I’ve not had anyone explain multi het-ref to me after cleaning junk like this, but I can see how that can happen for sure, thanks. That being said, my particle distribution plot looks suspiciously good for particles that may contain no junk.

I actually haven’t compared the NU-ref results of, say, the 1st vs. 3rd het-ref run - that’s a great suggestion. I’ll go and have a look, thank you!

Hi @rabdella @TMcCorvie

Thank you both for the tips - I agree, I have never seen my particle angle distribution plots looking so nice! That should have been a giveaway for the presence of junk. I’ll go back and run a few cycles of 2D cleaning with those parameters applied. Thanks again!

Yeah this was absolutely the fix - a couple rounds with these parameters really sharpened up my density maps in further refinement rounds! Thank you so much @TMcCorvie and @rabdella - this is a good lesson, for me at least, that I shouldn’t rely too heavily on 3D jobs for junk clean up.

Cheers!
Kuba
