Streaky ab initio model

platorre · September 17, 2023, 8:28pm

Hi,

I am having trouble generating an ab initio model. As the iterations progress, the model looks more and more streaky, which affects the subsequent refinement steps:

Checkpoint 7:

Checkpoint 15:

The complex is a 100 kDa heterodimer which we believe has some flexibility. 2D classes show definition of the different domains. This is a 60K particle stack after extensive 2D/3D classifications. Apix = 0.83 Å/px. My settings are:
- 1 class
- Initial/final resolutions: 4/8 Å
- Initial batch: 1000
- Final batch: 1500

The rest of the options are unchanged. With initial models in the 7/9 Å range and binned particles, this effect does not happen, but information is lost.
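To put a number on the information lost by binning, here is a minimal sketch of the Nyquist arithmetic. Only the 0.83 Å pixel size comes from the post; the binning factors and the function name `nyquist_resolution` are hypothetical and chosen for illustration.

```python
# Toy arithmetic: how binning (Fourier cropping) changes the resolution limit.
# The 0.83 Å pixel size is from the post; the binning factors are hypothetical.

def nyquist_resolution(pixel_size_angstrom: float, binning: float = 1.0) -> float:
    """Return the Nyquist limit in Å, which is twice the effective pixel size."""
    if pixel_size_angstrom <= 0 or binning <= 0:
        raise ValueError("pixel size and binning must be positive")
    return 2.0 * pixel_size_angstrom * binning

APIX = 0.83  # Å/px
for binning in (1, 2, 4):
    limit = nyquist_resolution(APIX, binning)
    print(f"bin {binning}: {APIX * binning:.2f} Å/px, Nyquist {limit:.2f} Å")
```

With these assumed factors, binning by 4 caps the data at about 6.6 Å, so any detail finer than that cannot be recovered from the binned stack.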

I would be very thankful if someone can provide any advice/experience to overcome this situation.

Pedro.

DanielAsarnow · September 18, 2023, 7:08am

Does the initial model look like the 2D averages from different directions?

The streaks are a common symptom of overfitting, which is somewhat expected, as ab initio does not have a half-set-based overfitting control. Based on the images, I don’t think there would be a significant difference between these iterations when used as an initial model after low-pass filtering.

platorre · September 18, 2023, 11:48am

Hi @DanielAsarnow,

2D classes look clean and they are representative of the 3D model. Is there any way to improve the overfitting situation by tuning the ab-initio job?

Thanks!

DanielAsarnow · September 18, 2023, 6:03pm

Why would you need to? If the model looks roughly correct, you can use it for heterogeneous refinement, etc., and the initial low-pass filter will get rid of the high-frequency noise/streaks.

You can also use the earlier iteration that looks smooth - it seems to be the same structure as the later one with the streaks.
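To illustrate why a low-pass filter makes the streaks irrelevant for an initial model, here is a one-dimensional toy sketch using only the Python standard library. It is not how CryoSPARC filters volumes: the function `lowpass_1d`, the 20 Å cutoff, and the synthetic profile are all assumptions made for the example. A real volume would be filtered in 3D with an FFT library.

```python
import cmath
import math

def lowpass_1d(signal, pixel_size, cutoff_resolution):
    """Zero every Fourier component finer than cutoff_resolution (in Å)."""
    n = len(signal)
    # Forward DFT (O(n^2), which is fine for a short toy profile).
    spectrum = [
        sum(signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Spatial frequency in 1/Å; min(k, n - k) folds the negative frequencies.
        freq = min(k, n - k) / (n * pixel_size)
        if freq > 1.0 / cutoff_resolution:
            spectrum[k] = 0
    # Inverse DFT; the imaginary part is numerical noise for a real input.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * x / n) for k in range(n)) / n).real
        for x in range(n)
    ]

APIX = 0.83  # Å/px, from the original post
N = 64
# Smooth "structure" (period 32 px = 26.56 Å) plus a fine "streak" (period 4 px = 3.32 Å).
smooth = [math.cos(2 * math.pi * 2 * x / N) for x in range(N)]
streak = [0.5 * math.cos(2 * math.pi * 16 * x / N) for x in range(N)]
noisy = [s + t for s, t in zip(smooth, streak)]

filtered = lowpass_1d(noisy, APIX, cutoff_resolution=20.0)
residual = max(abs(f - s) for f, s in zip(filtered, smooth))
print(f"max deviation from the smooth profile after filtering: {residual:.2e}")
```

In this toy case the filtered profile matches the smooth one to within floating-point error, which is the sense in which the smooth and streaky checkpoints would behave the same once low-pass filtered.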

platorre · September 19, 2023, 2:24am

Thanks for the advice. I will try that way then.

MLiziczai · September 19, 2023, 9:43am

Do you have a preferred orientation issue? You could run 2-3 classes; maybe you still have some junk among the 60k particles. Then sort the particles with heterogeneous refinement, as mentioned by DanielAsarnow. Otherwise, I also don’t see an issue with it.

platorre · September 20, 2023, 1:05pm

After some rounds of classification, it seems that the issue is solved. Now I am trying to resolve the complex to high resolution, but it gets stuck around 6 Å, giving very dotty maps with fragmented helices. I would like to ask if you have some tips for homogeneous and NU refinement of a small complex. Thanks again!

MLiziczai · September 22, 2023, 10:36am

The dotty/fragmented map sounds to me like overfitting during refinement. It also sounds like you still have not cleaned up your dataset completely. As I am working with a small protein too, I clean up my dataset with 3D classification/heterogeneous refinement (het. ref.) and ab-initio, not so much with 2D classification. You could take more particles and run multiple rounds of extensive het. ref. using multiple references: volumes that are clearly junk, volumes that are less clearly junk/semi-OK, and one good volume. Once you are stuck and losing less than 10% of particles to the bad volumes, you could try running a new ab-initio with multiple classes to see if you truly have a homogeneous sample, and refine the good volumes.
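The stopping rule in the post above (less than 10% of particles going to the bad volumes) can be sketched as a small bookkeeping helper. This is a hypothetical illustration, not a CryoSPARC API: the function names, class labels, and particle counts are all invented, and the counts would in practice be read off the job output by hand.

```python
def junk_fraction(particles_per_class, good_classes):
    """Fraction of particles assigned to classes not listed in good_classes."""
    total = sum(particles_per_class.values())
    if total == 0:
        raise ValueError("no particles in this round")
    kept = sum(n for name, n in particles_per_class.items() if name in good_classes)
    return (total - kept) / total

def should_stop(particles_per_class, good_classes, threshold=0.10):
    """True once a round discards less than `threshold` of its input particles."""
    return junk_fraction(particles_per_class, good_classes) < threshold

# Hypothetical class counts from two successive rounds (not data from this thread).
round_1 = {"good": 60_000, "semi_ok": 25_000, "junk": 15_000}
round_2 = {"good": 56_000, "semi_ok": 2_500, "junk": 1_500}

for label, counts in (("round 1", round_1), ("round 2", round_2)):
    frac = junk_fraction(counts, good_classes={"good"})
    print(f"{label}: {frac:.1%} lost, stop = {should_stop(counts, {'good'})}")
```

In this made-up example the first round discards 40% of its input, so cleaning continues, while the second discards under 7%, which would be the point to switch to a fresh multi-class ab-initio.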

Play around with low-pass filtering (initial resolution) and final resolutions too.

This is at least what I do. I hope it helps.

platorre · September 22, 2023, 12:36pm

Thanks @MLiziczai. It is a very sensible strategy.

rposert · September 29, 2023, 7:38pm

Hi @platorre! I’m just checking in to see how the great suggestions above are working out for you, and if you have any more questions about your maps!

platorre · October 3, 2023, 6:53pm

Hi @rposert,

Thanks for following up on this topic. The main issue was the presence of junk particles. I was not able to remove them by heterogeneous refinement and/or 2D classification. My strategy has therefore been to use a good model and two bad classes for 3D classification of the initial particle stack. Then I moved the particles to RELION to obtain very clean 2D classes with the EM algorithm. With the clean classes, I moved back to CryoSPARC to continue with ab-initio and heterogeneous refinement. So far, this has improved the quality of the reconstructions and has minimized the streaky ab-initio features that I mentioned when I opened this thread.