Worth exploring data set further?

DarioSB · August 21, 2024, 1:18pm

Dear all,
I have a data set of a 120 kDa protein and would like your advice on whether it is worth analyzing further or whether I should rather collect a new data set:
1734 movies at 0.85 Å/px in counting mode, patch motion correction, patch CTF estimation
1341 micrographs after exposure curation
2D classification of 1 million template-picked particles
2D classification of 120k particles from the selected promising classes

Apart from the orientation bias, which is obviously a problem, there is only one class (the second one, with 7879 particles) that I consider to be of sufficient quality. All the others look bad to me.

Would you try to analyze this further, and if so, how?

Best
Alex

Cameron · August 21, 2024, 8:41pm

This looks pretty good to me! Although your top row is a lot of the same orientation, it looks like you have other views represented in the other rows (albeit not aligned as well). You might want to use all of your reasonable classes (first two rows plus first five classes in the third row) to train the Topaz picker to get more of the rare views. Then carefully pick a few reasonably well-resolved 2D classes with different views for ab-initio reconstruction, followed by iterative heterogeneous refinement of all of your particles against your ab-initio model plus 3-4 noise volumes to clean up your particle stack, as has been outlined elsewhere in this forum.

Curious what others would try as well

carlos · August 22, 2024, 10:53am

I agree with Cameron, this doesn’t look bad at all. Try to be less picky with 2D classification, and rely instead on 3D-based alignments and classifications to separate your particles. Obviously, if your goal is to achieve very high resolution, or to understand flexibility with 3DVA+3DFlex, you might want to collect more data - to be merged with this data set, not to replace it. But first you need to know how far you can go with this one, and do a few tests (for instance, how much resolution do you lose if you use only half the particles?). Then you’ll know how many new images you’ll need. But even then, it depends on the biological question you’re trying to answer. In many cases low-resolution maps already give interesting answers.
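
As a rough illustration of the half-particles test, here is a minimal sketch that extrapolates from two refinements to a target resolution. It assumes the commonly used Rosenthal-Henderson relationship, in which ln(N) is approximately linear in 1/d² (N = particle count, d = resolution in Å) with a slope of B/2. The particle counts and resolutions below are hypothetical placeholders, not values from this data set, and the function names are made up for the example. Two points only define a line; more subsets would give a more trustworthy fit.

```python
import math


def fit_bfactor_line(n1, d1, n2, d2):
    """Fit ln(N) = intercept + slope * (1/d^2) through two (N, d) points.

    N is the particle count, d the reported resolution in Angstrom.
    Under the assumed Rosenthal-Henderson model, slope = B / 2.
    """
    x1, x2 = 1.0 / d1**2, 1.0 / d2**2
    if x1 == x2:
        raise ValueError("the two resolutions must differ to fit a line")
    slope = (math.log(n2) - math.log(n1)) / (x2 - x1)
    intercept = math.log(n1) - slope * x1
    return slope, intercept


def particles_needed(target_d, slope, intercept):
    """Extrapolate the particle count needed to reach target_d (Angstrom)."""
    return math.exp(intercept + slope / target_d**2)


# Hypothetical example values (NOT from this thread's data set):
n_full, d_full = 120_000, 4.0   # full stack and its resolution
n_half, d_half = 60_000, 4.3    # random half of the stack

slope, intercept = fit_bfactor_line(n_half, d_half, n_full, d_full)
b_factor = 2.0 * slope          # estimated B-factor in A^2 under this model
needed = particles_needed(3.5, slope, intercept)

print(f"Estimated B-factor: {b_factor:.0f} A^2")
print(f"Particles needed for 3.5 A: {needed:,.0f}")
print(f"Scale-up relative to current stack: {needed / n_full:.1f}x")
```

If the scale-up factor is modest, collecting a similar amount of additional data and merging it may be enough; if it is very large, the limit is probably something other than particle number (orientation bias, flexibility, ice thickness), and more of the same data is unlikely to help much.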