Signal from adjacent particles messes up 2D and 3D classification

Hello,
I’m working on processing a small engineered particle (6 x 6 nm) that is barrel-shaped, with 8 helices forming the inner barrel and 8 the outer. The micrographs are quite crowded:
J5_non_doseweighted_aligned_micrograph_for_013618227542260032759_foilhole_8687127_data_8686329_8686331_20230510

When extracting particles using the optimal box size, signal from adjacent particles influences 2D classification.

It is a small particle with low SNR. I’ve made some progress with batch size = 200 and number of iterations = 40. Turning "Force max over poses/shifts" off has had a positive impact as well, and I’ve also tried setting the initial classification uncertainty factor to 1.

However, when I take it to Ab Initio 3D reconstruction, many classes try to reconstruct adjacent particles. When this happens, the orientation issue gets amplified.

Classes that do not reconstruct adjacent particles have a much better particle orientation distribution.

Still, due to adjacent particles interfering with 2D and 3D classification, I’ve had to exclude 99% of my data.
Does anyone have good advice on what I can do to prevent adjacent particles from interfering with the classification process?

(I wanted to add many more images for reference, but can’t due to being a new user on the forum)

Many thanks in advance,
Ufuk

Hi Ufuk,

Does your engineered particle have symmetry? If so, you may want to try enforcing the expected symmetry during ab-initio and seeing whether the results make sense (presence of secondary structure elements, quality of the density map after refinement). Quite often symmetric particles will end up flattened out during ab-initio if symmetry is not enforced.

Cheers
Oli

Hi Oli,
The central barrel is symmetrical (C8) but the outer barrel (7 helices, not 8, my mistake in OP) is missing a helix.

Use an inappropriately small box size in the early stages of processing. Return to an appropriately large box after you have removed many particles through classification/motion analysis. Then the quality of your model and the relative infrequency of matched neighbors should make this less of a problem. (A rough sketch of the box-size arithmetic is at the end of this reply.)

Batch sizes up to 1000 are all fine.

With well-picked/centered particles, you can turn off (or minimize) recentering.
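A rough sketch of the box-size arithmetic behind the suggestion above. The pixel size is not given in the thread, so the value here is purely illustrative, as is the 1.5x rule of thumb for the tight box:

```python
# Rough arithmetic for the "small box first, large box later" strategy.
# Assumed values: 60 A particle (from the thread) and a hypothetical
# 1.0 A/px pixel size; 1.5x the particle diameter is just a common
# starting point for a deliberately tight early box.
particle_diameter_A = 60.0
pixel_size_A = 1.0

tight_box_px = round(1.5 * particle_diameter_A / pixel_size_A)  # early, tight box
final_box_px = round(220.0 / pixel_size_A)                      # generous box used later in the thread

print(f"tight early box: ~{tight_box_px} px, final box: ~{final_box_px} px")
```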

Thanks. I’ve in fact started doing this with similar reasoning, but it’s good to read that it’s an idea worth pursuing. What do you mean by motion analysis? Are you referring to a specific job? (I doubt you mean motion correction.)

No, not motion correction. I mean when you’ve separated particles down to a small class that are most similar to each other: not dissimilar in terms of their composition (subunit missing, etc.) and not dissimilar in terms of their conformation (large motions, subdomain motions, high-res differences). For this, I run 3D classification yielding ~30,000-50,000 particles in each class, starting with class similarity 0 (not 0.5) and target resolution >8, and you will likely observe that the classes are distinct with respect to motion. 3DVA in series and cluster mode is another way to analyze this, and 3D Flex as well, though it’s less intuitive. So when you take millions of particles and find the final 50k that are all the same and high resolution, you will notice there are only dozens from each micrograph (as you extract in a large box), and they should suffer less from neighbor-particle noise driving the alignment.


I ran into some issues with this, albeit with a much larger particle:

What worked best in the end was the Topaz picker, to get the centering near perfect at the picking stage, and then heavy masking and restricting movement in 2D. For 3D, what wound up working was enforcing symmetry and also reducing the window radius so that most of the box is not considered during ab initio.

Thanks very much, I’ll give that a try.

Thanks everyone for all your inputs. I’ve been working on this project for a while and managed to get a structure that is 3.2A after auto-tightening of the mask, but I trust the GSFSC without auto-tightening more (3.4A).

It has been a very tricky project. Not only are the particles very small with low SNR, they are also so densely packed that signal from adjacent particles is considered during 2D classification. To make things worse, the particle is strongly preferentially oriented, which means the likelihood of adjacent particles having the same orientation is tremendously increased, leading to 2D classes where the central particle is blurred and the adjacent particles are resolved.

I took your inputs into consideration. I made sure the particles were perfectly centred and first worked with a very small box size (96A for a 60A particle). It was still challenging, but I was able to get good 2D classes. After cleaning the particles, I ran ab initio and heterogeneous refinement with 10 classes. This was important because only one class gave a good reconstruction.



You can see from the graphs how strong the preferential orientation is:
J97_posterior_precision_directional_distribution_iteration_182_class_008

I tried to re-extract at an optimal box size (220A). However, no matter what I did, I could not overcome the problem of adjacent particles’ signal being considered in classifications and reconstructions. I played with a myriad of settings and applied very tight masks, but still the signal from the central particle would be overshadowed by the signal from adjacent particles, so that the central particle lost its features and turned into a blob:


I think the directional distribution precision heatmap substantiates my point. Working with the same particles, it makes no sense that the directional distribution should change significantly compared to the initial model:
J112_posterior_precision_directional_distribution_iteration_003

So I went back to the smaller box size (92A) and tried to improve the map via NU-refinement. After analysing the particles of the successful heterogeneous refinement map via 2D classification, I expanded the particle pool. I also generated a mask with RELION, which helped a lot in the initial NU-refinement steps. Again, I played with a lot of settings. In the end, what helped was:

If you look at the GSFSC after auto-tightening, there is a precipitous drop at 4A. Any ideas why this might be?
J183_fsc_iteration_016_after_fsc_mask_auto_tightening

The non-auto-tightened GSFSC looks much better in comparison:
J183_fsc_iteration_016

This is the state of things. I can’t use a larger box size, and any optimisation during NU-refinement (defocus, CTF, etc.) actually degrades resolution.

Do you have any inputs or tips (short of acquiring new and better data) on how I can make further improvements?

Many thanks for all the help so far and in advance for any future help.

I would look at the other heatmap (particle assignment). The dip in the FSC is well characterized and discussed here frequently, so no need to worry about it. Higher-order aberration corrections won’t work until you have high-resolution reconstructions. Have you tried 3D classification? Not that it’s expected to work particularly well here, but it’s another great tool. Target resolution 8, # classes = total particles/40,000, class similarity 0, convergence criterion 4 rather than 2 (for first attempts).
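As a concrete instance of that rule of thumb (the particle count below is purely illustrative, not from this thread):

```python
# Worked example of "# classes = total particles / 40,000".
total_particles = 1_200_000      # hypothetical cleaned stack size
target_per_class = 40_000        # aim for ~30k-50k particles per class

n_classes = max(2, round(total_particles / target_per_class))
print(f"{n_classes} classes, ~{total_particles // n_classes} particles each; "
      "class similarity 0, target resolution 8, convergence criterion 4")
```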


Have you tried windowing, as described here?

If my understanding is correct, windowing should prevent the neighboring particles from impacting the alignment, but will still consider all the information in the box in Fourier space for reconstruction.
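For intuition, windowing amounts to multiplying each extracted particle image by a soft-edged circular mask before alignment. A minimal numpy sketch of such a window follows (this is not cryoSPARC’s actual code, and the 0.85/0.99 radii are just illustrative defaults):

```python
import numpy as np

def soft_circular_window(box_px: int, inner_frac: float = 0.85,
                         outer_frac: float = 0.99) -> np.ndarray:
    """Soft-edged circular window: 1 inside the inner radius, 0 outside the
    outer radius, cosine falloff in between (radii as fractions of the box radius)."""
    radius = box_px / 2.0
    y, x = np.indices((box_px, box_px))
    r = np.hypot(x - radius + 0.5, y - radius + 0.5) / radius
    ramp = np.clip((outer_frac - r) / (outer_frac - inner_frac), 0.0, 1.0)
    return 0.5 - 0.5 * np.cos(np.pi * ramp)   # raised-cosine edge

# Multiplying each particle by the window suppresses neighbours near the box
# edge during alignment, while the full (unwindowed) box can still be used in
# Fourier space for the reconstruction itself.
# windowed = particle_image * soft_circular_window(particle_image.shape[0])
```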

The dip in the auto-tightened FSC is from the mask edge. You can read about those here:

Specifically, the “Corrected” mask section will describe what’s happening. A large dip can be indicative of a mask that is too tight. In your case, I wouldn’t worry about it too much for now.

One downside to using a tiny box size is that you’re potentially leaving high-resolution information out of the reconstruction, especially in the case of small particles that don’t have great contrast. If you collected data at higher defocus to make them visible, the data actually gets delocalized quite a distance away, as a function of spatial frequency. If you then make the box size too small, that delocalized data will be outside the extracted box and not available for per-particle CTF estimation.
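To put very rough numbers on that delocalisation: information at spatial frequency g is displaced from the particle by roughly λ·Δf·g (ignoring Cs). A quick sketch with purely illustrative values:

```python
# Rough CTF delocalisation estimate: signal at spatial frequency g is
# displaced by ~ lambda * defocus * g (Cs term ignored).
# All values below are illustrative, not taken from this dataset.
wavelength_A = 0.0197      # electron wavelength at 300 kV, in Angstroms
defocus_A = 15_000         # 1.5 um defocus, in Angstroms
resolution_A = 3.5         # spatial frequency of interest, g = 1/resolution

g = 1.0 / resolution_A
delocalisation_A = wavelength_A * defocus_A * g
print(f"~{delocalisation_A:.0f} A of delocalisation at {resolution_A} A")
# ~84 A on each side of the particle here, which a 96 A box around a
# 60 A particle cannot contain.
```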

Another problem that can happen with small box sizes is aliasing. My understanding of aliasing isn’t great (I’d be happy for someone else to chime in), but I believe it occurs when the oscillations of the CTF occur at a higher frequency than can be described by the number of pixels in the box. That means some oscillations will be combined into larger oscillations, and the phase flipping applied based on the originally estimated defocus will be incorrect. This happens at higher resolution; I’m not sure whether it will be a factor at the resolution you’re at, but it also depends on defocus and box size.

There’s a handy calculator here, and you can input your settings and see if there are aberrations in the generated Fourier transform:
https://3dem.github.io/relion/ctf.html
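As a back-of-the-envelope version of what that calculator checks: the CTF phase should change by less than π between adjacent Fourier pixels, which gives a minimum box of roughly λ·Δf/(pixel size)² pixels to avoid aliasing at Nyquist. Illustrative values again:

```python
# Minimum box size (pixels) to sample the CTF without aliasing out to Nyquist:
# N_min ~ lambda * defocus / pixel_size**2  (Cs ignored; rough estimate only).
wavelength_A = 0.0197      # 300 kV
defocus_A = 15_000         # 1.5 um, illustrative
pixel_size_A = 1.0         # illustrative

n_min = wavelength_A * defocus_A / pixel_size_A**2
print(f"need a box of at least ~{n_min:.0f} px to avoid CTF aliasing at Nyquist")
# ~296 px here; in smaller boxes the CTF aliases at the highest frequencies first.
```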

Also, it’s always tough to tell from a surface view what the actual resolution of the map is. Have you docked the model in? That really doesn’t look 3.18 Å to me, from the surface view you posted.


Yes, I did apply windowing and pushed it to be quite stringent. Still, after extracting particles using optimal box sizes, windowing doesn’t seem to help in my case.

I’m aware of the issues that come with small box sizes, i.e. losing high-frequency information at the fringes of the PSFs. And with a small particle like this one, which has an intrinsically low SNR, decreasing the box size below what’s optimal doesn’t do me any favours either. That’s why I did try to increase my box size to the optimal dimensions, but haven’t been successful.

Regarding aliasing, it does have a more profound impact at higher spatial frequencies, which are the first to be affected as the CTF oscillations approach the sampling limit. I think I have a lot of other things to worry about.

We’ve docked the model, and as I said in my post, I also don’t believe that this map is 3.2A.

At this point, I think that with this data (crowded, preferred orientation, small particle), 3.5 to 3.8A is the best I can do.

I wonder… have you tried a local refinement on the low-resolution structure you get out of the larger-box reconstruction, using a mask derived from your higher-resolution structure?

I also wonder whether, if you mask the particle so that only the symmetric portion is being refined and enforce the symmetry, you’ll get better initial alignments. Perhaps you’ve tried that already?