Processing a somewhat biased dataset

TingzhenS · March 29, 2026, 10:27pm

Hi everyone,

I am currently processing a somewhat biased dataset. At first, I followed a standard workflow of 2D classification followed by 3D modeling, and I realized there was a bias in particle orientation. Although I was able to get a model at 3.5 Å (as reported by CryoSPARC), there were apparent streaking patterns in the top view (top of the picture).

The distribution heatmap still seems to show some density from the non-preferred orientations, so I decided to give the orientation rebalancing function a shot. Inspired by a case study, I decided not to rely heavily on 2D classification. The details are as follows:

I used the template picker to pick particles from 3,200 micrographs, followed by two rounds of 2D classification to remove obvious junk (~3,000,000 particles left).

Then I directly built three initial models and ran two rounds of heterogeneous refinement (~1,000,000 particles left in the good class).

I rebalanced the orientations at the 60th percentile (~500,000 particles left).

I then built two initial models from the balanced particle stack, followed by heterogeneous refinement (380,000 good particles left).

I used the good particles for homogeneous and non-uniform refinement and ended up with models at 3.8-3.9 Å. This time, the anisotropy in the top view seems weaker (probably), and I can clearly see features such as alpha-helices (lower left corner) in the rest of my model, which match our hypothesis (bottom of the picture).
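For readers unfamiliar with the rebalancing step above, here is a minimal sketch of what percentile-based orientation rebalancing does conceptually. It is not CryoSPARC's implementation: the function name, the pre-computed viewing-direction bins, and the rule of capping every bin at the chosen percentile of the per-bin counts are assumptions made for illustration.

```python
import numpy as np

def rebalance_by_percentile(bin_ids, percentile, seed=0):
    """Return sorted indices of the particles kept after capping each bin.

    bin_ids: 1D integer array, one (hypothetical) viewing-direction bin per particle.
    percentile: bins holding more particles than this percentile of the
                per-bin counts are randomly subsampled down to that count.
    """
    rng = np.random.default_rng(seed)
    bins, counts = np.unique(bin_ids, return_counts=True)
    # Cap = the requested percentile of the per-bin particle counts.
    cap = max(1, int(round(np.percentile(counts, percentile))))
    keep = []
    for b, n in zip(bins, counts):
        members = np.flatnonzero(bin_ids == b)
        if n > cap:
            # Over-populated view: keep a random subset of size `cap`.
            members = rng.choice(members, size=cap, replace=False)
        keep.append(members)
    return np.sort(np.concatenate(keep))

# Toy data: five bins, one of them a strongly preferred view.
bin_ids = np.repeat([0, 1, 2, 3, 4], [1000, 50, 40, 30, 20])
kept = rebalance_by_percentile(bin_ids, percentile=60)
kept_counts = np.bincount(bin_ids[kept], minlength=5)
print(kept_counts, kept.size)
```

The point of the toy example is that the cap only removes particles from over-populated views. Rare views are kept whole but do not gain particles, so the stack shrinks while the rare-view signal stays the same.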

I am wondering why the resolution is stuck at around 4 Å. I would like to improve the resolution further with my current dataset, if possible. Thanks in advance for any suggestions.

AaronS

carlos · March 30, 2026, 12:58pm

Hi AaronS,

I’m the flexibility-detecting bot of this forum.

What happens if you:

Go back to the ~1 million particle set and make a broad mask around the region that seems to be moving the most (is it the left part in the top-left panel?);
Run 3DVA with that mask, limiting the resolution to 8 or 10 Å;
Check the output in cluster mode, asking for, say, 5 clusters (the shape of the landscape is important to consider);
Refine each cluster individually, either with non-uniform refinement or with local refinement using a mask on the opposite region?
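To make the cluster step above more concrete, here is a rough sketch of splitting particles by their 3DVA latent coordinates. It is a generic k-means on synthetic data, not what CryoSPARC runs internally. The `coords` array stands in for per-particle coordinates along two variability components, and the function name and parameters are assumptions for illustration.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means: returns (labels, centers) for an (n, d) array."""
    rng = np.random.default_rng(seed)
    # Start from k distinct particles chosen at random (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance of every point to every center, shape (n, k).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = points[labels == j].mean(axis=0)
    # Final assignment against the last set of centers.
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centers

# Synthetic stand-in for a 3DVA landscape: two states along component 0.
rng = np.random.default_rng(1)
coords = np.vstack([
    rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(300, 2)),
    rng.normal(loc=[2.0, 0.0], scale=0.5, size=(200, 2)),
])

labels, centers = kmeans(coords, k=5)
sizes = np.bincount(labels, minlength=5)
print(sizes)  # particles per cluster, each refined separately afterwards
```

If the landscape is a continuous smear rather than separated blobs, the clusters are arbitrary cuts along a motion rather than discrete states. This is one reason to look at the shape of the landscape before choosing the number of clusters.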

TingzhenS · March 30, 2026, 1:44pm

Hi, Carlos.

Thanks for the reply. Yes, the left end of the top view is quite flexible. I have run a 3D classification on the whole particle, and it turned out that the region is present in some classes and absent from the rest.

So I think that, in the worst case, we could ignore that tip region and focus only on the rigid part. But the rigid part still looks ragged.

Or did you mean that masking the flexible region would actually improve the overall resolution?

Thanks,

AaronS

carlos · March 30, 2026, 2:05pm

3DVA will focus on the differences within the masked region, and it usually works better than running it without any mask. My hope is that the movement of that region is related to the flexibility of the whole thing, so you might be able to cluster the particles in a resolution-improving way. If I am wrong, then masking the opposite part should help, because you will then be neglecting the tail that moves too independently. If you prefer to play with 3D classification instead, masking the region where you want it to focus might also help. I prefer 3DVA; one of the reasons is that it will show you how many classes to ask for.