Particle distribution is ok but the map is extremely anisotropic

Jiarui · November 19, 2024, 9:07am

Hi everyone,

Thanks in advance for everyone interested in my question.
I have been having troubles to generate a usable map from my particle sets. I speculate that the problem happened during NU refinement. Particularly, the orientation distribution generated by NU refinement didn’t look really bad, but somehow the cFAR value for my map is extremely low and the map is extremely anisotropic from one direction (having those sliced or stretchy features). The dataset is a combination of flat collection and 40-degree tilted collection, so my guess is that the preferred orientation isn’t the real causable factor for the anisotropic map. Could anyone please help with this issue?

J256_cfscs_half_angle_20_iteration_008_with_autotight_mask

!

J256_viewing_direction_distribution_iteration_008

Many thanks!

rbs_sci · November 19, 2024, 9:30am

What do 2D classes look like? How many particles? Likely a lot of junk present (or particles which are actually noise if heavily overpicked, as they align pretty much randomly and can mask preferred orientation for that reason…)

Also try just using the 40° tilted data for a reconstruction?

Jiarui · November 19, 2024, 9:49am

Hi there,

Thank you very much for your input. I don’t have 2D results immediately but am running it, I can follow up with the figure once it is done. The reason I don’t really run it is because I learned from other people that using 2d to run move junk particles may accidentally remove rare views. So I have been using heterogenous refinement or 3DVA to clear junk particles, by including junk volumes or removing particles contributing to anisotropic frames, respectively. One particle set I collected before (not the dataset I showed in main thread, this one includes flat particles and 20-degree tilted particles) that was cleaned by such strategy using 3 rounds of 3DVA. This generated equally bad volume (please see below). 3 rounds of heterogenous refinement on the old dataset also generated comparable results. Initially I thought there might be something wrong with the preferred orientation, but after collecting the current 40 tilted dataset, I think there might be something wrong with the method I’m using.
I will also try to analyze the 40 degree dataset only and see the difference.
Thank you again!
J1217_viewing_direction_distribution_iteration_007
J1217_cfscs_half_angle_20_iteration_007_with_tight_mask

Jiarui · November 19, 2024, 7:21pm

Hi there,

just to follow up where we left, please see the new refined map, corresponding orientation distribution, cFAR value and, 2D class below. The map is still very anisotropic from one direction.

J304_cfscs_half_angle_20_iteration_011_with_autotight_mask

J304_viewing_direction_distribution_iteration_011

Do you have any suggestions on how to improve this? Thank you!

rbs_sci · November 19, 2024, 10:55pm

Well, that’s better… can at least see things which look like main-chain although anisotropy is still severe.

I’d get rid of the low resolution and messy classes, clean up with another couple of rounds of 2D and then pick two or three clearly different orientations to try to generate a good reference/refinement, then add everything back in and use the “Rebalance orientations” job to remove the extremes of one view, then maybe try 3D classification. It’s worked well for me a couple of times (although it’s also not worked so well once as well…)

olibclarke · November 19, 2024, 11:35pm

In addition to the very good suggestions and advice from @rbs_sci so far, I would also consider the possibility of anisotropic flexibility - basically flexibility at an interdomain hinge.

This can also give results similar to what you are seeing. In this case, local refinement with a mask around one half of the molecule may help improve map quality.

I also wonder looking at your 2Ds whether you might have some quite different discrete interdomain orientations - If you use “create templates” starting from your consensus and compare the projections with your 2Ds, are there any 2D classes that are not readily attributable to an orientation of the consensus map?

Jiarui · November 19, 2024, 11:57pm

Thank you very much again for your great suggestions, will test it out!

Jiarui · November 20, 2024, 12:26am

Hi Oli,

Thank you very much for your input. Yes, this is a protein complex having one protein docking into the pocket of another portein through a loop, so flexibility at a hinge is the case. I haven’t tested local refinement but will take a look.
Regarding the views, most of the 2d classes do look like projections of currently finished model. There are a few classes that look differently, but I think they are more likely to be dissociated particles than different conformation, because most of those particles usually are grouped in one class of ab initio that has much smaller size.

Mark-A-Nakasone · November 20, 2024, 1:22am

great advice so far.

If you do “Rebalance 2D classes” https://guide.cryosparc.com/processing-data/all-job-types-in-cryosparc/particle-curation/job-rebalance-2d-classes-beta. Maybe with 8-12 super classes and rebalancing factor = 1 … do all of the super classes look the same ?

If you take these particles for the NU-refine, what % go back to the 0degree movies and what % go back to the 40degree series ? Some of the 2D classes seem to have 2 units, while others have 3.
Your 2D classes have potential. How are you picking ? It could be worth it to take discrete ones for TOPAZ training and see how those look. You may have to train 0deg and 40deg separately for optimal results.

olibclarke · November 20, 2024, 3:10am

Based on this I would definitely have a go at local refinement (and/or 3D classification or 3D-VA to separate different interdomain orientations). The case you are describing sounds very much like interdomain flexibility to me

Jiarui · November 20, 2024, 5:10am

Hi Mark,

Thank you very much for your suggestions!

I just ran the ‘Rebalance 2D classes’ job with 10 super classes and rebalancing factor of 1. Although it is pretty hard to attach all the results here in the thread, I would say that these super classes look pretty different. The first few superclasses contain more templates while the last one contains only 2 templates. Templates within the first few superclasses are also not contributing to the same view, for example, superclass 0, which clearly contains front view and back view.

I also attach the matrix here.
.
This is my first time running this job, could you please provide a few suggestions on how to interpret the outputs of this job? Thank you!
I’m not super sure how I can back trace those particles to their original micrographs, so I don’t have a direct answer to this.
I was using template picker to pick all those particles. Template was created by an anisotropic map generated from previous collection. Doing Topaz is definitely a great idea, but I think recently our school is having trouble running it. I will for sure test it out once it is back to work again.

Thank you very much again for all your great suggestions!

Mark-A-Nakasone · November 20, 2024, 2:58pm

Hi @Jiarui

If you have many less re-balanced particles than for what went in (e.g. 100k going in, 10k coming out) it would suggest some orientation. What I do not see in your super classes are a view 90 degrees to what is there, but I’d agree there is some rotation. Like seeing every side of a hot dog, except the view you’d eat.

To trace the particles back, what if you run the extract job again. Start a new extract particles from micrographs. Input all of the particles from the NU-refine job, input the micrographs from the 40degree tilt only. This should only extract particles from your NU-refine associated with the high tilt micrographs. Maybe another method => Manually curate particles job => input high tilt micrographs + particles from NU-refine. This job also associates particles and micrographs. You just want to see how many particles from the NU-refine are actually in the high tilt. It is a good idea from @rbs_sci | assuming you processed your 0deg and 40deg set with their own patch motion and patch CTF, then combined.

hope this gets you somewhere

Jiarui · November 20, 2024, 10:23pm

Hi Mark,

Thank you again for your follow up.
The side view is definitely rare view, and I can only see them in this dataset which contains 40 degree tilt. These classes were classified in other super classes:
Superclass 1:
J342_templates_belonging_to_superclass_1
Superclass 2:

I do have a question regarding the output of this job - from the tutorial, what I understood is that we should get similar views classified into the same superclass. However, most of my classes look like a mixture of different views, could you please suggest a possibility for this result? And please correct me if I’m wrong, my strategy after this point should be using output from rebalance 2D to reconstruct a new map, which hopefully can get rid of some of the anisotropy, is this correct?
Regarding back-tracing the particle, it was unfortunate that I process both dataset at once using one motcorr and patch CTF. I was not aware of the issues I would possibly encounter, but I will start a new workspace to redo it and see the results. Could you please explain more why solely processing 40 degree tilted dataset might generate better result than combining flat + 40 tilted?
Thank you again for all your help!

DarioSB · November 20, 2024, 10:50pm

What are your settings for the rebalance job?

Jiarui · November 20, 2024, 10:58pm

Hi Dario,

I only changed the two parameters suggested by Mark and kept other things default.

DarioSB · November 20, 2024, 11:04pm

How does it look when you use a lower rebalance factor?

rbs_sci · November 21, 2024, 12:48am

How many iterations are you using, and what Initial classification uncertainty factor are you using? Because that result looks like you’re not giving the classification algorithm time to really sort things out properly. 100 classes may not be enough. The default settings for CryoSPARC 2D classification are way too optimistic for 99% of datasets, particularly as particle count increases. I’d probably try 300-400 classes, uncertainty of 8-10, full iterations of 2-5 and online-EM iterations of [enough to cover your particle set twice].

A modification of @olibclarke’s suggestion regarding masking and local refinement might be worth a shot also - pick one lobe of the protein and pick and classify that. With a tight mask or tight windowing around it, you might catch enough variant views to get a good reconstruction of one half (you’ll have to tight window this step), then do a low-resolution reconstruction (15-20 Ang) and with any luck the other half will have enough density that you can mask and centre that, before local refinement (parameters will certainly have to be experimented with) to get the other half to high(er) resolution. This is messy and may not work, but might be worth a shot if all else fails.

Jiarui · November 21, 2024, 2:19am

Hi @rbs_sci,

Thank you for your follow up. I used 3 full iterations, 40 online-EM iterations, and 600 for batch size per class for the 2d classes I showed above. I definitely forgot to adjust the uncertainty level as my particle set gets cleaner, so thank you for pointing that out! I now also increased the number of classes to 200 so that each class will only have a few thousands of particles for my recent 2d jobs.
I’m also testing the local refinement, will modify accordingly and follow up when I get the results. Let’s see how good it is!

Sincerely,
Jiarui

Jiarui · November 21, 2024, 2:22am

Hi Dario,
I tested two lower rebalancing factor and I think the results look similar to rebalancing factor of 1. Two representative figures here:
Superclass0 for rebalancing factor of 0.5, which is a mixture of front and top views:
J407_templates_belonging_to_superclass_0
Superclass0 for rebalancing factor of 0.1, which is a mixture of front and back views:
J408_templates_belonging_to_superclass_0
Any thoughts on this?

Mark-A-Nakasone · November 21, 2024, 2:27am

Hi @Jiarui.

Great suggestions from everyone, more to try.

to your points in this thread.

I really only use Rebalance 2D Classes with super classes (rebalance factor = 1) to see the obvious, it is a fast job. For very rare views you would have better luck pickup from @rbs_sci i.e. more intense 2D classification jobs.

*you also may have pseudo-symmetry so it is hard to say about the views based on 2D.

I also would check the completeness of the Fourier Sphere in the Orientation Diagnostics Job https://guide.cryosparc.com/processing-data/tutorials-and-case-studies/tutorial-orientation-diagnostics. Is it a sphere or have parts missing. Maybe have a quick read of https://doi.org/10.1016/j.sbi.2024.102918 to see where SPA is at (we are not there yet).

Processing everything together could be an issue or maybe not so bad. I actually use EPU to collect 0deg and high tilt in the same session, but in older versions just run one EPU session for 0deg and another for the tilted.

In many figures I see the 0deg and 40deg processed differently. Import job (includes .xml metadata if using AFIS to make optics groups) => Patch Motion Correction => Patch CTF Estimation. Some people can tweak settings between each data set. Separately processing 0deg and the tilted was always something I did based on the literature and my need to see which particles came from where. I also find curation thresholds are different between the two sets (e.g. I am less strict with my high tilt micrographs).

I could use blob picker, select 2D classes, then I did not mention this before, but I also train TOPAZ 2x: once for 0deg and the other training for tilted. I find particle picking can be slightly different in each. After picking and some 2D classification I combine the particles for multi-class ab initio and go from there. Now the particles in the combined NU-refinement or Homogeneous Refinement can easily be traced back to 0deg or the tilted.