3D classification classes always equally populated

samhaysom · September 8, 2025, 4:49pm

Apologies if this is a duplicate, but I am having general issues (across multiple datasets) with 3D classification where my classes are always equally populated. I am aware that this often reflects continuous heterogeneity, and indeed on my targets (active-state GPCRs), scrolling through the classes shows a rocking and twisting motion between receptor and signalling proteins that seems continuous and can also be visualised by 3DVA. In other cases, however, I see differences between classes that seem to reflect occupancy of a particular component (some have clear density while in others it is more diffuse), yet the class populations are always pretty equal no matter how many classes I request. This seems to indicate I’m just not getting a good separation between classes, but I can’t work out which parameters to modify to get a proper separation between states. Would love for others in the community to give their recommendations on what to try and change!

Also of note: I’m generally trying to classify small-scale movements (e.g. a change in conformation of a helix, appearance/disappearance of a lipid, etc.). I’m therefore wondering if others have examples of classification working with small features, as I’m aware that the tool will struggle at low SNR.

carlos · September 9, 2025, 8:47am

Hi @samhaysom,

I’m afraid I can’t really help you, but maybe 3DVA in clusters mode would be better suited to what you are trying to do? Relatedly, for the CS crew: I wonder if ab-initio, 2D classification, 3D classification and hetero refinement have some function in their algorithms that tries to equalise the number of particles per class (unless clearly not possible)? I often have the impression that this is happening, but most of the time I have continuous motion too, so I might be really biased. It would be nice to know, especially in cases where we are looking for rare species in a sample.

samhaysom · September 9, 2025, 10:03am

Some of the heterogeneity I am experiencing is definitely continuous and can be visualised by 3DVA, but in other cases I have what looks like discrete heterogeneity in my sample (for instance, presence or absence of a small accessory protein), and cryoSPARC still insists on separating particles equally between classes. I then end up with classes that show discrete differences (presence/absence of a component), but as all the classes are equally populated it seems very unlikely to me that the underlying particles are being cleanly classified into an occupied and an unoccupied class. The CS documentation for 3D Classification only seems to provide advice for the opposite issue, where all particles go to one class (see here).

I was just interested in whether others had similar experiences. Among our cryoEM team the general impression is that 3D classification doesn’t work for our use case, while we get much better results from 3D Classification without alignment in RELION. However, it would be nice to be able to use 3D classification, as it’s much faster and would eliminate the need to export and reimport particle stacks.

wtempel · September 9, 2025, 1:23pm

@samhaysom Could you please post:

- the output of the command below (replacing P99, J199 with the actual project and job IDs of the relevant 3D classification job):
  ```
  cryosparcm cli "get_job('P99', 'J199', 'job_type', 'version', 'params_spec', 'input_slot_groups')"
  ```
- screenshots from the event log or UI that illustrate the unexpected particle distribution
- a description of the conflicting hypotheses (expected results), the actual results, and the evidence supporting those hypotheses

olibclarke · September 9, 2025, 1:45pm

It would be worthwhile, I think, to do some experiments with realistic synthetic data where the ground-truth populations are known, to get a handle on this. We have seen some similar cases, but also other cases where realistic population percentages are obtained.
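As a rough illustration of the ground-truth idea, here is a minimal toy sketch in Python. It is an assumption-laden stand-in, not cryoSPARC's algorithm: each "particle" is a short noisy vector, the two states differ in a single pixel, and particles are hard-assigned to the nearest of two fixed, perfect references. The function names, the 80/20 split and all parameter values are hypothetical. Even in this simplified setting, the recovered populations drift from the true split towards 50/50 as the noise grows relative to the feature.

```python
import math
import random


def assigned_fraction_a(n_particles, true_fraction_a, feature_amplitude,
                        noise_sigma, n_pixels=16, seed=0):
    """Fraction of toy particles hard-assigned to state A (toy model only)."""
    rng = random.Random(seed)
    # State A lacks the feature; state B has it in a single pixel.
    ref_a = [0.0] * n_pixels
    ref_b = [feature_amplitude] + [0.0] * (n_pixels - 1)
    n_true_a = round(n_particles * true_fraction_a)
    n_assigned_a = 0
    for i in range(n_particles):
        truth = ref_a if i < n_true_a else ref_b
        # Add independent Gaussian noise to every pixel.
        particle = [v + rng.gauss(0.0, noise_sigma) for v in truth]
        # Hard assignment: pick the reference with the smaller squared distance.
        dist_a = sum((p - r) ** 2 for p, r in zip(particle, ref_a))
        dist_b = sum((p - r) ** 2 for p, r in zip(particle, ref_b))
        if dist_a <= dist_b:
            n_assigned_a += 1
    return n_assigned_a / n_particles


def expected_fraction_a(true_fraction_a, feature_amplitude, noise_sigma):
    """Analytical expectation for the same toy model."""
    # Probability that noise pushes a particle across the midpoint
    # between the two references: Phi(-amplitude / (2 * sigma)).
    z = feature_amplitude / (2.0 * noise_sigma)
    error = 0.5 * math.erfc(z / math.sqrt(2.0))
    return true_fraction_a * (1.0 - error) + (1.0 - true_fraction_a) * error


if __name__ == "__main__":
    for sigma in (0.1, 1.0, 10.0):
        sim = assigned_fraction_a(4000, 0.8, 1.0, sigma)
        ana = expected_fraction_a(0.8, 1.0, sigma)
        print(f"sigma={sigma}: simulated {sim:.3f}, analytical {ana:.3f}")
```

With a true 80/20 split, the analytical expression gives about 0.51 for class A when the noise is ten times the feature amplitude. A real test would of course use simulated particle images run through the actual classification job, since iterative reference updates can behave differently from this fixed-reference toy.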

In terms of classifying small/subtle changes, we have had some success performing classification with a 15-20Å highpass filter (in conjunction with the usual tricks like employing hard classification and tweaking the learning rate).
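To put the 15-20Å figure in context, the short Python sketch below converts a resolution cutoff into a Fourier shell index using the standard relation shell = box size × pixel size / resolution. The box size (256 px) and pixel size (1.0 Å/px) in the usage are hypothetical values chosen only for illustration, and the function names are not part of any cryoSPARC API.

```python
def resolution_to_fourier_shell(resolution_A, box_size_px, pixel_size_A):
    """Fourier shell index (pixels from the origin) for a given resolution in Å."""
    if resolution_A <= 0 or box_size_px <= 0 or pixel_size_A <= 0:
        raise ValueError("all inputs must be positive")
    # Spatial frequency is 1/resolution; one Fourier pixel spans
    # 1 / (box_size_px * pixel_size_A) in units of 1/Å.
    return box_size_px * pixel_size_A / resolution_A


def nyquist_resolution(pixel_size_A):
    """Finest resolution representable at this sampling (2 x pixel size)."""
    return 2.0 * pixel_size_A


if __name__ == "__main__":
    box, apix = 256, 1.0  # hypothetical box size (px) and pixel size (Å/px)
    for cutoff in (20.0, 15.0):
        shell = resolution_to_fourier_shell(cutoff, box, apix)
        print(f"{cutoff} Å highpass -> shell {shell:.1f}")
    nyquist_shell = resolution_to_fourier_shell(nyquist_resolution(apix), box, apix)
    print(f"Nyquist shell: {nyquist_shell:.0f}")
```

Under these assumed values a 20Å cutoff sits at shell 12.8 and Nyquist at shell 128, so the highpass discards only the innermost shells, which carry the overall shape rather than the fine detail.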

Cheers
Oli

samhaysom · September 11, 2025, 10:39am

Thanks for your input, I will give the highpass filter a try! When you say some success, did tweaking these parameters take you from a situation where classes were equally populated to one with a very definite split into better and worse populated classes, or did the cases where this worked start out with a definite split? I’m trying to work out whether, if I see this equal distribution behaviour, it is worth persevering, or whether it is just a sign that 3D classification is not going to work on my problem and I would be better off using RELION classification without alignment or multibody/3DVA.

samhaysom · September 11, 2025, 10:41am

Hi wtempel, unfortunately the nature of these classifications is commercially sensitive. I’ll check what I can share and get back to you (sorry, I’m aware that isn’t very helpful for debugging).

olibclarke · September 11, 2025, 10:56am

Can’t remember the population split, but it took us from a case where we had two states (one of which was essentially uninterpretable at high res) to where both states were clearly separated and interpretable.

The other thing you can try is oversampling the number of classes, then combining the identical classes.
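One possible way to make the "combine identical classes" step less subjective is sketched below in Python, under stated assumptions: each class volume is represented as a flat list of voxel values, similarity is a plain Pearson correlation, and the 0.95 threshold is a hypothetical value. Real maps of the same complex correlate highly overall, so in practice the comparison would likely need a mask around the feature of interest and a tuned threshold, alongside inspection by eye.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length voxel lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a flat volume carries no information to correlate
    return cov / math.sqrt(var_a * var_b)


def merge_similar_classes(volumes, counts, threshold=0.95):
    """Group classes whose volumes correlate at or above `threshold`.

    Returns a list of (member_indices, summed_particle_count) tuples.
    The threshold is a hypothetical default, not a recommended value.
    """
    parent = list(range(len(volumes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(volumes)):
        for j in range(i + 1, len(volumes)):
            if pearson(volumes[i], volumes[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(volumes)):
        groups.setdefault(find(i), []).append(i)
    return [(members, sum(counts[m] for m in members))
            for members in groups.values()]


if __name__ == "__main__":
    # Four toy "classes": 0 and 1 are near-identical, as are 2 and 3.
    toy_volumes = [
        [0, 1, 2, 3, 4, 5.0],
        [0, 1, 2, 3, 4, 5.1],
        [5, 4, 3, 2, 1, 0.0],
        [5, 4, 3, 2, 1, 0.1],
    ]
    print(merge_similar_classes(toy_volumes, [250, 250, 250, 250]))
```

Summing the particle counts of merged groups gives an effective population per distinct state, which can then be compared against what you expect for the sample.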