Number of classes affects results significantly

Why does the number of classes affect results so much? Is there a way to get more consistent results? See below for two very different results on the same stack; the only difference between the two 2D jobs is classes=20 vs. classes=30. Related question: why is the number of particles different even though the same input stack is used? (There are 1,297,609 input particles, but slightly different numbers appear in the job summary.)

This is because when you go to 20 classes or fewer, cryoSPARC automatically switches off "Force Max over poses/shifts". I understand the reason for this but agree it is confusing, particularly as the number of iterations is not changed to match (it usually takes longer to converge with this parameter switched off).

Hi @August,

Seconding @olibclarke’s reply regarding the activation of marginalization for class numbers ≤ 20. This automatic switch can be overridden by setting the parameter’s value manually.
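To illustrate the distinction in general terms, here is a minimal conceptual sketch of "max over poses" versus marginalization for a single particle. It is not CryoSPARC code: the function name `pose_weights`, the `force_max` flag, and the example scores are all hypothetical, and the only assumption is the standard idea that marginalization weights every candidate pose by its posterior probability instead of keeping just the best one.

```python
import math

def pose_weights(log_likelihoods, force_max):
    """Weight given to each candidate pose of one particle (hypothetical helper)."""
    n = len(log_likelihoods)
    if force_max:
        # Hard assignment: all weight goes to the single best-scoring pose.
        best = max(range(n), key=lambda i: log_likelihoods[i])
        return [1.0 if i == best else 0.0 for i in range(n)]
    # Marginalization: numerically stable softmax over the pose scores,
    # so near-ties share the weight instead of one pose winning outright.
    m = max(log_likelihoods)
    exps = [math.exp(v - m) for v in log_likelihoods]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate poses of one particle.
scores = [-10.0, -10.5, -14.0]
hard = pose_weights(scores, force_max=True)
soft = pose_weights(scores, force_max=False)
print(hard)
print([round(w, 3) for w in soft])
```

With the hard assignment the second pose is ignored entirely even though it scores almost as well as the first; with marginalization it keeps a substantial share of the weight. That difference in how ambiguous particles contribute is one plausible reason two jobs on either side of the 20-class threshold can look so different.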

For your second question, I suspect the reported particle counts disagree because, as of CryoSPARC v4.1, 2D Classification rejects “duplicate” particles by default. Two particles are considered “duplicates” if their refined shifts place them closer together than the minimum separation distance, as described on the 2D Classification job page. Since duplicate detection depends on the refined alignments, each re-run of 2D Classification may find a slightly different set of duplicates and therefore reject a different number of particles. Note that both duplicate removal and the minimum separation distance are parameters that can be changed.
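As a rough illustration of why the count can change between runs, here is a minimal sketch of distance-based duplicate rejection. It is not CryoSPARC's implementation: the function `count_duplicates`, the greedy keep-the-first rule, and the coordinates are assumptions made up for the example, and all particles are assumed to come from the same micrograph.

```python
import math

def count_duplicates(picks, shifts, min_sep):
    """Count particles rejected as duplicates (hypothetical greedy rule).

    picks and shifts are lists of (x, y) in the same units as min_sep.
    """
    # Refined position = original pick location plus the refined 2D shift.
    refined = [(px + sx, py + sy) for (px, py), (sx, sy) in zip(picks, shifts)]
    kept = []
    rejected = 0
    for x, y in refined:
        # Reject a particle that lands closer than min_sep to one already kept.
        if any(math.hypot(x - kx, y - ky) < min_sep for kx, ky in kept):
            rejected += 1
        else:
            kept.append((x, y))
    return rejected

# Two picks 30 units apart, with slightly different refined shifts in two runs.
picks = [(100.0, 100.0), (130.0, 100.0)]
run_a = count_duplicates(picks, [(0.0, 0.0), (-5.0, 0.0)], min_sep=20.0)
run_b = count_duplicates(picks, [(4.0, 0.0), (-8.0, 0.0)], min_sep=20.0)
print(run_a, run_b)
```

In the first run the refined positions end up 25 units apart and both particles are kept; in the second they end up 18 units apart and one is rejected. The input picks are identical in both cases, so only the refined shifts decide whether the pair falls under the threshold.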

Best,
Michael
