Thinking about the new 3D classification (beta) job. Now that we can use 100 classes in a single classification, might it be possible to add a parameter to intelligently combine similar classes?
I am thinking either (A) based on particle flow: if, at convergence, two classes are exchanging particles with each other but not with other classes, they may effectively be one class; and/or (B) based on real-space correlation: allowing the user to automatically combine classes whose pairwise correlation values are above a certain threshold.
This would be useful because it would let users start with a large number of classes, so as not to miss small classes with significant changes, without having to wade through 100 classes individually.
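To make idea (A) concrete, here is a rough sketch of what "exchanging particles with each other, but not with other classes" could mean in practice. It is only an illustration under stated assumptions, not anything the job does today: the inputs are hypothetical per-particle class assignments from two consecutive iterations, and the function names and the `min_fraction` cutoff are made up for this example.

```python
from collections import Counter

def flow_counts(prev, curr):
    """Count particles that moved from class a to class b between two iterations."""
    return Counter((a, b) for a, b in zip(prev, curr) if a != b)

def exclusive_exchange_pairs(prev, curr, min_fraction=0.8):
    """Return class pairs whose particle traffic is mostly with each other.

    prev, curr: hypothetical lists of class indices, one entry per particle.
    min_fraction: assumed cutoff; the share of each class's total traffic
    (particles in plus particles out) that must be with the partner class.
    """
    flows = flow_counts(prev, curr)

    # Total traffic per class, counting both incoming and outgoing particles.
    traffic = Counter()
    for (a, b), n in flows.items():
        traffic[a] += n
        traffic[b] += n

    pairs = set()
    for (a, b) in flows:
        # Require particles moving in both directions (a true exchange).
        if flows[(b, a)] == 0:
            continue
        mutual = flows[(a, b)] + flows[(b, a)]
        if (mutual / traffic[a] >= min_fraction
                and mutual / traffic[b] >= min_fraction):
            pairs.add((min(a, b), max(a, b)))
    return sorted(pairs)
```

Classes 0 and 1 below swap one particle each and exchange with nobody else, so they would be flagged as merge candidates; the one-way move from class 2 to class 3 would not:

```python
prev = [0, 0, 1, 1, 2, 2, 3, 3]
curr = [1, 0, 0, 1, 3, 2, 3, 3]
print(exclusive_exchange_pairs(prev, curr))
```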
Hey @olibclarke – apologies for the delayed response on this! Both ideas noted. With respect to B), have you used the Rebalance 2D Classes (BETA) job? It does something very similar but in 2D.
Just an update to this, as I’m not sure I explained it well: what I would really like to see is a clustering approach, akin to the hierarchical clustering used for combining multiple crystal datasets in X-ray crystallography.
Perhaps some kind of alignment and pairwise real-space correlation approach could work? Or maybe a PCA-based approach similar to what is used to generate the initial models? Something to automatically generate “clusters” of similar classes in the output, which can be downloaded and inspected separately.
This would greatly facilitate analysis for classification jobs with 50+ classes. Even better would be if it were possible to automatically annotate the sites of difference between different clusters!
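As a rough illustration of the clustering idea, the sketch below groups classes by average-linkage hierarchical clustering on pairwise real-space correlation. It makes several assumptions: the class volumes are already aligned and are given as flat lists of voxel values, the similarity measure is a plain Pearson correlation, and `cluster_classes` and its `threshold` parameter are hypothetical names chosen for this example. With real maps one would more likely use NumPy and SciPy's hierarchical clustering routines; pure Python is used here only to keep the example self-contained.

```python
from itertools import combinations
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length voxel lists (non-constant volumes assumed)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def cluster_classes(volumes, threshold=0.9):
    """Greedily merge the two most similar clusters until none reach the threshold.

    volumes: list of aligned class volumes, each a flat list of voxel values.
    threshold: assumed minimum average correlation required to merge two clusters.
    Returns a sorted list of clusters, each a sorted list of class indices.
    """
    corr = {
        (i, j): pearson(volumes[i], volumes[j])
        for i, j in combinations(range(len(volumes)), 2)
    }
    clusters = [[i] for i in range(len(volumes))]

    def linkage(c1, c2):
        # Average linkage: mean correlation over all cross-cluster pairs.
        vals = [corr[(min(i, j), max(i, j))] for i in c1 for j in c2]
        return sum(vals) / len(vals)

    while len(clusters) > 1:
        best, a, b = max(
            (linkage(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if best < threshold:
            break
        # a < b always holds, so deleting index b leaves index a valid.
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)
```

With four toy "volumes", two near-identical pairs that are anti-correlated with each other end up as two clusters:

```python
volumes = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.0, 3.1, 4.0],
    [4.0, 3.0, 2.0, 1.0],
    [4.0, 3.1, 2.0, 0.9],
]
print(cluster_classes(volumes, threshold=0.9))
```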