When I try to initialize 3D classification with a mixture of unique and non-unique volumes - e.g. 2 copies of each of 7 classes from heterogeneous refinement - I get the attached error.
Is there any reason for this restriction? When using RELION we routinely seed classification without alignments using identical input classes.
Have you tried a multi-reference input STAR file of the same volume in RELION? I find it populates one of them completely and leaves the other(s) empty for each reference identity. I think K classes is different: it slightly perturbs the input models so they don't have identical probabilities.
Thanks for flagging this! You're right that this probably shouldn't be a restriction; we are looking into it and will adjust in a future release.
Note that the current way we check for "uniqueness" is via the path to the volume's .mrc file. So one potential workaround is to first backproject a number of (identical) volumes by cloning a bunch of Homogeneous Reconstruction Only jobs with the same particle sets, and then to pipe those volumes into 3D class.
It seems this restriction is gone in recent versions, but I think there is a bug, at least in v4.4.
When I use a mixture of unique and non-unique volumes, the duplicate ones misbehave.
For example, let's say I have 10 different classes, and then include 10 copies of the consensus volume to make 20 classes.
The initial volumes at the top of the log look normal, but in the first iteration the plots for all but one of the duplicate classes are blank, and those classes are empty (whereas the non-duplicate classes behave as expected). If I use slightly different volumes (e.g. perturbed using Volume Tools), this does not happen and they behave normally.
Thanks for reporting. Unfortunately, I wasn't able to reproduce this behaviour with a couple of different dataset / class combinations. Any custom parameters in this job?
Ok thanks, I was able to reproduce this! This behaviour happens with identical classes AND hard classification on. In some ways this is "expected" behaviour as currently implemented, because we use an argmax() call on the class probabilities during the E-step: if the probabilities are numerically identical, argmax() always returns the "first" of the tied volumes (in input order). Thus, in the first iteration, all the other identical classes are emptied out and stay empty.
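The tie-breaking effect can be seen in a toy NumPy sketch (illustrative only, not cryoSPARC's actual implementation):

```python
import numpy as np

# Toy hard-classification scenario: 1000 particles, 4 classes, where the
# 4 reference volumes are byte-identical copies, so every particle's
# posterior is exactly uniform across classes.
posteriors = np.full((1000, 4), 0.25)

# Hard assignment takes the argmax over classes. With exact ties,
# np.argmax returns the FIRST maximal index, so every particle lands
# in class 0 and the duplicate classes start (and stay) empty.
assignments = np.argmax(posteriors, axis=1)
counts = np.bincount(assignments, minlength=4)
print(counts)  # [1000    0    0    0]
```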
We have an idea for how to improve this in the future (sample a volume according to the posterior, rather than simply taking the argmax, during the first iteration of hard classification). For now, you'll have to add some noise to the duplicate volumes to avoid this behaviour.