Forced symmetry in 2D classification

Thanks @DanielAsarnow for the tip!

@rj.edwards glad we can help :slight_smile:
Marginalization is a generic concept: when we perform inference of an unknown target variable (the 2D class density images in this case) while there is also another unknown latent variable (the 2D pose and shift of each particle), marginalization means that instead of trying to estimate just a single value of the unknown latent variable (pose), we keep track of every possible value and how likely it seems given the data. In 2D classification this corresponds to keeping a probability distribution over possible 2D angles for each image. Every angle gets a probability value (e.g. 0.1, 0.2, etc.) and then when we are reconstructing the target variable (2D density image for each class) we combine the experimental images by averaging them over poses, weighted by the probability of each pose.
So without marginalization (i.e. with “force max over pose/shift” on), we only keep track of the single maximum probability pose for each image, and the 2D class density image is just the average of all particles in the class, each from a single pose.
With marginalization, we “blur” every image by weighted averaging it over several poses, and then add all those averaged images together to get the reconstructed 2D class image.
So the “max” is really an approximation to marginalization (which is the more theoretically correct operation to perform) - we are replacing the full probability distribution with a point estimate. But in practice, “max” saves a lot of time and can actually be beneficial in many cases where the “width” of the probability distribution in marginalization is mis-estimated and would cause too much blurring.
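
If it helps to see the difference concretely, here is a small toy sketch in NumPy. It is only an illustration, not how the actual 2D classification job is implemented: the "images" are 1D arrays, cyclic shifts stand in for the 2D pose, the noise is assumed Gaussian with a made-up width `sigma`, and the names `pose_posterior` and `update_class` are hypothetical.

```python
import numpy as np

def pose_posterior(particle, template, sigma):
    """Return all un-shifted copies of a particle and the probability of each shift.

    Toy assumption: the "pose" is a cyclic shift of a 1D image and the
    noise is Gaussian with standard deviation sigma.
    """
    n = particle.size
    # Undo every candidate shift so each row can be compared to the class image.
    aligned = np.stack([np.roll(particle, -s) for s in range(n)])
    log_like = -np.sum((aligned - template) ** 2, axis=1) / (2.0 * sigma ** 2)
    log_like -= log_like.max()  # subtract the max for numerical stability
    weights = np.exp(log_like)
    return aligned, weights / weights.sum()

def update_class(particles, template, sigma, force_max):
    """One class-average update, with or without marginalization over pose."""
    total = np.zeros_like(template)
    for particle in particles:
        aligned, prob = pose_posterior(particle, template, sigma)
        if force_max:
            # Point estimate: keep only the single most probable pose.
            total += aligned[np.argmax(prob)]
        else:
            # Marginalization: average over all poses, weighted by probability.
            total += prob @ aligned
    return total / len(particles)

# Toy usage: three shifted copies of the same class image.
template = np.array([0, 0, 1, 3, 1, 0, 0, 0], dtype=float)
particles = [np.roll(template, s) for s in (0, 2, 5)]

hard = update_class(particles, template, sigma=1.0, force_max=True)
soft_narrow = update_class(particles, template, sigma=0.1, force_max=False)
soft_wide = update_class(particles, template, sigma=100.0, force_max=False)
```

In this toy setup, a narrow pose distribution (small `sigma`) makes the marginalized average essentially the same as the "max" result, while an overly wide one (large `sigma`) smears the peak of the class image out, which is the over-blurring case described above.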

Hope that helps :slight_smile: