Hi,
At the end of O-EM, the batch avg class ESS is 9.635, with 3 main classes. After 5 full iterations, the batch avg class ESS is 3.982, with particles dispersing into more classes. Is this the expected behavior as classification nears convergence? If not, could the presence of continuous heterogeneity contribute to this behavior?
Thank you,
Joon
At the end of O-EM:
At the end of 5 full iterations:
Hi @joonpark,
The final full-batch EM iterations do tend to reduce the avg class ESS significantly. However, the fact that the batch avg class ESS is ~9.6 with 10 classes might be a sign that you should increase the number of O-EM epochs through the data. For reference, in the 100-class example in the 3D Classification tutorial, I had the following:
After O-EM:
Batch avg class ESS: 11.759
After 2 full iterations:
Batch avg class ESS: 2.562
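For readers unfamiliar with the metric, here is a minimal sketch of how a batch-averaged class ESS could be computed. It assumes the standard effective-sample-size definition, 1 / sum_k(p_k^2), applied to each particle's class posterior and then averaged over the batch. That definition and the function names `class_ess` and `batch_avg_class_ess` are illustrative assumptions, not taken from this thread or from the CryoSPARC code.

```python
# Sketch only: assumes class ESS = 1 / sum_k(p_k^2) per particle,
# averaged over the batch. Function names are hypothetical.

def class_ess(posterior):
    """Effective number of classes a single particle is spread over."""
    total = sum(posterior)
    probs = [p / total for p in posterior]  # normalise defensively
    return 1.0 / sum(p * p for p in probs)

def batch_avg_class_ess(batch):
    """Mean per-particle class ESS over a batch of posteriors."""
    return sum(class_ess(row) for row in batch) / len(batch)

# A particle spread evenly over 10 classes has an ESS close to 10;
# a particle assigned entirely to one class has an ESS of 1.
uniform = [0.1] * 10
one_hot = [1.0] + [0.0] * 9
print(batch_avg_class_ess([uniform, uniform]))
print(batch_avg_class_ess([one_hot, one_hot]))
print(batch_avg_class_ess([uniform, one_hot]))
```

Under this assumed definition, an average near the number of classes means most particles are still spread almost evenly across all classes, while a value approaching 1 means assignments have become nearly hard.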
Thank you so much, @vperetroukhin!
I'm glad to know that what I observed was not weird. For this job, I actually kept "Number of O-EM Epochs" at 3 and increased "Number of Full Iterations" to 5 because of the significant batch class ESS drop during full iterations. I will follow your advice and keep "Number of O-EM Epochs" at 5 or more.
Have a great day,
Joon
So in cases where ESS is still at ~9 after 2 full iterations, would you recommend increasing the number of full iterations, or the number of O-EM epochs? From looking at the log it seems like the ESS plateaus at ~13 during O-EM, then decreases to 12, then to 9 in the last full iterations. So I guess more full iterations through the data might be the way to go?
EDIT:
It would also be very helpful to be able to continue from a previous classification run. Having to restart the entire thing from scratch to test a different number of final iterations is not ideal.
UPDATE: Changing to 10 full iterations improved matters a lot - both decreasing ESS, and dramatically improving the appearance/diversity of classes. I suspect these defaults could do with some tweaking based on experience so far.
@vperetroukhin It would also be very helpful to have an option to output volume series for every "full" iteration. It seems like more full iterations is better, but only up to a point - too many and classes start to become noisy, presumably from over-refinement. But having to run 5 different 18hr jobs with different numbers of full iterations is a waste - would be good to just run one, and then I can compare the classes over the full iterations and decide where the best point is to stop.
FYI this is now implemented in the latest patch.