Bug: final particle numbers displayed do not match result when hard classification is enabled in 2D Classification

Hello,

In v4.4.0 and v4.4.1, I ran 2D Classification with ‘Hard classify for last iteration’ enabled. I noticed that the particle numbers displayed in both 2D Classification and Select 2D don’t match the number of particles actually output. For example, when I select Class 9 (23660 particles) in the screenshot below, I get 23479 particles selected.

Select 2D outputs 23479 particles with the selection above.

When I select all classes, I get 7000 fewer particles than the total.

Below is a Select 2D job displaying the results of a different 2D Classification job run with the default parameters. Here the particle numbers match the result.

Best,
Kookjoo

We see this all the time now, even with standard settings (not hard classification). Particle counts in 2D are not correct, likely displaying numbers from an earlier iteration.

Thank you for reporting @kookjookeem @CryoEM2 – we’re investigating this.

@CryoEM2 can you please confirm that even with Hard classify for last iteration OFF, you see the particle counts discrepancy in Select 2D (as @kookjookeem’s screenshots depict)?

I think what’s going on is that 2D Classification shows all particles in the classes, while Select 2D shows only those that are not discarded at the end by remove duplicates. This also happens with legacy jobs; I have not tried running a job with Hard Classification turned ON in the new 2D. This is the intended behavior, and always has been, but it is somehow being displayed in a way that has only recently brought it to my (our) attention and makes it seem like a bug.

Ok thanks!

After a deep dive into this, we’ve isolated the original issue (i.e., the discrepancy in particle counts in Select 2D) and it is limited to the case when the new ‘Hard classify for last iteration’ parameter is ON. We’ll be working on a fix for this. Thank you @kookjookeem for reporting!

@CryoEM2, if I understand correctly, the discrepancy you’re describing is a separate, unrelated phenomenon. 2D classification removes duplicates after producing the final class average figure (the one with the green particle counts overlaid). Thus, these green numbers will not match the class counts in Select 2D if there are duplicates. This is indeed expected behaviour.
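To make the ordering concrete, here is a minimal sketch with made-up numbers. The arrays `class_ids` and `is_unique` are hypothetical stand-ins, not CryoSPARC fields; the point is only that counts taken before duplicate removal can exceed counts taken afterwards.

```python
import numpy as np

# Hypothetical class assignment for 10 particles across 3 classes.
class_ids = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 0])

# Hypothetical mask: False marks particles later dropped as duplicates.
is_unique = np.array(
    [True, True, True, False, True, True, True, False, True, True]
)

# Counts drawn on the class average figure (computed before duplicate removal).
counts_on_figure = np.bincount(class_ids, minlength=3)

# Counts seen downstream (after duplicates have been removed).
counts_after_removal = np.bincount(class_ids[is_unique], minlength=3)

# Per-class difference equals the number of duplicates dropped from each class.
difference = counts_on_figure - counts_after_removal
```

With these toy inputs, the classes that lost a duplicate show a smaller count after removal, while the class with no duplicates is unchanged.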


Hi all,

This bug has been fixed as of CryoSPARC v4.5. It was an issue with 2D classification – some particles would be erroneously output with a posterior score of 0 when hard classification was activated. The erroneous posterior scores will still be present in the affected particle datasets in 4.5, but the 2D classification jobs that produced them can be re-run to create particles with fixed metadata.
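For anyone wanting to check whether an existing particle set is affected, a rough sketch follows. It assumes the particles are available as a NumPy structured array and that the class index and posterior are stored under the field names `alignments2D/class` and `alignments2D/class_posterior`; these names and the helper `zero_posterior_per_class` are assumptions for illustration, so verify them against your own dataset. The demo array is synthetic.

```python
import numpy as np

def zero_posterior_per_class(
    particles,
    class_field="alignments2D/class",            # assumed field name
    post_field="alignments2D/class_posterior",   # assumed field name
):
    """Count, per class, the particles whose posterior score is exactly 0."""
    classes = particles[class_field].astype(np.int64)
    zero = particles[post_field] == 0
    n_classes = int(classes.max()) + 1
    return np.bincount(classes[zero], minlength=n_classes)

# Synthetic stand-in for a particle dataset (not real CryoSPARC output).
dtype = [("alignments2D/class", "<u4"), ("alignments2D/class_posterior", "<f4")]
demo = np.array(
    [(0, 1.0), (0, 0.0), (1, 1.0), (1, 1.0), (2, 0.0), (2, 0.0)],
    dtype=dtype,
)

zero_counts = zero_posterior_per_class(demo)
```

A non-zero entry for a class would suggest that some of its particles carry the erroneous score described above, and that the originating 2D classification job is a candidate for re-running.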

Best,
Michael
