Hi Valentin,
The PCA-based initialization never proved to be a critical component in getting good classification results. Have you seen improvements with it?
No, we haven’t. When I have used it, though, all the initial volumes look more or less the same, even when I know from 3D-VA etc. that there is a lot of heterogeneity in the dataset. Maybe this is because the clusters are too large, so there is too little variability between the different reconstructions?
Can you elaborate on this further? What do you mean by PCA on clustered volumes exactly?
I was thinking along the lines of landscape analysis in cryoDRGN (https://zhonge.github.io/cryodrgn/pages/landscape_analysis.html and chapter 6 of Ellen’s thesis), where the latent space is sampled at random points to generate ~500-1000 reconstructions, which are then analyzed by PCA. This reminded me of the approach you were using to initialize 3D-classification, but it seemed like sampling the space explored by 3D-VA might generate starting volumes with more diversity. We have already found it useful to seed 3D-classification (and heterogeneous refinement) from states identified using cryoDRGN; I haven’t yet tried using random clusters from 3D-VA for initialization.
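
The sample-then-PCA idea can be sketched roughly as below. This is a toy illustration under stated assumptions, not cryoDRGN's or cryoSPARC's actual implementation: `toy_decoder` is a hypothetical stand-in for whatever maps a latent coordinate to a volume (a trained decoder, or a reconstruction from the particles near that point), and the latent coordinates are synthetic rather than taken from a real 3D-VA or cryoDRGN job.

```python
import numpy as np


def toy_decoder(z, grid=8):
    """Hypothetical stand-in: map a 2D latent point to a small 3D volume.

    Two Gaussian blobs whose positions shift with the latent coordinates.
    """
    ax = np.linspace(-1.0, 1.0, grid)
    x, y, w = np.meshgrid(ax, ax, ax, indexing="ij")
    blob_a = np.exp(-((x - 0.5 * z[0]) ** 2 + y ** 2 + w ** 2) / 0.1)
    blob_b = np.exp(-(x ** 2 + (y - 0.5 * z[1]) ** 2 + w ** 2) / 0.1)
    return blob_a + blob_b


def sample_volumes(latents, n_samples, rng):
    """Pick random latent points and generate one volume per point."""
    idx = rng.choice(len(latents), size=n_samples, replace=False)
    vols = np.stack([toy_decoder(latents[i]) for i in idx])
    return idx, vols


def pca_volumes(vols, n_components):
    """PCA over flattened volumes via SVD of the mean-centered matrix."""
    flat = vols.reshape(len(vols), -1)
    centered = flat - flat.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    coords = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return coords, explained[:n_components]


rng = np.random.default_rng(0)
latents = rng.normal(size=(5000, 2))         # synthetic per-particle latents
idx, vols = sample_volumes(latents, 500, rng)  # ~500 sampled volumes
coords, explained = pca_volumes(vols, n_components=2)
print(coords.shape, explained)
```

Volumes that sit far apart in `coords` would be the candidates for diverse starting references; how best to pick them (clustering, farthest-point selection, manual inspection) is left open here.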