A newbie here trying to learn some in-depth CS techniques.
I have a dataset with a pretty severe preferred orientation issue. Orientation rebalancing was tested but didn’t yield good results, so I’m trying to solve it through particle cleaning if possible. I learned from this thread (Possible preferred orientation- how to improve data?) that cleaning by 2D classification may occasionally throw away rare views, so I tried several rounds of heterogeneous refinement to clean instead. The cFAR value seems to improve, but the overall resolution is much lower than before cleaning and the stretchy features are still present in the density. I also learned from a talk that people run several rounds of ab-initio to clean a dataset, so I have a few questions:
Which of these two methods (heterogeneous refinement vs. ab initio) would be better for cleaning up a preferred orientation issue? And how many rounds would be desirable if there are 1M starting particles?
If heterogeneous refinement is better, should I use the same set of junk volumes in every round, or switch to different junk volumes each time? Or should I just inherit the volumes from the previous run?
I would guess that you would have more success with het refinement, though it can be unpredictable. On the same 1M particles, I would try 1 junk plus 4 copies of the same good volume in one job, 1 good plus 4 junk in another, and some mixes in separate jobs. Having several different initial references at different resolutions and qualities and trying them in different scenarios should work. Once het refine does what you want (peels some preferred-orientation particles into an anisotropic map and also gives a nice class), you can iterate by taking the nice-class particles and going again. If you use the same reference volumes it will behave the same way, or if the nice-class volume is improving you can swap it in too.
The identity of the junk is not important, so long as the volumes are very junk. Both of these job types also tend to sort particles based on their viewing distribution, so be careful that a junk class is not just a different view (run 2D on the junk class and make sure it’s junk). Oli suggests the junk can be an ab initio of 100 particles, or one terminated after 5 minutes or something.
1M is not a lot to sort this way. If you use 5 references, you will likely only have 200-300k left after 1 round, and 40-50k after the next round…
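To make that arithmetic concrete, here is a rough sketch, not anything cryoSPARC computes for you. It assumes the particles split evenly across the references and that only one class is kept each round, which gives the low end of the numbers above. The function name `particles_remaining` and its parameters are made up for illustration; in practice the good class usually takes more than an even share.

```python
def particles_remaining(start, n_references, rounds):
    """Estimate particle counts per round, ASSUMING an even split across
    references and that only one class is carried forward each round."""
    counts = [start]
    for _ in range(rounds):
        # Integer division: one class out of n_references survives.
        counts.append(counts[-1] // n_references)
    return counts


if __name__ == "__main__":
    # 1M particles, 5 references, 2 rounds of het refinement
    for round_index, count in enumerate(particles_remaining(1_000_000, 5, 2)):
        print(f"after round {round_index}: {count} particles")
```

Under the even-split assumption this goes 1,000,000 to 200,000 to 40,000, so two rounds already cut the stack by a factor of 25.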
2D cleaning gets rid of rare views only if you select for the good classes. If instead you select for all classes EXCEPT the extreme junk and EXCEPT some of the preferred views, it should help.
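The selection logic can be sketched like this. It is only an illustration of the idea, not a cryoSPARC API: the class labels are something you would assign by eye after looking at the 2D averages, and `select_classes`, `class_labels` and `drop_preferred` are hypothetical names.

```python
def select_classes(class_labels, drop_preferred):
    """Return the class ids to keep: every class EXCEPT obvious junk and
    EXCEPT the preferred-view classes you chose to drop.

    class_labels: dict of class id -> label assigned by eye (hypothetical
                  labels such as "junk", "preferred", "rare", "good").
    drop_preferred: set of preferred-view class ids to exclude.
    """
    return sorted(
        class_id
        for class_id, label in class_labels.items()
        if label != "junk" and class_id not in drop_preferred
    )


if __name__ == "__main__":
    labels = {0: "preferred", 1: "preferred", 2: "rare",
              3: "junk", 4: "good", 5: "preferred"}
    # Drop two of the three preferred-view classes, keep everything else non-junk.
    print(select_classes(labels, drop_preferred={0, 1}))
```

The point is that the rare and mediocre-looking classes stay in by default. You only remove what you are sure is junk, plus a chosen subset of the over-represented views.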