Junk (non-particle, non-protein) removal works quite well in Cryosparc using either ab initio heterogeneous reconstruction with two classes and an expected class similarity of 0, or multireference refinement starting from a single good class and a bunch of identical random junk classes (see attached).
This doesn’t work very well, though, for getting rid of particles that are the protein of interest but are “less shiny” (high defocus, thick ice, etc.). These just end up classifying with the good particles in the main class. One approach that has been used to get rid of these (which can be particularly annoying for small particles) is “random-phase” 3D classification (http://www.sciencedirect.com/science/article/pii/S0092867416305700): basically, classifying with one good volume as normal and another volume that is low-pass filtered progressively less aggressively as the classification proceeds. The idea is that the good particles will sort into the class with the higher-resolution initial volume, while the bad particles will sort equally well into the class that has been kept artificially low resolution (but has the same basic shape).
Would it be possible to implement a similar procedure in cryosparc, by allowing users to specify fixed low pass filters for specific classes in multireference refinement?
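The per-class filtering described above can be sketched roughly as follows. This is a minimal NumPy sketch, not cryosparc or RELION code: the function names `lowpass_filter` and `resolution_schedule`, the hard (unsoftened) Fourier cutoff, and the linear schedule are all assumptions made for illustration.

```python
import numpy as np

def lowpass_filter(volume, pixel_size, cutoff_res):
    """Zero all Fourier components finer than cutoff_res (in Å).

    Hypothetical helper: a hard spherical cutoff in Fourier space.
    Real packages typically use a soft-edged filter instead.
    """
    ft = np.fft.fftn(volume)
    # Spatial frequency (1/Å) along each axis, combined into a radius.
    axes = [np.fft.fftfreq(n, d=pixel_size) for n in volume.shape]
    grids = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grids))
    ft[radius > 1.0 / cutoff_res] = 0.0
    return np.fft.ifftn(ft).real

def resolution_schedule(start_res, end_res, n_iter):
    """Hypothetical linear schedule of filter resolutions (Å) per iteration."""
    return np.linspace(start_res, end_res, n_iter)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    trap = rng.standard_normal((32, 32, 32))  # stand-in for a real map
    # Illustrative values only: relax the filter from 40 Å to 10 Å.
    for res in resolution_schedule(40.0, 10.0, 4):
        filtered = lowpass_filter(trap, pixel_size=1.0, cutoff_res=res)
        print(f"filtered trap class to {res:.0f} Å, std = {filtered.std():.3f}")
```

In an actual classification the filtered volume would replace the trap class reference at each iteration, while the good class is left unfiltered.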
I should note that taking the particles from the good class here and repeating the same procedure, but with different initial junk classes, gives 99.9% in the good class at the end of multireference refinement - so the cleanup is certainly pretty self-consistent if nothing else.
By the way, another anecdote - I just tried this strategy (initializing multirefine with one good class and a bunch of junk classes) on another dataset, a small membrane protein which is quite heterogeneous even though it had been cleaned in 2D, and it worked really well - repeating the cleaning procedure 4X I went from having something that was very anisotropic, poor density, sausages for helices, to a nice clean 3.4Å map.
You can use the randomize.exe tool (http://grigoriefflab.janelia.org/randomize) to prepare the garbage trap.
After sequential cryosparc heterogeneous refinement runs against 40 Å and then 25 Å phase-randomised references, my dataset was reduced by 47%, with a substantial improvement in FSC; overall resolution improved from 6.1 to 5 Å.
I will now try Qiang Zhou’s original relion-1.4 mod to compare.
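For readers without access to randomize.exe, the idea of a phase-randomised reference can be illustrated with a short NumPy sketch. This is not the randomize.exe implementation and not the relion-1.4 mod: the function name `randomize_phases`, its parameters, and the choice to borrow phases from a white-noise volume (so that the result stays real-valued) are assumptions for illustration only.

```python
import numpy as np

def randomize_phases(volume, pixel_size, cutoff_res, seed=0):
    """Randomise Fourier phases beyond cutoff_res (Å), keeping amplitudes.

    Hypothetical helper. Components coarser than cutoff_res are left
    untouched, so the overall shape of the map is preserved.
    """
    rng = np.random.default_rng(seed)
    ft = np.fft.fftn(volume)
    # Spatial frequency (1/Å) along each axis, combined into a radius.
    axes = [np.fft.fftfreq(n, d=pixel_size) for n in volume.shape]
    grids = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grids))
    beyond_cutoff = radius > 1.0 / cutoff_res
    # Phases of the transform of a real noise volume are random but
    # Hermitian-symmetric, so the inverse transform remains real.
    noise_ft = np.fft.fftn(rng.standard_normal(volume.shape))
    random_phase = np.exp(1j * np.angle(noise_ft))
    ft_out = np.where(beyond_cutoff, np.abs(ft) * random_phase, ft)
    return np.fft.ifftn(ft_out).real

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    good_map = rng.standard_normal((32, 32, 32))  # stand-in for a real map
    # Cutoffs mirror the 40 Å and 25 Å runs mentioned above; the
    # pixel size of 1.0 Å is an assumed value.
    for res in (40.0, 25.0):
        trap = randomize_phases(good_map, pixel_size=1.0, cutoff_res=res)
        print(f"cutoff {res:.0f} Å: output shape {trap.shape}")
```

The output would then be written back to a map file and supplied as the second reference alongside the unmodified map.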
Hi @peter.cherepanov, this is an interesting data point, thanks. We will work to include something similar to random-phase classification.
That would be most helpful, although the “manual” procedure seems to work surprisingly well, at least for this dataset. The difference is that the random phase is introduced only once, at the beginning of heterogeneous classification (doing it at each cycle would require modifying the code, of course), but it still works. I can share the dataset, if you like. Again, this is anecdotal, but in our case Relion seems to do less well than cryosparc at removing junk particles. (BTW, as in the original post, heterogeneous classification against a set of identical random references, as suggested above by olibclarke, also helped, but I did not compare results extensively.)
I’ve done this at higher resolutions without success - but the phase-random reference quickly became another regular structure.
I wonder if the single/first iterations are doing most of the work in cleaning at the beginning?
It could be case-dependent, and perhaps your particle dataset is relatively clean already. When it does work, you get 80-90% in one class and obvious high-noise junk class(es). But I guess it is important to check the refinement box size (i.e. try without binning) and the initial resolution settings (i.e. there is no point in introducing high-frequency noise in the first place if we mask it out with a low-pass filter). I tried a range of initial low-pass filters in the refinement and multireference refinement settings, which have to be well below the final resolution of course; 8 Å works fine (final resolution ~4.5 Å).
Maybe relion-rp will work better in your case (there, the random-phase filter is introduced before every cycle).