EM newbie here. I am working with a “slab”-like protein with dimensions of ~100 x 200 x 60 Å. After 2D classification, I see plenty of views corresponding to the 100 x 200 Å face. The other two views are difficult to extract, presumably because one dimension is very small.
Here’s my advice, based on experience with an elongated protein of even more severe dimensions (~20 x 200 x 30 Å). After any of the picking approaches below, extract with 4x binning, run 2D classification, and re-pick with templates. You probably don’t need more than 4 templates, because they will be heavily low-pass filtered and you will adjust the cross-correlation threshold after picking. In my experience crYOLO also worked well once I had enough good picks for training, and I imagine Topaz will too; you can train either on particles from a 2D class selection after one of these initial strategies.
Overpick with blobs (cryoSPARC uses all the blobs - a series of circles and ellipses from the minimum to the maximum size)
Manually pick, with intentional bias toward your difficult views, up to ~5k particles
Generate templates from a negative stain reconstruction or other initial model
You can try a more aggressive low-pass than normal, like 30-50 Å. Also remember you can combine picks optimized for different views, then exclude duplicate particles using RELION or star.py --min-separation (from pyem).
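For the duplicate removal, here is a minimal sketch of the min-separation idea in Python (assuming picks are plain (x, y) pixel coordinates for one micrograph; star.py works on STAR files, but the logic is the same):

```python
import numpy as np

def merge_picks(picks_a, picks_b, min_separation):
    """Merge two pick sets for one micrograph, dropping any pick in
    picks_b that lands within min_separation (pixels) of a kept pick.
    picks_a, picks_b: iterables of (x, y) coordinates."""
    kept = [np.asarray(p, dtype=float) for p in picks_a]
    for xy in np.asarray(list(picks_b), dtype=float).reshape(-1, 2):
        if not kept or np.min(np.linalg.norm(np.vstack(kept) - xy, axis=1)) >= min_separation:
            kept.append(xy)
    return np.vstack(kept) if kept else np.empty((0, 2))

# Hypothetical usage: blob_picks and template_picks are (N, 2) arrays;
# a min_separation of roughly the particle diameter in pixels works well.
# merged = merge_picks(blob_picks, template_picks, min_separation=100)
```

A greedy pass like this keeps whatever it sees first, so put your more trusted pick set first.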
I still haven’t found an approach I’m entirely satisfied with - particle picking is not a solved problem! Good luck, and let us know what works in your case.
Regarding your third suggestion - using templates from an initial model:
Should I worry about model bias if the structure of the same protein bound to another substrate is used for template generation / 3D classification with my current data? My current 6 Å anisotropic map looks a lot like the previously determined structure.
I posted this query on CCPEM too, hoping to get a consensus from the experts.
No, you don’t need to worry about model bias. You are going to apply a 20+ Å low-pass filter to your templates, and then during refinement use a 20-40 Å low-pass filter on your initial models and control overfitting with the independent half-set technique.
Model bias would be problematic if you kept high-resolution information during template picking, or if you used the same model, unfiltered, to initiate refinement. In modern refinement packages, you would pretty much have to overfit on purpose.
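For concreteness, this is all a low-pass filter amounts to; a bare-bones numpy sketch with a hard spherical cutoff (real packages use a soft edge to avoid ringing - this is an illustration, not any package's implementation):

```python
import numpy as np

def lowpass(data, pixel_size, cutoff):
    """Low-pass filter a 2D image or 3D volume to `cutoff` (in Å) by
    zeroing Fourier components beyond 1/cutoff. pixel_size is Å/px."""
    ft = np.fft.fftn(data)
    # Spatial frequency (1/Å) of every Fourier voxel
    axes = [np.fft.fftfreq(n, d=pixel_size) for n in data.shape]
    grids = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g**2 for g in grids))
    ft[radius > 1.0 / cutoff] = 0.0  # hard cutoff; real code uses a soft edge
    return np.real(np.fft.ifftn(ft))

# e.g. filter a hypothetical initial model at 1.1 Å/px to 30 Å:
# model_lp = lowpass(model, pixel_size=1.1, cutoff=30.0)
```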
I suggest reading through the HIV Env saga to understand the problem and the protections the field has implemented in response.
I wanted to give a quick update on what sort of “worked” or didn’t for me:
Particle picking
(i) Overpicking with DoG Picker, followed by 2D classification, then template picking in RELION.
(ii) Template picking with cryoSPARC does not seem to work for this data set…
Ab initio model generation
(i) RELION didn’t work
(ii) Imported particles from RELION 2D classification into cryoSPARC and generated ab initio models. cryoSPARC was able to give two separate conformations.
Refinement
(i) One conformation refined to ~5 Å and the other to ~7 Å.
I would like to ask the cryoSPARC community: what is the best way to use these models to extract more particles from the micrographs?
Generate templates from the 3D volumes and use those for particle picking?
or
Sort particles in these volumes by 2D classification and use the 2D classes as templates?
I like the “create templates” method. You can play with the number, say 8-16 templates, to get a distribution you like. Actually, a very small number of templates (2-4) is almost certainly sufficient, but the template picker doesn’t seem to get much slower with more; most of the time must be spent on I/O and micrograph FFTs.
I make the templates quite small, 24 - 48 px.
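If you ever need to shrink templates outside the GUI, Fourier cropping is the aliasing-free way to downsample; a minimal sketch for square, even-sized images (my own illustration, not cryoSPARC's code):

```python
import numpy as np

def fourier_crop(img, new_size):
    """Downsample a square, even-sized 2D image to new_size x new_size
    by cropping its centered Fourier transform (binning without aliasing)."""
    ft = np.fft.fftshift(np.fft.fft2(img))
    c, h = img.shape[0] // 2, new_size // 2
    cropped = ft[c - h:c + h, c - h:c + h]
    out = np.real(np.fft.ifft2(np.fft.ifftshift(cropped)))
    return out * (new_size / img.shape[0]) ** 2  # keep the mean gray value

# e.g. take a hypothetical 128 px template down to 32 px:
# small = fourier_crop(template, 32)
```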
@spunjani I noticed the “create templates” tool seems to sample evenly in all 3 DoF instead of just the 2 relevant ones (the viewing direction). Often there will be two templates with approximately the same view but different in-plane rotations.
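To make the 2-DoF point concrete, evenly covering viewing directions only would look something like this (a Fibonacci lattice is one standard choice; I don't know what sampling cryoSPARC actually uses):

```python
import numpy as np

def viewing_directions(n):
    """n roughly evenly spaced unit vectors (viewing directions) via a
    Fibonacci sphere - 2 DoF only; in-plane rotation is left out because
    a 2D template matcher can search it (or ignore it) separately."""
    i = np.arange(n)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i     # golden-angle azimuth
    z = 1.0 - 2.0 * (i + 0.5) / n              # uniform in cos(theta)
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

# 16 directions sampled this way would give 16 genuinely distinct views:
# dirs = viewing_directions(16)
```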