Strategies for high resolution refinement

tarek · December 4, 2020, 6:39am

Hi,

I would like to start a discussion about common strategies for high resolution refinements within cryosparc. If anyone has an example where it was beneficial to change software packages feel free to chip in.

If not stated otherwise default parameters were used.

My basic workflow starts with MC2-aligned movies, Patch CTF estimation, iterative 2D/3D classification and finally NU refinement (with per-particle scaling). Usually, switching from MC2-aligned micrographs to cryosparc patch motion correction improves the resolution by 0.3-0.5 A (pixel size 0.832 A/pix). That way I reach a resolution of 2.4 A (= 0.7 Nq) for a 120 kDa protein.
The data was acquired with AFIS, which seems to be calibrated quite well.
Using Global CTF correction (Tilt/trefoil) only marginally improves resolution, by 0.1 A. Refinement of Cs consistently results in ~2.8 instead of 2.7 mm, with no meaningful improvement of the refinement. Tetrafoil refinement usually worsens the results, so I refrain from using it.
Per-particle CTF estimation may improve by 0.1 A.
Local motion correction improved by additional 0.1 A.
Finally I’m stuck at 2.2 A and piqued to hit 1.X A.
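
For anyone who wants to check the Nyquist figures quoted above, here is a minimal Python sketch. It only uses the pixel size and resolutions from this post; the function names are made up for illustration.

```python
def nyquist_resolution(pixel_size_A: float) -> float:
    """Nyquist limit in Angstrom: the finest period needs two pixels."""
    return 2.0 * pixel_size_A


def fraction_of_nyquist(resolution_A: float, pixel_size_A: float) -> float:
    """Reported resolution expressed as a fraction of the Nyquist frequency."""
    return nyquist_resolution(pixel_size_A) / resolution_A


pixel_size = 0.832  # A/pix, from the post above
for resolution in (2.4, 2.2):
    frac = fraction_of_nyquist(resolution, pixel_size)
    print(f"{resolution:.2f} A at {pixel_size} A/pix = {frac:.2f} Nq")
```

At 0.832 A/pix the Nyquist limit is 1.664 A, so there is still headroom below 2.2 A without changing the sampling.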

The map indeed looks like 2.2 A, masks generated by cryosparc (refinement and FSC) are in this particular case quite generous.

I’m curious about your strategies and hope for active discussion.

Cheers,
Tarek

user123 · December 5, 2020, 10:38pm

Hey Tarek,
Nice work increasing the res here! The gap from 2.2A to 1.x can be a beast on the K3/K2. I am inclined to think that if it has gone this far it may go farther.

Have you tried bayesian polishing? At this resolution you may get some good benefit from it.
For a visually improved map, try density modification (phenix.resolve for cryoem) as an alternative to sharpening. I wouldn’t report the resolution coming from this but visually it can increase 0.2-0.3A for areas with good density. I’ve seen it’s not so good with highly variable local resolution and tends to oversharpen lower res regions.
Heterogeneous refinement can pull out “bad” particles for some small improvement. I’ll try hetero refinement into two to four classes with max refinement box size then homo refine the best one and repeat until the res doesn’t increase. At some point the outputs become identical at identical resolution, but my protein is pretty homogeneous.
Extract with a larger box size until the res doesn’t improve or gpu memory is limiting.
Did you create optics groups for per-group ctf optimization?

-Aaron

tarek · December 11, 2020, 8:42pm

Thanks for your suggestions Aaron.
Why do you assume I used a Gatan camera?
In fact, data was acquired on a Falcon3 in counting mode.

Apart from the Bayesian polishing, your scheme pretty much matches mine.
Reconsidering the box size was a good idea; I realized mine was too narrow to recover the full CTF.
After re-extraction I did another CTF refinement and local motion correction followed by NU refinement.
Now I’m stuck at 2.05 A and even more eager…

user123 · December 13, 2020, 12:57am

Hey tarek, nice improvement! I should not have assumed, and now I’m sure you’ll get that extra 0.05A.
I think the next best option is iterative bayesian polishing. Run relion_reconstruct on the half sets of particles from your NU refinement job, postprocess, then train on 5-10k particles, then polish and refine again in cryosparc. This usually gives me the best improvement.
You’re getting into the range where ewald curvature correction may help. Run through relion_reconstruct and postprocess. Run a parallel postprocess on the cryosparc half maps to detect improvement.
ALSO, you can try the new NU refinement in v3.0, which now has ctf optimization, but the new algorithm alone may give an improvement.

tarek · December 13, 2020, 8:21pm

Hi Aaron,

at the moment I try to stay inside cryosparc, simply for the reason that I would need to do movie alignment again with relion. Do I get it right, that you would always run the refinement with cryosparc and only reconstruct/post-process with relion?
What improvements would you expect from polishing?

Ewald sphere correction for a particle diameter of ~ 100 A seems to me less relevant, however I can give it a try.
Using the latest NU refinement improved resolution from 2.1 to 2.05, using on-the-fly CTF correction yielded 2.07 A.

user123 · December 17, 2020, 10:22pm

Hi tarek,
To my knowledge bayesian polishing is the best option for the largest res improvement at this stage. It may work, but may not. I think it will work for such high res on the falcon.
As an example for a 100A particle of mine, K3 data. It goes to <2.5A with patch motion + local motion. Then process fresh with MC2, autopick same settings, cleaned up particles go to about the same res, slightly worse (don’t be discouraged at this stage). Then bayesian polish train, polish with those settings, and shiny particles get ~0.25A better. Second round of polishing adds another 0.06A.
You will need to re-run mc2 in relion and the whole process will likely take a few days but well worth it if you want to push res.

schow · February 1, 2021, 2:14pm

Hi Aaron,

For my reconstructions, my boxsize is 456 (0.55 A/pix in super-resolution mode) but I have binned by 2 and so it is 228 and my map right now is ~3.8A. To try larger box sizes for particles, would you have suggestions for what increments of box sizes I could try?

Thank you

tarek · February 1, 2021, 8:58pm

Hi schow,

456 is a weird box size. For computational reasons I would choose a different number with better divisors. Find inspiration here:
https://guide.cryosparc.com/live/new-live-session-start-to-finish-guide

Estimation of an appropriate box size to sample the full CTF can be made here easily.

Best,
Tarek

user123 · February 5, 2021, 4:26pm

Hi schow, see Oliver’s post about optimal box size here. You can see it does depend on defocus. You can always test, say going up to 512 if your gpu can handle the refinements.
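
As a rough illustration of why the box size depends on defocus, here is a small Python sketch of a commonly used rule of thumb: the signal at resolution d is delocalized by about wavelength * defocus / d, so the box should be at least the particle diameter plus twice that distance. The 300 kV voltage, the 1 um defocus and the function names are assumptions for the example, not values from this thread; the ~100 A particle and 0.832 A/pix are tarek's numbers.

```python
import math


def electron_wavelength_A(kv: float) -> float:
    """Relativistic electron wavelength in Angstrom for a voltage in kV."""
    v = kv * 1e3  # volts
    return 12.2643 / math.sqrt(v + 0.97845e-6 * v * v)


def min_box_A(particle_diameter_A: float, defocus_um: float,
              resolution_A: float, kv: float = 300.0) -> float:
    """Rule-of-thumb box edge: particle diameter + 2 * CTF delocalization."""
    defocus_A = defocus_um * 1e4
    delocalization = electron_wavelength_A(kv) * defocus_A / resolution_A
    return particle_diameter_A + 2.0 * delocalization


def min_box_px(particle_diameter_A: float, defocus_um: float,
               resolution_A: float, pixel_size_A: float,
               kv: float = 300.0) -> int:
    """Same estimate in pixels, rounded up."""
    box_A = min_box_A(particle_diameter_A, defocus_um, resolution_A, kv)
    return math.ceil(box_A / pixel_size_A)


# Assumed example: 100 A particle, 1 um defocus, 2.0 A target, 300 kV
print(min_box_px(100.0, 1.0, 2.0, pixel_size_A=0.832))
```

This is only an estimate; as noted above, testing a couple of sizes is the safer way to decide.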

DanielAsarnow · February 5, 2021, 6:54pm

It sounds like your Nyquist is 2.2A, which is pretty far from your 3.8A nominal resolution. Is the particle close to the edges of the box? Less than 30A from the edge of the circle mask (85% of the box diameter by default)? Unless the bounds are very tight, further classification or focused processing, etc. are probably going to be more useful. (And it will be faster with the smaller box size, though you might change to 512px / 256px which is actually a bit faster).
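
A quick way to check the clearance question is sketched below in Python. The 160 A particle (plus micelle) diameter is a hypothetical placeholder, as is the function name; the 1.1 A/pix pixel size, box sizes and 85% mask default come from the posts above.

```python
def mask_clearance_A(box_px: int, pixel_size_A: float,
                     particle_diameter_A: float,
                     mask_fraction: float = 0.85) -> float:
    """Gap in Angstrom between the particle edge and the circular mask edge."""
    mask_diameter_A = box_px * pixel_size_A * mask_fraction
    return (mask_diameter_A - particle_diameter_A) / 2.0


pixel_size = 1.1            # A/pix after binning 0.55 A/pix by 2
particle_diameter = 160.0   # A, hypothetical value for illustration only

for box in (228, 256):
    gap = mask_clearance_A(box, pixel_size, particle_diameter)
    print(f"box {box} px: {gap:.1f} A between particle and mask edge")
```

If the gap comes out well under 30 A, either a larger box or a wider mask fraction would be the two knobs to try.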

schow · February 5, 2021, 9:04pm

Hi Daniel,

I re-extracted my particles with a 512 pix (bin/2) box size, but the overall resolution and map quality did not change much. My particle is a membrane protein and, come to think of it, the edges of the micelle do indeed come close to (probably within 30A of) the circular mask (using the 85% default). Maybe I should try an even larger box size (~540/560?) or increase the circular mask diameter to 90-95% of the box size and see if there are any improvements. Is this reasonable?

I have tried to reclassify/refine but the resolution usually worsens (probably due to loss of particles/heterogeneity issues)

DanielAsarnow · February 5, 2021, 11:19pm

Using a wider circle mask is easier - give it a shot. In practice, I have not seen a wider box improve resolution for any of my projects, even when my initial box was a bit tight. Unless you’ve collected only at defocus > 1.5 micron, you likely have lots of particles with 500 nm - 1 um defocus. These have less CTF delocalization and better CTF envelopes so probably contribute the most to your high resolution signal anyway.

If you really do think you need a much larger box, remember that you don’t need to do integer binning. I often use 540/360 or 432/288.
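
To make the non-integer binning concrete, here is a short Python sketch of the resulting pixel size and Nyquist limit when a box is extracted at one size and Fourier-cropped to another. It assumes the 0.55 A/pix super-resolution pixel size mentioned earlier in the thread; the function name is illustrative.

```python
def cropped_pixel_size_A(pixel_size_A: float, extract_px: int,
                         crop_px: int) -> float:
    """Pixel size after Fourier-cropping a box from extract_px to crop_px."""
    return pixel_size_A * extract_px / crop_px


raw_pixel = 0.55  # A/pix, super-resolution pixel size from the thread

for extract_px, crop_px in ((540, 360), (432, 288), (512, 256)):
    pix = cropped_pixel_size_A(raw_pixel, extract_px, crop_px)
    print(f"{extract_px}/{crop_px}: binning {extract_px / crop_px:.2f}x, "
          f"{pix:.3f} A/pix, Nyquist {2 * pix:.2f} A")
```

Both 540/360 and 432/288 correspond to 1.5x binning, which keeps a larger physical box than 2x binning at the same cropped size would need.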

Most likely though, you are only limited by heterogeneity and/or alignment accuracy. You may need to combine your non-uniform refinement results with e.g. classification without alignment in cisTEM or Relion, or perform more extensive focused classification in those programs, or apply 3DVA or local refinement in cryoSPARC in order to improve your resolution further. With Nyquist of 2.2 A, you should not need to unbin until you reach better than 3 A.

schow · February 6, 2021, 1:48pm

Tried the wider circle mask - the resolution did not improve but the B-factor reduced from 98 to 94 (after Homogeneous ref and Local ref with NU ref), although I’m not sure whether map quality is improved.

Increasing box size from my previous “weird” choice of 456/228 to 512/256 did lead to resolution improvement by 0.05A and map looks slightly better in a few places. (85% circular mask)

I tried to increase the box size further to 600/300 and the resolution worsened by ~0.3A and map quality was also worse (85% circular mask). Not sure why this happened. (larger box = more noise outside particle and worse SNR?)

To tackle heterogeneity in cryosparc, I have mostly been using a combination of heterogeneous ref (take the particle set and use multiple copies of the same reference map) and Ab initio reconstruction (choose the particle stack with the best features for downstream refinement). Is this reasonable? (Haven’t tried to play with 3DVA as yet)

DanielAsarnow · February 7, 2021, 2:07am

There’s nothing wrong with 456px, it’s just a bit slower than some other nearby sizes like 432 or 512, due to the way the FFT works. Small changes in B-factor and nominal FSC are to be expected. You will get changes of that magnitude just by running the same refinement multiple times - even if you use the same random seed. GPU computations are intrinsically non-deterministic due to floating point errors.
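
The FFT point can be illustrated with a small Python sketch. FFT libraries are generally fastest when the size factors into small primes, so listing the prime factors of a box size gives a rough hint. The cutoff of 7 for the largest allowed prime and the function names are assumptions for illustration, not a cryoSPARC rule.

```python
def prime_factors(n: int) -> list[int]:
    """Prime factorization of n (n >= 2) by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def fft_friendly_sizes(lo: int, hi: int, max_prime: int = 7) -> list[int]:
    """Even box sizes in [lo, hi] whose prime factors are all <= max_prime."""
    return [n for n in range(lo, hi + 1)
            if n % 2 == 0 and max(prime_factors(n)) <= max_prime]


for box in (432, 456, 512):
    print(box, prime_factors(box))
```

Here 456 contains the factor 19, while 432 and 512 factor into 2s and 3s only, which is consistent with 456 being a bit slower rather than wrong.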

I agree that going to too large of a box can influence refinement, especially if there are neighboring particles or a strong background (e.g. high salt).

Sometimes there is no solution other than classification with local search or without alignment, in which case you will have to try something other than cryoSPARC for now. Or perhaps you are now limited by your biochemistry/intrinsic dynamics. If you have > 150,000 particles I’m sure you can get a little further by focused classification, especially if there are some specific regions that are less well resolved.

tarek · February 7, 2021, 8:34am

For more sophisticated guidance we need better information about your sample, FSC curves, map slices etc.
Given your pixel size you should easily achieve better than 3.8A if your sample/data allows.
Just for comparison: I mostly run refinement with bin2 (1.7 A/pix) until resolution reaches Nyquist (3.4A).
If you cannot get there, something else is limiting.

cheers

schow · February 7, 2021, 1:11pm

My sample is a 250kDa membrane protein. Under my biochemical conditions it is quite stable, but it’s not like apoferritin, which readily gives ~Nyquist reconstructions. So I would be (pleasantly) shocked if I get that resolution with my protein.

In any case, I am not particularly short of particles. My refinements thus far, which give ~4A maps, use 500K+ particles and my protein is C2 symmetric. So I am cautiously optimistic that I might be able to push the resolution further. Also, while my overall res is ~4A, the core membrane domain is at around 3.3-3.6A. There are other peripheral domains which are at 5-6A. I have tried using masking in cryosparc (local resolution/particle subtraction) but I almost always get masking artifacts and I’m not sure how to resolve it, but maybe I have not tried hard enough.

Before moving to relion I wanted to see if there are other suggestions on classification in cryosparc but it seems that I might be running low on options on cryosparc at this point. Thank you @DanielAsarnow and @tarek

diffracteD · February 9, 2021, 8:27am

@schow Same here. With C4 imposed, I’m also getting a 4.5A smooth FSC convergence but lots of really strong micelle densities, which I’d like to believe are artifacts due to masking. Not sure how CS is dealing with micelles!

tarek · April 9, 2021, 4:51am

Cryosparc is doing fine with micelles. You can find many published structures of membrane proteins solved with cryosparc, and we have had good experiences too.
If sample/data allows, sub-3A reconstructions should routinely be possible.

schow · April 12, 2021, 5:49pm

@tarek Would you have some suggestions on masking methods? I usually keep the threshold between 0.1 and 0.2 and the near/far parameters either 3/6 or 6/12 (usually dynamic). Anything special you would recommend trying?

tarek · April 15, 2021, 7:42am

@schow I hardly change the default masking parameters during cryosparc refinements. I’m not a big fan of signal subtraction, so I cannot comment on this.
For local refinement masks I choose a contour level in chimera that covers the interesting regions generously, mostly after LP-filtering. The segger tool helps a lot with cleaning up artifacts.
With the CS volume tool, a dilation/softening of 2/2 does well in most cases.