Avoiding over-refinement in Local Refine New

PVK · April 10, 2021, 6:00pm

Dear all,

I am using the new local refine on a protein with rotational symmetry. I tried two approaches: entering the symmetry group directly in the job input, and symmetry expansion followed by refinement of the symmetric density in C1. For info, inputting the symmetry group with the original subtracted particles looks better.

The improvement in resolution is really nice after subtracting a large but heterogeneous part of the assembly. The refinement looks very good at iterations 4-5, where I can reliably build the sequence into the map (resolution 3.4-3.5 Å, fold is fairly conserved). However, the refinement goes on for many more iterations, and although the FSC does not look bad at each round, the sharpened map looks fuzzier and fuzzier and some weird protrusions start to appear in the density. So I am wondering: is it a better strategy to use the results from the intermediate iterations, and if so, how do I get the corrected FSC curve? Or should I just sharpen the final map at a more conservative resolution where I can trace the residues well? In the latter case, how do I get a more reliable FSC curve, or at least justify the resolution cut-off in a more objective manner?

Thanks in advance for any insights,

Petya

mmclean · April 12, 2021, 10:38pm

Hi @PVK,

It’s definitely a valid strategy to use intermediate iteration results, especially if you suspect that the later iterations are over-refining. Unfortunately, despite their practical uses, masks can cause problems like this in refinement / subtraction, because they break the independence assumption between the two half-maps. The best way to overcome this is to use static masking, and to make sure the mask has a very soft edge, e.g. the “Soft padding width” parameter in the Volume Tools job is as large as reasonably possible. The softer the mask, the lower the resolution at which the info in each half-map becomes coupled. Also, regarding symmetry: with a symmetry-expanded stack, be careful not to re-enter the symmetry parameter in the refinement (it should always be C1, and the “Force re-do GS split” parameter should always be off).
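To make “soft edge” concrete, here is a minimal numpy sketch (not cryoSPARC code) that builds a spherical mask with a raised-cosine falloff. The function name `soft_spherical_mask`, the spherical shape and the cosine profile are assumptions for illustration only; the Volume Tools job may shape its soft padding differently.

```python
import numpy as np


def soft_spherical_mask(box, radius, soft_width):
    """Cubic mask of side `box`: 1 inside `radius` (voxels), then a
    raised-cosine falloff to 0 over `soft_width` voxels, 0 beyond that.

    Hypothetical helper for illustration; not a cryoSPARC function.
    """
    centre = (box - 1) / 2.0
    z, y, x = np.indices((box, box, box))
    r = np.sqrt((x - centre) ** 2 + (y - centre) ** 2 + (z - centre) ** 2)

    if soft_width <= 0:
        # Hard-edged (binary) mask.
        return (r <= radius).astype(np.float32)

    # t is 0 at the hard edge and 1 at the outer end of the soft region.
    t = np.clip((r - radius) / soft_width, 0.0, 1.0)
    return (0.5 * (1.0 + np.cos(np.pi * t))).astype(np.float32)
```

Comparing `soft_spherical_mask(64, 20, 0)` with `soft_spherical_mask(64, 20, 12)` shows the difference: the first jumps from 1 to 0 in a single voxel, the second ramps down over 12 voxels outside the same core.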

If you want to use intermediate iteration results, you can re-import each half-map (at whichever iteration you like, from the job directory) using Import Volumes, and then re-import the mask you want to use for FSC computation too. Then you can run a “Validation (FSC)” job using both half-maps and the mask: this will give you all the FSC curves usually computed in a refinement, including the corrected one. You can also use the “Sharpening tools” job to sharpen the map to any B-factor.
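If you would like to sanity-check the numbers outside cryoSPARC, the masked FSC between two half-maps can be sketched in a few lines of numpy. This is only a rough illustration under stated assumptions: the maps and mask are already loaded as cubic arrays of the same size (for example with a package such as mrcfile), the function names `masked_fsc` and `resolution_at_threshold` are made up here, and the noise-substitution step behind the corrected curve is not reproduced, so this matches the plain masked FSC at best.

```python
import numpy as np


def masked_fsc(half_a, half_b, mask):
    """Fourier shell correlation between two masked cubic half-maps.

    Returns one value per integer-radius shell, from shell 0 (DC)
    up to shell box // 2 (Nyquist). Hypothetical helper, not cryoSPARC code.
    """
    n = half_a.shape[0]
    fa = np.fft.fftn(half_a * mask)
    fb = np.fft.fftn(half_b * mask)

    # Integer shell index (in Fourier pixels) for every Fourier voxel.
    freq = np.fft.fftfreq(n) * n
    kz, ky, kx = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int).ravel()

    n_shells = n // 2 + 1
    keep = shell < n_shells  # drop the corners beyond Nyquist

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep],
                           minlength=n_shells)

    cross = shell_sum((fa * np.conj(fb)).real)
    power_a = shell_sum(np.abs(fa) ** 2)
    power_b = shell_sum(np.abs(fb) ** 2)

    denom = np.sqrt(power_a * power_b)
    return np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)


def resolution_at_threshold(fsc, box, pixel_size, threshold=0.143):
    """Resolution (in the units of pixel_size) of the first shell whose FSC
    falls below `threshold`; no interpolation between shells."""
    below = np.nonzero(np.asarray(fsc)[1:] < threshold)[0]
    if below.size == 0:
        return 2.0 * pixel_size  # never drops below threshold: report Nyquist
    first_shell = below[0] + 1
    return box * pixel_size / first_shell
```

Running this on the half-maps from iteration 4 or 5 and again on the final iteration, with the same mask, gives a like-for-like comparison of where the curve drops.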

Best,
Michael

PVK · April 13, 2021, 5:36am

Hi Michael,

Thanks a lot, this will be very useful.

So I checked my jobs, and in the one that ran best the mask is static. I generated it with Chimera molmap from an initial PDB model (a homology model refined against the starting NU-refined volume at backbone-ish resolution). I imported the volume and converted it to a mask with a low threshold, so it was already quite “swollen”. However, I noticed I only padded at 7, so I will try increasing this to 10 or even 12. For the symmetry expansion, I have made that mistake before, so indeed, the expanded stack is at C1 and, luckily :), the “Force re-do GS split” was OFF (default). However, I was wondering: the new local refinement allows for quite large shifts, if I am interpreting the “Magnitude of alignment shift” plots correctly, so is it possible that between iterations the same protomers shift between symmetry-related positions and are hence considered more than once, as duplicate particles? Or I guess this is taken into account with the introduction of the symmetry input in the new local refine…

Another question on soft masking: I had better results when doing particle subtraction first. For the subtraction mask, I used my initial PDB-molmapped volume, thresholded even lower to inflate it further, with “invert mask” enabled in the mask generation and a very low soft padding (2), as I noticed that the padding extends towards the volume of interest that should be preserved. Is it OK to use a larger padding but with a negative value (-10, let’s say) for the subtraction mask, so that it pads outwards from the remaining volume of interest?

Finally, thanks for the detailed info on using intermediate results, it’s exactly what I was looking for to get the best map to build in a better starting model and redo.

cheers,

petya

PVK · April 14, 2021, 11:32am

Hi Michael, a couple more questions:

For getting the corrected FSC curve from intermediate refinement iterations, which mask do I need from that iteration’s outputs?
Also, what is the best way to get a filtered map based on a local resolution estimate rather than the global one? I noticed that there is no map_filtered output in local refinement.

thanks in advance,

petya

mmclean · April 14, 2021, 3:57pm

Hi @PVK,

It sounds like all your masking and symmetry related parameters are ideal. The “Magnitude of alignment shifts/poses” plot actually shows the overall deviation from the initial shifts/poses that were input to the refinement. It is technically possible that with a symmetry expanded stack, one particle can rotate enough that it is superimposed over a duplicate particle when averaging. But in reality that could only happen with symmetry groups of high order, like I, O, or a high order D or C (the pose deviation would have to exceed the angular magnitude of one of the symmetry rotations). In any case, duplicate particles within the same half-set are unlikely to cause overfitting in the same way that they would if they were in different half-sets.
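For a feel of the numbers, the smallest non-identity rotation in each point group can be worked out directly. The snippet below is a small standalone sketch (the function name `smallest_rotation_deg` is made up for this post), using the standard facts that Cn and Dn contain a rotation of 360/n degrees about the main axis, Dn adds 180 degree rotations about perpendicular 2-fold axes, and T, O and I have smallest rotations of 120, 90 and 72 degrees respectively.

```python
def smallest_rotation_deg(group):
    """Smallest non-identity rotation angle (degrees) in a point group.

    Accepts 'Cn', 'Dn', 'T', 'O' or 'I' (case-insensitive).
    Hypothetical helper for illustration; not a cryoSPARC function.
    """
    g = group.strip().upper()

    # Tetrahedral: 3-folds; octahedral: 4-folds; icosahedral: 5-folds.
    fixed = {"T": 120.0, "O": 90.0, "I": 72.0}
    if g in fixed:
        return fixed[g]

    if g[:1] in ("C", "D") and g[1:].isdigit():
        n = int(g[1:])
        if g[0] == "C":
            if n < 2:
                raise ValueError("C1 has no non-identity rotation")
            return 360.0 / n
        if n < 1:
            raise ValueError("Dn requires n >= 1")
        # Dn: n-fold main axis plus perpendicular 2-fold axes (180 degrees).
        return min(360.0 / n, 180.0)

    raise ValueError(f"unrecognised point group: {group!r}")
```

For example, `smallest_rotation_deg("C3")` gives 120.0 and `smallest_rotation_deg("C12")` gives 30.0, which can be compared against the angular deviations shown in the pose plot.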

For your question on inverting the mask: yes, I think that when the mask is inverted, the padding happens after the inversion, so the behaviour you see is expected. Unfortunately, I’m not sure you will get the result you expect with a negative dilation/pad value, as this hasn’t been tested. As a workaround, you could take the mask into Chimera, use vop subtract to subtract the mask from a map of all 1’s, and then re-import it.
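The same “ones minus mask” operation can also be sketched in numpy, if that is more convenient than Chimera. This assumes the soft mask is already loaded as an array with values between 0 and 1; `invert_mask` is a made-up name, and reading and writing the .mrc files (for example with the mrcfile package) is left out.

```python
import numpy as np


def invert_mask(mask):
    """Return 1 - mask as float32, clipped to the range [0, 1].

    Assumes `mask` holds values between 0 and 1 (a soft mask).
    Hypothetical helper for illustration; not a cryoSPARC function.
    """
    m = np.asarray(mask, dtype=np.float32)
    return np.clip(1.0 - m, 0.0, 1.0)
```

Because the inversion is applied after the soft edge already exists, the ramp stays in exactly the same voxels as in the original mask, only running from 0 to 1 instead of 1 to 0.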

With regards to the corrected FSC curve: if you want to use the same mask used during FSC computation in the refinement, you’ll have to use the mask_fsc output from that intermediate iteration’s results, which should be locatable from the volume output group.

The best way to get a filtered map from a local resolution estimation output is to run a “Local filtering” job with the output of a local resolution estimation job. The local resolution/filtering subroutine is only used in the legacy non-uniform refinement job and not in any other refinements currently.

Hope this helps,
Michael

PVK · April 17, 2021, 8:53pm

Hi Michael,

Thanks again for the fast and informative reply.

Petya