Suspicious of improved resolution post-3D flex

kchilde1 · March 23, 2023, 7:06pm

Greetings,

I’ve been playing with 3D flex and can successfully run through the entire pipeline without issues. My concern is the substantial improvement to my resolution through 3D flex. Prior to 3D flex, my particle plateaued at 4.2 A. After running through 3D flex w/o changing the parameters (300 crop box, 150 training box), I found that my resolution significantly improved to 3.1 A. I have confirmed this by running an FSC Validation job. Yet the map quality does not line up with a 3.1 A map. I would expect to see at a minimum backbone tracing, but this is not the case. In fact, I would argue that the overall map quality is slightly worse. Is this expected?

For reference, my particle is a ~180 kDa complex composed mostly of beta strands and loops (only a couple, very short alpha helices, unfortunately). One of the domains (~20 kDa) is quite flexible, undergoing a 20 degree swinging conformation which we have previously visualized via crystallography. I’ve also performed 3DVA and can see the domain swinging between both conformations. I wanted to try 3D flex as an alternate route to resolve this flexibility and have run into this “improved” resolution issue.

Any explanation for this would be much appreciated. At a minimum, I thought this would serve as a helpful warning to others seeing a similar improvement in resolution post-3D flex.

leetleyang · March 23, 2023, 10:30pm

Out of curiosity, what was the pixel size at the 150p training step?

kchilde1 · March 24, 2023, 7:29pm

Pixel size is 1.576 A.

leetleyang · March 25, 2023, 4:40am

I see.

In all likelihood, your model was trained on a substantial amount of high-frequency noise. Training doesn’t benefit from appropriate filtering in the same way refinement does from gold-standard FSC. It’s perhaps not a coincidence that the Nyquist limit at that pixel size, 1.576 x 2 ≈ 3.15Å, is essentially the 3.1Å you are now seeing.

FWIW. To mitigate overfitting, it’s suggested in the tutorial to pick a training box size (and resultant downsampled pixel size) that gives a Nyquist limit numerically higher than the resolution estimated from refinement. The rationale is that if information beyond that Nyquist frequency is subsequently recovered during reconstruction, it’s more likely to be genuine.

Cheers,
Yang
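For readers following along, the Nyquist arithmetic in the reply above can be checked with a few lines of Python. This is only a sketch: the function and variable names are made up for illustration and are not part of cryoSPARC, and the numbers are the ones quoted in this thread.

```python
def nyquist_limit(pixel_size_angstrom: float) -> float:
    """Return the Nyquist limit in Å, i.e. twice the pixel size."""
    return 2.0 * pixel_size_angstrom


# Values quoted in the thread (names are illustrative only).
training_pixel_size = 1.576   # Å/pixel at the 150-pixel training box
refinement_resolution = 4.2   # Å, rigid refinement before 3D flex

nyquist = nyquist_limit(training_pixel_size)
print(f"Training Nyquist limit: {nyquist:.2f} Å")

# If the Nyquist limit is numerically lower (finer) than the refinement
# resolution, training can fit frequencies that refinement never validated.
trains_beyond_refinement = nyquist < refinement_resolution
print("Training sees beyond the refinement resolution:", trains_beyond_refinement)
```

Here the training Nyquist limit (about 3.15 Å) is finer than the 4.2 Å refinement estimate, which is the situation the tutorial advice is meant to avoid.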

apunjani · March 27, 2023, 5:27pm

Hi @kchilde1, I think that @leetleyang is correct - you do have to be careful to limit training to a resolution that is numerically somewhat above the resolution that you are getting in rigid refinement, so that you can be sure that training does not cause overfitting that affects the FSC after reconstruction. It’s also worth noting that it’s unlikely that training will be using information at 3A to fit the flex model, so there wouldn’t be much downside to limiting training to e.g. 5A+. It would also be faster to train!
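As a rough sketch of how one might pick a training box for a target like the 5 Å mentioned above: the physical extent of the box is fixed by cropping, so a smaller training box means a larger pixel and a coarser Nyquist limit. The helper name below is hypothetical, rounding down to an even box size is an assumption made for convenience, and the extent is derived from the 150-pixel box at 1.576 Å/pixel quoted earlier in the thread.

```python
import math


def max_training_box(extent_angstrom: float, target_nyquist_angstrom: float) -> int:
    """Largest even box size whose Nyquist limit is no finer than the target.

    extent_angstrom: physical width of the (cropped) box in Å.
    target_nyquist_angstrom: coarsest-acceptable training limit in Å.
    """
    min_pixel_size = target_nyquist_angstrom / 2.0   # Nyquist = 2 x pixel size
    box = math.floor(extent_angstrom / min_pixel_size)
    return box - (box % 2)                           # assumption: keep the box even


# Physical extent implied by the thread: 150 pixels at 1.576 Å/pixel.
extent = 150 * 1.576          # about 236.4 Å
target_nyquist = 5.0          # Å, the "5A+" suggestion

box = max_training_box(extent, target_nyquist)
pixel_size = extent / box
resulting_nyquist = 2.0 * pixel_size
print(f"Training box: {box} px, pixel size: {pixel_size:.3f} Å, "
      f"Nyquist: {resulting_nyquist:.2f} Å")
```

Under these assumptions the training box comes out well below the original 150 pixels, which is also consistent with the point that training would be faster.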

kchilde1 · March 31, 2023, 9:19pm

Excellent suggestions from both of you - much appreciated! I will go back to the training module and adjust the parameters as you’ve described.