Using 3D Flex to refine a small, unrestricted domain

KevinDAmico94 · April 9, 2025, 8:51pm

Greetings, I’m looking for some advice on how to optimize my 3D Flex training, if possible.

Backstory:
I have a ~150 kDa protein complex and the main, ~120 kDa region resolved to ~2.7 angstrom. But the hard part to refine has been this accessory domain; it’s only 22 kDa and connected by two very flexible linkers. 3D classification from my initial 1.5 million good particles gets this domain to maybe 8 angstrom. So I’ve been trying to sort out if I could improve this with 3D Flex refinement. For now I am using 10% of my particles, as a test case to speed things up.

Question part:
After making a custom mesh from my segment map (I kept the main domain rigid), I ran flex training with 2 latent dimensions and am still fine-tuning latent centering strength. My current best map still shows a ton of smearing, however. Some questions I have are:

How can I tell if I need to increase the number of latent dimensions?
Should I be worried that my full/rigid validation don’t diverge?
Am I delusional for trying to refine something this small and this flexible to begin with?

Thank you,
Kevin

rwaldo · April 14, 2025, 5:22pm

Hi @KevinDAmico94, welcome to the forum!

Before talking about 3D Flex, I wonder if you could tell me a bit more about your sample and what other analyses you’ve tried.

In your 2.7 Å refinement of the main region, do you see the expected secondary structure for this resolution? E.g., clearly see the main chain and some sidechains?
Do you expect the smaller accessory domain to be internally rigid?
Have you tried a Local Refinement with a mask containing the accessory domain and a small amount of the main region?
Have you tried 3D Variability Analysis? If so, did you see believable motion of the accessory domain?

KevinDAmico94 · April 15, 2025, 3:22pm

Thank you for reaching out! I included some more images to help answer your questions.

The main region is a very solid 2.7 Å. Images A-C show the best map at different contours/zoom. This is 1.4 million particles, using non-uniform refinement. But, the accessory domain is completely missing from those maps. I only see it with homogeneous refinement (D). If I filter the map to 8 Å for 3D classification, I can see it better at a low contour (E). After 3 rounds of focused 3D classes, I can see the accessory domain better but I only have 50k particles left at this point.
The accessory domain should be rigid, but has been crystallized in “open” and “closed” conformations. The difference is subtle. The domain itself is a homodimer, so I’ve tried making segments where the whole domain is one segment and others where I divide the two subunits.
I haven’t tried local refinement for this dataset, because I didn’t have success with that technique on this same protein in the past. I was also worried about size, since it’s 22 kDa. Here, I’ve only tried 3D Classification and 3D Flex.
I did run 3DVA, but did not get the motion I expected. Instead of swinging and twisting on the linker regions, the accessory domain stayed in place and pulsed slightly. The only motion I saw was some very slight “breathing” in the main domain.

Tried to keep it as concise as possible… thank you!

rwaldo · April 15, 2025, 6:00pm

Hi @KevinDAmico94, thanks for those images, that makes it much easier to figure out what might be going on!

Your map in C looks beautiful, nice work! That definitely looks like 2.8 Å, so we can be sure you’ve got some nice particle images.

However, do you notice the “spikey” density in D? You can almost see a border where the normal map density ends, there’s a little gap (I’ve highlighted it here), and then spikes appear.

These types of spikes often appear when the map is overfit. Often it’s harder to see these in Non-Uniform Refinements, so it makes sense they’re only really visible in D.

Typically, overfitting is caused by two things:

A mask that’s too tight
Junk particles still in the particle stack

If your mask is too tight, your corrected GSFSC curve will never rejoin the tight curve. Do your GSFSC curves look like the ones in that guide page? If so, you may want to make your own mask (we have guidance on that here).
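
To put a rough number on how far the tight and corrected curves have diverged, one option is to compare where each crosses the 0.143 threshold. Below is a minimal sketch in plain Python. The function name and the example curve values are invented for illustration; they are not exported from any job, and this is not how CryoSPARC computes its plots.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (in Å) where an FSC curve first drops below threshold.

    freqs: spatial frequencies in 1/Å, in ascending order
    fsc:   FSC values at those frequencies
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells bracketing the crossing
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


# Made-up example curves (assumption: four shells only, for readability)
freqs = [0.10, 0.20, 0.30, 0.40]
tight = [1.00, 0.95, 0.343, 0.043]
corrected = [1.00, 0.90, 0.243, 0.043]

res_tight = resolution_at_threshold(freqs, tight)
res_corrected = resolution_at_threshold(freqs, corrected)
print(f"tight: {res_tight:.2f} Å, corrected: {res_corrected:.2f} Å, "
      f"gap: {res_corrected - res_tight:.2f} Å")
```

A gap that keeps growing as the mask gets tighter would be consistent with the overfitting described above, though looking at the full curves is still more informative than a single number.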

As for junk particles, you might want to try doing some more 3D curation of your particle stack. Since the main part of your map looks so good your poses are likely high quality, so 3D Classification may work well. You could also try iterated Ab-Initio Reconstruction and Heterogeneous Refinement. We cover both of these approaches in the TRPV1 Case Study (there’s also a link to a workshop recording there if you prefer video).

3D Flex is very sensitive to junk particles, so you definitely want to get your particle stack as clean as you possibly can before moving on. Once the stack is clean, I might try a Local Refinement with all of your cleaned particles and a mask just around the accessory domain and a bit of the main region (perhaps something like below). This will encourage the alignment algorithm to focus on this flexible region – hopefully you’ll see the main body blur out and the accessory domain improve. We have examples of this in that same TRPV1 case study (see the CTD section) and the dedicated Local Refinement case study.

I know that’s a lot of information, I hope it’s helpful! Please reach out again when you have more questions!

KevinDAmico94 · April 17, 2025, 7:21pm

Thank you for all of the detailed suggestions! After reading your comments, I revisited my GSFSC curves and noticed that my corrected and tight curves diverge like in the page you linked to (at least, with respect to the auto-tightened curve). To try and correct this, I have:

used iterative ab-initio reconstruction (3 classes) and heterogeneous refinement to remove junk. Round 1 gave a high-res class, a low-res class, and a junk class. I kept the high-res class (~950k of 1.4M particles, or 68%). Round 2 gave me 1 good class (830k, 87%) and two junk classes. Round 3 gave me three classes that were all my complex but at reduced resolution, so I stopped there and stuck with my 830k stack. This is probably the most detailed map I’ve seen, at 2.8 Å.
followed the tutorials to make iteratively bigger masks. I’m not sure I’m having success with this yet. In the attached picture I’m including the same particle stack refined with an auto-generated mask, and then masks dilated 2-15 pixels (I tried numbers in between; these were the extremes). The “spikes” might be getting smaller, but my tight/corrected gap is actually increasing on the auto-tightened GSFSC (lower plots).
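
For anyone following the particle counts, the per-round retention quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the counts are the rounded figures from this post and the function name is made up.

```python
def retention(kept, total):
    """Percentage of particles kept in one curation round."""
    return 100.0 * kept / total


# (label, particles going in, particles kept), rounded counts from the post
rounds = [
    ("round 1", 1_400_000, 950_000),
    ("round 2", 950_000, 830_000),
]

for name, before, after in rounds:
    print(f"{name}: kept {after:,} of {before:,} ({retention(after, before):.0f}%)")

overall = retention(830_000, 1_400_000)
print(f"overall: {overall:.0f}% of the starting stack")
```

With these rounded counts, the two rounds together keep roughly 59% of the original 1.4M particles.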

It seems like I can’t reduce my particle stack further without losing resolution, and I’m not sure increasing the mask size has had the intended effect. I’ve tried dilating the auto-generated mask and making my own from scratch with Chimera, but the results have been the same.

rwaldo · April 17, 2025, 9:38pm

Hi @KevinDAmico94, glad that was helpful! Your iterative curation procedure sounds spot on, and glad to hear your map is looking even better.

These curves don’t look too bad to me – I’d stick with either the auto-generated mask or your dilated 2 padded 12 mask (in general, I’d only dilate a few pixels and then use generous padding rather than increase the dilation).
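
To illustrate the difference between dilation and padding, here is a small sketch of a mask's edge profile as a function of distance from the thresholded map surface. It assumes a raised-cosine falloff across the padding width, which is a common choice for soft masks but is an assumption here; CryoSPARC's actual mask tools may shape the edge differently, and the function name is hypothetical.

```python
import math


def soft_mask_weight(dist_px, dilation_px, padding_px):
    """Mask value for a voxel dist_px outside the thresholded map surface.

    Inside the dilated region the mask is 1, beyond dilation + padding it is 0,
    and in between it falls off as a raised cosine (assumed edge shape).
    """
    if dist_px <= dilation_px:
        return 1.0
    if dist_px >= dilation_px + padding_px:
        return 0.0
    frac = (dist_px - dilation_px) / padding_px
    return 0.5 * (1.0 + math.cos(math.pi * frac))


# Edge profile for a "dilated 2, padded 12" mask
for d in range(0, 16, 2):
    print(f"{d:2d} px from surface -> weight {soft_mask_weight(d, 2, 12):.2f}")
```

In this picture, a small dilation keeps the hard part of the mask close to the density, while generous padding gives a wide, gentle transition to zero instead of a sharp edge.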

The spikes are definitely still visible, but they look a bit better – you might want to try other curation jobs like 3D Classification, or perhaps Heterogeneous Refinement with two copies of your current map. Or you can call this good enough for now.

Now, as for your accessory domain: 22 kDa is going to be pretty tough no matter how you slice it. I might try a 3D Variability Analysis (3DVA) job with a very coarse filter resolution, maybe something like 16 Å to start. You’ll need to create another mask here that covers your entire main domain and also covers the entire area you expect the accessory domain to move around in (not just where you see it in the consensus refinement).

If you see the accessory domain moving around in 3DVA, that’s good news! You could try using intermediates or cluster mode of 3DVA Display to pull out groups of particles with the accessory domain in the same place and performing Local Refinement of those. As I said, though, such a small domain is going to be somewhat challenging.
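
The general idea of pulling out groups of particles along a variability component can be sketched as below. This only illustrates grouping by equal-width windows along one coordinate; it is not how 3DVA Display is implemented, and the function name and coordinate values are hypothetical.

```python
def bin_by_component(coords, n_bins):
    """Group particle indices into n_bins equal-width windows along one component."""
    lo, hi = min(coords), max(coords)
    # Fall back to a width of 1.0 if every particle has the same coordinate
    width = (hi - lo) / n_bins or 1.0
    groups = [[] for _ in range(n_bins)]
    for idx, c in enumerate(coords):
        # Clamp so the particle at the maximum lands in the last window
        b = min(int((c - lo) / width), n_bins - 1)
        groups[b].append(idx)
    return groups


# Invented per-particle coordinates along a single component
coords = [-2.0, -1.5, 0.1, 0.2, 1.9, 2.0]
for i, members in enumerate(bin_by_component(coords, 3)):
    print(f"window {i}: particles {members}")
```

Each window then holds particles whose accessory domain is, ideally, in a similar position, which is what makes a follow-up Local Refinement per group worth trying.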

If you want more information about 3DVA, we covered the theory and practice in another workshop recording.

Edit to add: you could also of course try the Local Refinement I mentioned in my previous post!