Optimizing 3D-flex parameters


Based on the tutorial, it seems that several parameters frequently need to be optimized during 3D-Flex training. Apart from the mesh, there is the number of latent dimensions, the rigidity (lambda), the latent recentering strength, and the noise injection std. dev.

With large datasets, each 3D-flex training run is computationally quite expensive, so I am wondering if there is any advice on how to optimize these parameters in an efficient manner? Can these parameters be reliably optimized using subsets of data, or do users need to do a full run with all particles for each parameter choice?

Secondly, is there any order in which these parameters should be optimized, and are they inter-related - can they be screened as independent variables, or should users try a grid search for each pair of parameters?

And regarding the number of latent dimensions, can this reasonably be approximated as the number of modes in 3D-VA that show believable/significant conformational differences?

One more thing - in the paper, the number of MLP layers was a parameter that was tweaked for the TRPV1 case (reduced from 6 to 3), but I do not see a way of altering this in the GUI… is there any way to change this?

Finally, for membrane proteins embedded in the membrane, is it recommended to minimize the volume of micelle/nanodisc covered by the mesh? I imagine this will have kind of stochastic compositional variability which may cause issues…?

Apologies for the barrage of questions! We are excited by the potential of 3D-Flex for resolving complex conformational landscapes, and are trying to figure out how to get the most out of it :)



Hi @olibclarke! This is a great question, and we can’t wait to see what you’re able to do with 3D Flex. Here are our recommendations for training as of today:


Subsets should work for tuning the parameters, with the following constraints:

  • The subset particles have the same box and pixel sizes as the data set that will eventually be used to train the “full” model.
  • The subset captures all of the modes of variability that will be encountered in the full dataset.

With those constraints, it ought to be possible to tune just about everything (including mesh topology) with a smaller particle stack.
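As a minimal sketch of drawing such a subset, assuming the particle stack can be addressed by integer index: the function name and the sizes below are hypothetical, and this is plain NumPy rather than any CryoSPARC API. A uniform random draw from the full stack leaves box and pixel size unchanged, but it does not by itself guarantee that rare conformations are represented, so the second constraint still needs to be checked by eye.

```python
import numpy as np

def random_subset_indices(n_particles, n_subset, seed=0):
    """Return sorted indices of a uniform random subset, drawn without replacement."""
    if not 0 < n_subset <= n_particles:
        raise ValueError("n_subset must be between 1 and n_particles")
    # A fixed seed lets the same subset be reused for every parameter trial
    rng = np.random.default_rng(seed)
    idx = rng.choice(n_particles, size=n_subset, replace=False)
    return np.sort(idx)

# Hypothetical sizes: tune on 50k particles drawn from a 500k-particle stack
subset = random_subset_indices(500_000, 50_000)
```

Reusing one fixed subset across all tuning runs keeps the comparison between parameter choices from being confounded by which particles happened to be drawn.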

Parameter optimization

While there is not a specific order in which you must tune things, I do have a few recommendations/tips/comments for each parameter:

  • Most of your time will be spent tuning the rigidity, both of your mesh and of the training job. Unfortunately, this is an empirical problem (everyone’s favorite). As I’m sure you already know, if you’re not seeing enough motion, rigidity needs to come down. If you’re seeing non-physical or noisy motion, rigidity needs to go up.
  • If the latent space does not appear smooth, you can increase noise injection to smooth things out. It is generally desirable to have a smooth latent space, but as particles become larger, a more “jagged” latent space becomes more and more necessary to capture the true motion of the particle.
  • Hidden units can generally be left alone, unless significant noise is observed that is not corrected with more rigidity (or a mesh with different rigidity/topology).
  • Centering strength is independent of the other parameters. It can be tuned completely separately until the particles fully occupy the [-1.5, 1.5] space in all dimensions.
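To make the centering check concrete, here is a rough sketch that summarizes how much of the [-1.5, 1.5] range each latent dimension occupies. It assumes the per-particle latent coordinates have already been exported to an (N, K) NumPy array; the function name, the export step, and the synthetic data are all hypothetical and not part of CryoSPARC.

```python
import numpy as np

def latent_occupancy(latents, bound=1.5):
    """Summarize, per latent dimension, how much of [-bound, bound] the particles span."""
    latents = np.asarray(latents, dtype=float)
    lo = latents.min(axis=0)
    hi = latents.max(axis=0)
    # Clip to the target interval so points outside it do not inflate the coverage
    span = np.clip(hi, -bound, bound) - np.clip(lo, -bound, bound)
    return {
        "min": lo,
        "max": hi,
        "coverage": span / (2 * bound),  # 1.0 means the full interval is spanned
    }

# Synthetic stand-in for exported latents: 10k particles, 2 latent dimensions
rng = np.random.default_rng(0)
demo = rng.uniform(-0.5, 0.5, size=(10_000, 2))
report = latent_occupancy(demo)
```

In this synthetic case the coverage comes out at roughly one third in each dimension, which would suggest the centering strength still needs adjusting. Because minimum and maximum are sensitive to a handful of outliers, percentiles may be a more robust choice on real data.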


You’re right that ultimately, the same number of dimensions is needed for 3DVA and 3D Flex to capture all the degrees of freedom of your particle. However, we still recommend that you start training with two dimensions. 3D Flex is very sensitive to heterogeneity and partial occupancy, and it’s nice to notice these and other problems in the much cheaper 2D training job rather than in an N-D job.

In essence, while theoretically the same number of dimensions is required, we still recommend determining how many dimensions 3D Flex needs empirically rather than jumping straight to the number you discerned using 3DVA.


While we’re on the topic of 3DVA, we generally recommend initializing the latent space with all particles at 0 rather than using the 3DVA coordinates. This acts as an “independent” confirmation that your heterogeneity is present. One exception is particularly small particles, which may benefit from 3DVA initialization.

MLP Layers

We have found that tuning the layers can do more harm than good, so there’s not currently a way of altering this parameter.


You’re right that entirely masking out the micelle can cause some issues. We recommend including the micelle/nanodisc/what have you with the mesh, but setting those regions as rigid. This way the model will still explain these parts of your map, but not try to move their density around.

Do note that if the micelle is set rigid, any segments it is fused to will have restricted motion at those interfaces.

I hope that’s all helpful! Please let me know if you have any more questions!