Symmetry expansion workflow for 3D classification

I’m working with a homodimeric enzyme that has a global C2 symmetry, but I suspect that there may be symmetry breaking features at the two active sites. Consequently, I’ve been experimenting with different ways to resolve that asymmetry using symmetry expansion followed by 3D classification without alignment, but I have some questions on the proper way to do this.

My current workflow is as follows: 1) align consensus refinement to the symmetry axes, 2) symmetry expansion, and 3) 3D classification on the expanded particles.

Questions:
1. Is the number of classes for 3D classification related to the degree of the symmetry expansion? For example, I’d imagine that with C2-expanded particles sorted into two classes, both classes would be identical and related by a 180 degree rotation about the symmetry axis. In other words, instead of classifying the particles into conformational states, 3D classification effectively just undoes the symmetry expansion. Based on my experiments, this appears to be the case. So, for a Cn symmetry expansion, should the number of classes always be greater than n (or some multiple of n)?

2. Building on the previous question, if I took the particles from just one of those two classes and then did a further refinement, would this workflow effectively be the same as doing a refinement on the original, unexpanded particle set with C2 symmetry relaxation? Are there any advantages/disadvantages for either method?

3. What other workflows are possible after the symmetry expansion step (besides 3D classification and 3DVA)? If I were to immediately do another refinement (with C1 symmetry), it seems like this would effectively be the same as doing a refinement with symmetry enforced on the original, unexpanded particle set.

Hi @cbeck! These are some great questions about symmetry! I’ll try to help answer these, and others in the community might have more to add too.

To keep the writing clear, I’ll refer to your enzyme as having subunits “A” and “B” (the two monomers which make up your C2 dimer) and your alignment as having positions “left” and “right” (referring to whether that subunit is aligned to one symmetry-related position or the other; left and right have no biological/physical meaning). A and B are identical, but it can be useful to be able to refer to the identity of a subunit independently from its aligned pose.

On to your questions:

Is the number of classes for 3D classification related to the degree of the symmetry expansion?

In general, I agree with your reasoning here, but have a few clarifying comments.

Does classification undo symmetry expansion?

First, if you perform a Cn symmetry expansion and then classify with N classes, it is not quite the same as undoing the symmetry expansion. Consider your case of a C2 particle. Your particles are likely in some kind of mix of A/B and B/A in left/right positions.

If you perform symmetry expansion, you will have equal numbers of A/B and B/A particles (since each image is copied and rotated to the other positions). If you then classify into two classes without alignment, a perfect classification would result in two classes: one which is 100% A/B and the other which is 100% B/A. Picking one of these two classes would result in a stack which has each particle only once, in the same orientation (A/B or B/A) as the other particles in that stack.

So in some sense, we have undone the symmetry expansion: each particle is again in the particle stack only once. However, some of the particles’ poses have flipped so that all poses are consistent. Again, this assumes that classification is perfect!
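To make the bookkeeping above concrete, here is a toy Python sketch. It does not use CryoSPARC at all: the `uid` and `pose` fields, and the functions `c2_expand` and `perfect_classify`, are made up for illustration, and the "classification" is idealized (it simply reads the hidden orientation label, which a real job can only do if some asymmetric feature distinguishes the two orientations).

```python
from collections import Counter

# Toy particle stack. "pose" records which subunit sits in the
# (left, right) position; in a real dataset this is unknown.
particles = [
    {"uid": 0, "pose": ("A", "B")},
    {"uid": 1, "pose": ("B", "A")},
    {"uid": 2, "pose": ("B", "A")},
    {"uid": 3, "pose": ("A", "B")},
    {"uid": 4, "pose": ("B", "A")},
]


def c2_expand(stack):
    """Return every particle twice: as aligned, and rotated by 180 degrees."""
    expanded = []
    for p in stack:
        left, right = p["pose"]
        expanded.append({"uid": p["uid"], "pose": (left, right)})
        expanded.append({"uid": p["uid"], "pose": (right, left)})
    return expanded


def perfect_classify(stack):
    """Idealized two-class classification without alignment: sort by pose."""
    classes = {("A", "B"): [], ("B", "A"): []}
    for p in stack:
        classes[p["pose"]].append(p)
    return classes


expanded = c2_expand(particles)
classes = perfect_classify(expanded)
ab_class = classes[("A", "B")]

# Each original particle appears exactly once in the chosen class ...
uid_counts = Counter(p["uid"] for p in ab_class)
# ... and every copy in that class has the same orientation.
poses_in_class = {p["pose"] for p in ab_class}
```

Under these assumptions, picking either class gives back a stack the same size as the original, but with every particle in a consistent orientation.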

Should the number of classes always be greater than N?

This depends on what you want to do! In the case I outlined above, essentially using 3D Classification to put all particles in the same orientation, you only want there to be N choices for each particle (it’s either in the A/B class or the B/A class).

If you’re classifying on something more complex you will need more classes because of the symmetry, but this is true even without symmetry expansion.

Consider again your C2 case. Say your subunit A has a 50% chance of being a different conformation (let’s call it A’) than your subunit B. Without symmetry expansion:

  • some unknown fraction of your particles are oriented A/B, and the rest are oriented B/A
  • 50% of A is in a different conformation

Therefore, three classes are needed to capture the variation in your particle stack:

  1. A/B (since A and B are identical, this class also captures B/A)
  2. A’/B
  3. B/A’

If you perform C2 symmetry expansion before classifying, you know that all of your particles are in both A/B and B/A orientations (put another way, you know the exact fraction of particles from the first bullet point: 50%). Again, you know that 50% of A are in conformation A’. You can see that the same three classes are needed — you would just expect that more particles are in each of the classes.

For this reason, we often recommend a slightly modified workflow from what you’re suggesting:

  1. Perform symmetry expansion
  2. Mask out a single subunit
  3. Classify just one subunit position. Because of symmetry expansion, you are actually classifying all subunits.

Using this technique, you can use only two classes:

  1. A (since A and B are identical, this class also captures B)
  2. A’

This is only a moderate savings for the C2 case, but consider a C4 case with A/B/C/D, and again with 50% of A being A’. You would need 5 classes in the un-expanded case:

  1. A/B/C/D (which captures all four orientations that do not have A’)
  2. A’/B/C/D
  3. B/C/D/A’
  4. C/D/A’/B
  5. D/A’/B/C

and still only two in the expanded case with a single subunit masked out:

  1. A (captures B, C, and D as well)
  2. A’

I know that’s a lot of alphabet soup, but I hope I’ve made it clear that in some cases, symmetry expansion can actually reduce the number of classes you need. This is in fact one of its more useful applications. You get to classify on specific features of the protein without having to worry about which of the N equivalent positions that particular subunit got aligned to in the early steps of processing.
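If it helps, the class counts above can be checked with a few lines of Python. This is only a counting sketch under the same simplification used in this post (at most one subunit per particle is in the A’ state, and all other subunits are identical); the function names are made up for illustration.

```python
def distinct_rotations(arrangement):
    """All distinct cyclic rotations of a tuple of subunit states."""
    n = len(arrangement)
    return {arrangement[k:] + arrangement[:k] for k in range(n)}


def classes_without_masking(n):
    """Classes needed when whole Cn particles are classified as they are.

    Assumes each particle has either no subunit or exactly one subunit
    in the A' state.
    """
    all_ground = ("A",) * n
    one_changed = ("A'",) + ("A",) * (n - 1)
    return len(distinct_rotations(all_ground)) + len(distinct_rotations(one_changed))


def classes_with_single_subunit_mask():
    """Classes needed after expansion with one subunit position masked."""
    return len({"A", "A'"})


c2_classes = classes_without_masking(2)
c4_classes = classes_without_masking(4)
masked_classes = classes_with_single_subunit_mask()
```

Here `c2_classes` and `c4_classes` reproduce the three and five classes listed above, while the masked, expanded case stays at two regardless of n.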

After performing these analyses, you can come back through and count the number of subunits in each class, per particle, if you wish. An example of this workflow is in the symm_expand_filter.ipynb notebook in the cryoem-uoft/cryosparc-examples repository on GitHub, if you’re interested.
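The counting step itself is simple bookkeeping. Below is a minimal Python sketch of the idea, not the code from that notebook: the rows are invented, and `src_uid` (the original particle a copy came from) and `sym_idx` (which symmetry copy it is) are hypothetical stand-ins for whatever fields your exported particle metadata actually uses.

```python
from collections import defaultdict

# One row per symmetry-expanded copy: (src_uid, sym_idx, class_label).
# All values are invented for illustration.
expanded_labels = [
    (101, 0, "A"), (101, 1, "A'"), (101, 2, "A"), (101, 3, "A"),
    (102, 0, "A"), (102, 1, "A"), (102, 2, "A"), (102, 3, "A"),
    (103, 0, "A'"), (103, 1, "A"), (103, 2, "A'"), (103, 3, "A"),
]


def count_state_per_particle(rows, state="A'"):
    """Count how many expanded copies of each original particle got `state`."""
    counts = defaultdict(int)
    for src_uid, _sym_idx, label in rows:
        counts[src_uid] += int(label == state)
    return dict(counts)


a_prime_counts = count_state_per_particle(expanded_labels)
```

With a table like `a_prime_counts` you could then select, for example, only the original particles with exactly one subunit in the A’ state.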

Is refining just one of the two classes the same as enforcing C2 relaxation?

In essence, yes, refining a class of particles which are all in the same orientation will be the same as relaxation. You’d want to do a Local Refinement here, so that no particles flip back around to the wrong orientation. Also, since classifications are not always perfect, you should probably run a Remove Duplicates job to make sure that no particle ended up in the same class twice.
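As a rough illustration of why the duplicate check matters, here is a toy Python sketch that keeps only one copy per original particle within a class. It is not how the Remove Duplicates job works internally; it is a stand-in that assumes each expanded copy carries a hypothetical `src_uid` field naming the original particle it came from.

```python
def drop_repeated_sources(class_rows):
    """Keep the first copy of each original particle, in input order."""
    seen = set()
    kept = []
    for row in class_rows:
        if row["src_uid"] not in seen:
            seen.add(row["src_uid"])
            kept.append(row)
    return kept


# Imperfect classification: both copies of particle 7 landed in one class.
one_class = [
    {"src_uid": 7, "sym_idx": 0},
    {"src_uid": 8, "sym_idx": 1},
    {"src_uid": 7, "sym_idx": 1},
    {"src_uid": 9, "sym_idx": 0},
]

deduplicated = drop_repeated_sources(one_class)
```

Keeping the first copy is an arbitrary choice made for the sketch; in practice you may prefer to keep whichever copy the classification was most confident about.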

What other workflows benefit from symmetry expansion?

You may already know this, but you should never perform a global refinement (such as Homogeneous or Non-Uniform Refinements) with symmetry expanded particles, since this may result in duplicate particles and corrupt your GSFSC.

If you perform a local refinement of symmetry expanded particles, it is similar to but not the same as performing a global refinement enforcing C2 symmetry. This is because when you enforce C2 symmetry, you force the map to be symmetric. When you perform a refinement, typically one of the two subunits will align better than the other. This is especially true if the subunits are flexible. In this case, half of your map would be of higher quality than the other, since the particles are all aligning there. This results in a map with a higher-quality A and a lower-quality B.

A common symmetry expansion workflow is to perform the masked classification I describe above to get a population of images for each possible conformation/composition of the monomer, then perform local refinements of those particles using the same single-subunit mask. This gives the highest-quality map you can get for each of (say) A and A’.


That’s a lot of information, but I hope it is helpful! I’d be happy to discuss any more questions you have!


Thank you so much! This perfectly answers all of my questions. I always appreciate your detailed writeups whenever I encounter them on the forum, and I’m also glad you linked your GitHub page - I wasn’t aware of it before and I’ll definitely dive into the different examples that you described there.


Hello,

I’m also dealing with a homodimeric enzyme with C2 symmetry. I’m wondering: what if A can adopt a different conformation, A’, and B can adopt that same alternative conformation, B’?
I’ve been classifying directly after C2 refinement and I obtain 3 dominant classes that I assumed were A/B, A’/B and B’/A. But actually what I thought was A’/B could be a mix of A’/B and B’/A, and B’/A could be a mix of B’/A and A’/B…
I don’t know how to be sure that chain A is actually A and B is actually B (A’ is actually A’ and B’ is actually B’).

Thanks in advance to anyone who can help!

Hi ncooper,

I’m not sure I follow - if the complex is a homodimer, shouldn’t A be identical to B, and A’ be identical to B’? If I’m understanding correctly, A’/B should be identical to B’/A, leading to three possible types of complex:

  1. A/B
  2. A’/B and B’/A (these are identical, and also identical to B/A’ and A/B’ if rotated about the C2 symmetry axis)
  3. A’/B’ (you don’t mention this state existing, but I’m listing it here for completeness’ sake)

Best,
cbeck

Hi,

Thank you for your response!
I see what you mean when you say A’/B and B’/A are identical. Even though they are identical in terms of conformation of the complex, is it correct to say that they correspond to two separate 3D classes since chain identity varies? I’m new to publishing cryoEM data and I was thinking of showing A/B, A’/B and B’/A as 3 different classes, highlighting that there is a degree of independence between the subunits, with one set of particles in the dataset captured with A in the primed conformation and B unprimed, and a separate set of particles captured with A unprimed and B primed.
But maybe this is incorrect and there are only two real 3D classes: 1) A/B and 2) A’/B = B’/A (I don’t have A’/B’).
I don’t know if I’m being clear…

Thank you in advance.
Best

Ah, I think I better understand what you mean. But when you compare A’/B and B’/A, these are still the exact same class because A’ is identical to B’, right?

The three classes that you should see in 3D classification would be:

  1. A/B (identical to B/A)
  2. A’/B (identical to B’/A)
  3. A/B’ (identical to B/A’)

Here, classes 2 and 3 represent the exact same conformation, but they classify into the different classes because one is rotated by 180 degrees relative to the other. So yes, in this sense, there are only two “real” 3D classes as you say: A/B and A’/B = B’/A

I don’t think there would be an issue in showing all three classes, but because 3D classification doesn’t do any alignment or refinement, you likely wouldn’t achieve the highest possible resolution. It might be more productive to do the following:

  1. Locally refine class 1 to a higher resolution now that it’s conformationally “pure”
  2. If the conformational change is large enough between A and A’, you should be able to use Align 3D Maps to automatically align the particles from class 3 to class 2, which would undo that 180 degree rotation. Now that the particles from class 2 and 3 are in roughly the same pose, you can do a local refinement. Since the particles are conformationally “pure” and you’ve effectively doubled the number of particles by combining classes 2 and 3, this could lead to improved resolution.

I would show the maps for steps 1 and 2. This would probably give you a higher resolution than just showing the 3D classification reconstructions.

Cheers,
cbeck

Hi

Yes, A’/B and B’/A are the same…
Great, thank you for this valuable information! I tried the Align 3D Maps job, and it seems to work well!
I will follow your advice.

Best


Hi all,

Firstly, this thread is incredibly helpful!

I realise this thread has been marked solved but I’m wondering if I can test the logic in the C4 case?

By performing symmetry expansion, masking out a single subunit (I take it here you mean particle subtraction?), and then classifying just one subunit position (locally refining each class), we are able to look for N number of states at a given position.

In the example given, that is two different states (A and A’, let’s say inactive [A] and active [A’]) in two classes. The outcome of that classification job tells us the proportion of inactive [A] and active [A’] at this specific position in the tetramer.

To what extent is it then necessary to repeat this process at the other 3 subunit positions? If you accept that subunit position A is ‘representative’ of the other subunit positions in terms of their inactive/active states (insofar as A=B/C/D and A’=B’/C’/D’), looking at each subunit position is presumably then unnecessary? You would expect to find the same split in each subunit position.

My guess is that it would be necessary, however, if you think that the distribution of these states is heterogeneous within your tetramer: for example, subunit position A is 50/50, B is 75/25, C is 10/90, etc. Although, presumably in this case, it would be possible to use source_id to find which % of your original particles in position A, B, C exist in each split (inactive/active)? It would also be necessary if you suspect that A’ =/= B’/C’/D’ for whatever reason, though one doesn’t come to mind right now.

Hopefully that makes sense. Symmetry sometimes gives me a headache…

Hi @oxymoronic! Symmetry can be really confusing, especially since all the terms sound so similar!

First, a small clarification – when I say “mask out” I just mean creating a mask that surrounds a specific subunit. I do not usually perform particle subtraction.

The nice thing about symmetry expansion is that it rotates each individual particle according to its symmetry, so when you check the state of one position you’re actually checking the state of all of them. I find it helpful to think of symmetry expansion as the mask remaining static, but the particles rotating so that each subunit ends up in the mask once:

[Image: symmetry-expansion]
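The same picture can be written as a small Python sketch. It is purely geometric and makes several assumptions for illustration: the symmetry axis is z, subunit centers sit on a ring in the xy-plane, the mask is fixed at the position of subunit 0, and copy k of the expansion rotates the particle by -360k/n degrees. The sign and ordering conventions in real software may differ.

```python
import math


def subunit_centers(n, radius=1.0):
    """Centers of n subunits on a ring around the z axis (xy coordinates)."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def rotate(point, angle):
    """Rotate a 2D point about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def subunit_in_mask(centers, mask_center, tol=1e-6):
    """Index of the subunit whose center coincides with the mask center."""
    for index, (x, y) in enumerate(centers):
        if math.hypot(x - mask_center[0], y - mask_center[1]) < tol:
            return index
    raise ValueError("no subunit lies inside the mask")


n = 4
centers = subunit_centers(n)
mask_center = centers[0]  # the mask never moves

# For each symmetry copy, rotate the particle and see who lands in the mask.
visited = []
for k in range(n):
    rotated = [rotate(c, -2 * math.pi * k / n) for c in centers]
    visited.append(subunit_in_mask(rotated, mask_center))
```

Across the n copies, `visited` contains every subunit index exactly once, which is why classifying inside one static mask effectively classifies all subunits.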

This is a good way to get a map of a single subunit of each state. Things become more complicated if you want to produce maps of the entire molecule, and each molecule has an unknown arrangement and/or unknown number of subunits in each state. We have an example of one way to do this with cryosparc-tools, but there are a number of good solutions – it is something you’ll have to work out for your sample!
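As one hedged illustration of the bookkeeping involved (this is not the cryosparc-tools example mentioned above, and it makes no cryosparc-tools calls), the sketch below rebuilds each molecule’s arrangement of states from per-subunit class labels. The rows are invented, `src_uid` and `sym_idx` are hypothetical field names, and it assumes that `sym_idx` order follows the order of subunits around the ring.

```python
from collections import Counter, defaultdict

# One row per symmetry-expanded copy: (src_uid, sym_idx, class_label).
rows = [
    (1, 0, "A"), (1, 1, "A'"), (1, 2, "A"), (1, 3, "A"),
    (2, 0, "A'"), (2, 1, "A"), (2, 2, "A"), (2, 3, "A"),
    (3, 0, "A'"), (3, 1, "A"), (3, 2, "A'"), (3, 3, "A"),
    (4, 0, "A'"), (4, 1, "A'"), (4, 2, "A"), (4, 3, "A"),
]


def arrangement_per_particle(labelled_rows):
    """Map each original particle to its tuple of states in sym_idx order."""
    by_particle = defaultdict(dict)
    for src_uid, sym_idx, label in labelled_rows:
        by_particle[src_uid][sym_idx] = label
    return {
        uid: tuple(states[i] for i in sorted(states))
        for uid, states in by_particle.items()
    }


def canonical(arrangement):
    """Pick one representative among all cyclic rotations of an arrangement."""
    n = len(arrangement)
    return min(arrangement[k:] + arrangement[:k] for k in range(n))


arrangements = arrangement_per_particle(rows)
arrangement_counts = Counter(canonical(a) for a in arrangements.values())
```

Grouping by the canonical form treats arrangements that differ only by a symmetry rotation as the same molecule type, while still separating, say, two A’ subunits that are adjacent from two that are opposite each other.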

One final note – if you haven’t already, you may want to check out the recordings of a recent workshop we ran at S2C2! The first three case studies all focus on symmetry questions like the ones you’re asking!
