I am processing a large molecule in which I know some particles carry an extra chain (20 aa) and some do not. I would like to separate the particles that have the chain from the ones that lack it. I think I currently have a mixture, which is why my resolution is low (~6 Å). I have tried several focused classification approaches using the 3D Classification and 3D Variability jobs. Do you think this problem can be solved with processing tools?
My normal focus classification strategy:
Consensus refinement → create mask around the 20 aa chain (normally 3 px dilation and around 10 to 20 px soft padding) → 3D Classification job with mask.
For 3D Classification, the common parameters I tried changing:
Target resolution (3-10 Å)
Class similarity (0.1-1)
Number of classes (2-30)
Initialization mode: PCA
Particles per reconstruction: 2000-20000
Force hard classification: true
For 3D Variability, the common parameters:
Modes: 3
Filter resolution: 3-10 Å
If anyone has suggestions, they would be much appreciated.
I also tried running local refinement before the 3D Classification job, without success.
If the presence of the helix causes a global conformational change, the two states should separate automatically in 3D Classification with no mask.
If it is just the presence or absence of the helix, then you are on the right track and you sort of have to get lucky. When doing focused classification, I would have the mask include at least ~50 kDa, including the helix. Do not use a mask that covers only the tight local area around the helix.
It sounds like you need more data; at least 1 million particles is better. Target resolution between 6 and 10 Å. Class similarity 0.1. The ideal number of classes is 40 or more, but you need about 20k particles in each class (hence the requirement for lots of data). For PCA, starting with 500, 1000, or 2000 should all be fine; more is not necessary. Local refinement will only work on a domain of at least 50 kDa.
If you get a 3D class with the helix, you can “seed” heterogeneous refinement with that class.
I have a similar project, sorting a small helix of interest from the rest of the data. My strategy has been to run 100 classes, then select all classes with the helix (maybe 20 of them), local refine, then classify again. But there is no exact science to this. Particle subtraction may help (get rid of most of the big complex, leaving only 100 kDa or so); then apply the same strategy to the subtracted particle stack.
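For anyone wanting to sanity-check the class counts discussed above against their own dataset size, here is a rough sketch of the arithmetic. It only encodes the ~20k particles per class rule of thumb from this thread and assumes classes fill up evenly, which real data rarely does. The function names are made up for illustration and are not part of any cryo-EM package.

```python
# Back-of-the-envelope particle budget for 3D classification.
# Assumption: classes are populated roughly evenly (rarely true in practice).

MIN_PARTICLES_PER_CLASS = 20_000  # rule of thumb quoted in this thread


def particles_needed(n_classes: int, min_per_class: int = MIN_PARTICLES_PER_CLASS) -> int:
    """Total particles required so every class can reach min_per_class."""
    return n_classes * min_per_class


def max_classes(total_particles: int, min_per_class: int = MIN_PARTICLES_PER_CLASS) -> int:
    """Largest class count a dataset supports at min_per_class per class."""
    return total_particles // min_per_class


if __name__ == "__main__":
    for n in (20, 40, 100):
        print(f"{n:>3} classes -> {particles_needed(n):,} particles")
    print(f"1,000,000 particles -> up to {max_classes(1_000_000)} classes")
```

Under these assumptions, 40 classes at 20k particles each comes to 800,000 particles, which is in line with the "at least 1 million" suggestion, and 100 classes would call for about 2 million.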
Seconding all of @CryoEM2's suggestions! In particular, for others reading this thread, note that the detectability of a compositional variation comes down to how much mass “moves” between states. If only 20 aa appear, that is not much change. But if all the other atoms in the structure move (even very slightly) when the 20 aa appear, that is much more detectable.
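To put a number on how little mass 20 aa is, here is a small sketch. It assumes the common rule of thumb of roughly 110 Da per residue (the real mass depends on the sequence) and compares the chain to the ~50 kDa and ~100 kDa masked regions mentioned earlier in the thread. The function names are illustrative only.

```python
# Rough estimate of how much of the masked mass a short chain represents.
# Assumption: ~110 Da average mass per amino acid residue (rule of thumb).

AVG_RESIDUE_MASS_DA = 110.0


def chain_mass_kda(n_residues: int) -> float:
    """Approximate mass of a peptide chain in kDa."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def fraction_of_masked_mass(n_residues: int, masked_mass_kda: float) -> float:
    """Fraction of the masked region's mass contributed by the chain."""
    return chain_mass_kda(n_residues) / masked_mass_kda


if __name__ == "__main__":
    print(f"20 aa is roughly {chain_mass_kda(20):.1f} kDa")
    for masked in (50.0, 100.0):
        frac = fraction_of_masked_mass(20, masked)
        print(f"  {frac:.1%} of a {masked:.0f} kDa masked region")
```

With that assumption the chain is about 2.2 kDa, so only a few percent of the signal inside even a 50 kDa mask differs between the two states, unless the rest of the structure shifts along with it.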
Fig 2: Real-Space Difference from Consensus with Focus Mask
Especially class 7. However, when I ran Non-uniform Refinement, the resolution of the density inside the mask did not improve much. I think the problem is related to the final number of particles in the reconstruction rather than to the processing workflow. There is probably still room for improvement, and I will give your suggestions a try.