I’m trying to do 2D class averaging for a complex I’m working on. The 2D class averages look weird even though I’ve played with the parameters: different masks, the initial classification uncertainty factor, batch size, and number of iterations.
The 2D class averages are attached. Can anyone please give me some suggestions? I’m wondering whether the raw sample is bad or whether the way I’m processing the data is wrong.
Hi, and welcome to the forum! Could you provide some more details to help diagnose the issue? Information like the size of the particle, the pixel size, and the extraction box size would be helpful.
What’s the mass of the particle? Can you clearly see particles in your micrographs, and are they being picked well?
If the particle is on the smaller side (<100 kDa) or has low SNR, then turning “force max over poses/shifts” off might be helpful. If you do, set around 80 online-EM iterations, because the job will take a lot longer to converge. This post has some helpful information on what parameters to choose:
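If you happen to script jobs with cryosparc-tools, those two settings can be applied when building the 2D classification job. This is only a minimal sketch: the "class_2D" job type should be right, but the parameter keys below are my guesses, so confirm the exact names in your version’s job builder before running anything.

```python
# Hedged sketch using cryosparc-tools; the param keys are ASSUMED, not verified.
from cryosparc.tools import CryoSPARC

# Connection details are placeholders for your own instance.
cs = CryoSPARC(license="xxxx", host="localhost", base_port=39000,
               email="user@example.com", password="...")
project = cs.find_project("P1")

job = project.create_job(
    "W1",                  # workspace UID
    "class_2D",            # 2D Classification job type
    connections={"particles": ("J10", "particles")},  # hypothetical extract job
    params={
        # ASSUMED keys -- check the job builder for the exact spellings:
        "class2D_num_full_iterations": 80,  # "Number of online-EM iterations"
        "class2D_force_max": False,         # "Force max over poses/shifts"
    },
)
job.queue("default")
```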
Thank you very much cbeck! The tips were quite useful, and I can now get reasonable 2D class averages. But I’ve run into another issue. I can produce a good initial model, but when I try to do heterogeneous refinement, the results are not good. I’m guessing it’s because of the low SNR. Do you have any suggestions?
The micrographs look great, which is why I didn’t understand why the 2D class averages looked weird. I thought it might be an air-water-interface issue, but now it seems it was a processing issue. Thank you very much~
I’m glad to hear that the 2D classification improved and that you can get a good ab initio! Can you expand on what you mean by the results of heterogeneous refinement not being good? How many classes are you using, and what are you using for the reference volumes? Is heterogeneous refinement not pulling out junk particles effectively? You might want to refer to this tutorial page written by user olibclarke on how to set up decoy classification with heterogeneous refinement: Case Study: Exploratory data processing by Oliver Clarke | CryoSPARC Guide
To summarize, set up a heterogeneous refinement job with your good ab initio and 3-8 “decoy” classes. The decoy classes are meant to absorb junk particles. My preferred way to generate decoy classes is to start an ab initio job with ~12 classes, then kill the job as soon as the first iteration finishes, which results in 12 classes of random noise. If you mark the job as complete, then you can use these classes as decoys for heterogeneous refinement.
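If you prefer to script this, here is a rough sketch of the decoy-generation trick with cryosparc-tools. The "homo_abinit" job type and the "abinit_K" parameter key are my assumptions, so check them against your instance; as far as I know, marking the killed job as complete is done from the GUI.

```python
# Hedged sketch: generate decoy volumes by killing an ab initio job early.
import time
from cryosparc.tools import CryoSPARC

cs = CryoSPARC(license="xxxx", host="localhost", base_port=39000,
               email="user@example.com", password="...")
project = cs.find_project("P1")

decoy_job = project.create_job(
    "W1",
    "homo_abinit",            # assumed job type for Ab-Initio Reconstruction
    connections={"particles": ("J42", "particles")},  # hypothetical clean stack
    params={"abinit_K": 12},  # assumed key for "Number of Ab-Initio classes"
)
decoy_job.queue("default")

time.sleep(300)               # give the first iteration time to finish
decoy_job.kill()              # volumes are still ~random noise at this point
# Then mark the job as complete in the GUI so its 12 volumes become
# usable as decoy references for heterogeneous refinement.
```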
Once the heterogeneous refinement finishes, take only the particles that went into the good class and do another round of decoy classification to pull out more junk. I repeat this process until the junk particles make up less than 5% of the total particle stack (see the sketch below). I then perform another round of 2D classification; now that most of the junk has been removed, you’ll get more diverse classes of your particle and potentially identify rare views.
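To decide when to stop iterating, I check what fraction of the stack landed in the decoy classes. A minimal sketch with cryosparc-tools, assuming the per-class outputs are named "particles_class_0", "particles_class_1", and so on, with class 0 connected to the good reference (verify the names in the job’s outputs tab):

```python
# Hedged sketch: measure the junk fraction after heterogeneous refinement.
from cryosparc.tools import CryoSPARC

cs = CryoSPARC(license="xxxx", host="localhost", base_port=39000,
               email="user@example.com", password="...")
job = cs.find_job("P1", "J50")  # hypothetical hetero refine job UID

n_classes = 6  # e.g. 1 good reference + 5 decoys
counts = [len(job.load_output(f"particles_class_{k}")) for k in range(n_classes)]

junk_fraction = 1 - counts[0] / sum(counts)  # class 0 = good class here
print(f"junk fraction: {junk_fraction:.1%}")  # keep iterating while > 5%
```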
The strategy described above is the “basic” version. I sometimes make the following modifications:
If the “good” class after a round of heterogeneous refinement starts to look noisy or shows overfitting artifacts, it can help to redo ab initio on this cleaner stack of particles to generate a better reference.
Even though the decoy references are mostly random noise, I’ve noticed that a given decoy reference tends to produce similar-looking volumes at the end of the refinement. To pull out as much diverse junk as possible, I usually rotate through different decoy references with each iteration of heterogeneous refinement.
If you want to be really gentle with pulling out junk (e.g. if you’re worried about throwing away rare views in the junk classes), you can set up multiple clones of the same heterogeneous refinement job in parallel (one for each GPU on your workstation). After they finish, pool all of the particles that went into at least one of the good classes and use them for the next iteration of heterogeneous refinement (see the pooling sketch below). This method is more conservative and will only throw out a particle if it was assigned to a junk class multiple times. However, running multiple jobs simultaneously tends to significantly slow down each individual job.
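Here is a rough sketch of the pooling step with cryosparc-tools and NumPy. The clone job UIDs and the "particles_class_0" output name are hypothetical; the idea is just to take the union of particle UIDs that any clone kept.

```python
# Hedged sketch: keep a particle if ANY parallel clone put it in the good class.
import numpy as np
from cryosparc.tools import CryoSPARC

cs = CryoSPARC(license="xxxx", host="localhost", base_port=39000,
               email="user@example.com", password="...")

clone_uids = ["J51", "J52", "J53", "J54"]  # one hetero refine clone per GPU
kept = np.array([], dtype=np.uint64)
for uid in clone_uids:
    good = cs.find_job("P1", uid).load_output("particles_class_0")
    kept = np.union1d(kept, good["uid"])  # union = the conservative choice

print(f"keeping {len(kept)} unique particles across {len(clone_uids)} clones")
# Subset the full stack down to these UIDs (e.g. with the Particle Sets Tool)
# before the next round of heterogeneous refinement.
```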
Edit to add: Are you binning (Fourier cropping) the particle images? Especially for this particle-cleanup stage, where you don’t need high-resolution data, it can help to bin the images by a factor of two or more. For example, for a box size of 360, I usually bin to 120, which results in a Nyquist frequency of ~6 Å for my micrographs’ pixel size. This significantly speeds up processing and helps when experimenting with different parameters.
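For anyone who wants to sanity-check their own numbers, the arithmetic is just this (the 1.0 Å/px raw pixel size is an example value, not from the thread):

```python
# Nyquist limit after Fourier cropping: binning scales the pixel size
# by box / cropped_box, and Nyquist is twice the (binned) pixel size.
pixel_size = 1.0          # raw pixel size in A/px (example value)
box, cropped = 360, 120   # extraction box and Fourier-cropped box sizes

binned_pixel_size = pixel_size * box / cropped  # -> 3.0 A/px
nyquist = 2 * binned_pixel_size                 # -> 6.0 A
print(f"binned pixel size: {binned_pixel_size:.2f} A/px, Nyquist: {nyquist:.1f} A")
```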