I often find that grouping micrographs into image shift groups and refining per-group CTF parameters improves resolution when significant optical aberrations are present in the dataset. We have had cases where the resolution improved from 3.4 Å to 2.8 Å and from 3 Å to 2.5 Å.
NU-Refinement; Iterative optimizations for per-particle defocus and per-group CTF parameters were ON. The masked FSC looks comparable to the unmasked FSC:
Same NU-Refinement job after grouping the particles by image shift groups:
I use K-means clustering in scikit-learn to group micrographs with similar image shift X and Y values. Then I edit particles.star to run CTFRefine in RELION. The downside is that the refined CTF parameters do not carry over to cryoSPARC when you import the particles back, so I am stuck with RELION for further processing.
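For readers who have not seen the clustering step, here is a minimal sketch of the idea rather than the actual script. It assumes each micrograph has one (X, Y) image shift pair; the shift values and micrograph names below are made up, and the units are arbitrary. To keep it dependency-free, scikit-learn's KMeans is swapped for a plain Lloyd's algorithm with a deterministic farthest-point start; with scikit-learn installed, `KMeans(n_clusters=k).fit_predict(shifts)` does the same job in one call.

```python
def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm for 2-D points; returns (labels, centers)."""
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # Deterministic farthest-point initialisation: start from the first point,
    # then repeatedly add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))

    labels = []
    for _ in range(n_iter):
        # Assignment step: each micrograph joins its nearest center.
        labels = [min(range(k), key=lambda c: d2(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append((sum(x for x, _ in members) / len(members),
                                    sum(y for _, y in members) / len(members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Made-up data: four nominal image shifts, five micrographs each, small jitter.
nominal = [(-2.0, -2.0), (-2.0, 2.0), (2.0, -2.0), (2.0, 2.0)]
jitter = [(-0.10, 0.05), (0.08, -0.02), (0.00, 0.10), (0.05, 0.05), (-0.06, -0.08)]
shifts = [(nx + jx, ny + jy) for nx, ny in nominal for jx, jy in jitter]
names = [f"mic_{i:03d}.mrc" for i in range(len(shifts))]  # hypothetical names

labels, centers = kmeans(shifts, k=4)

# Write the grouping in a "name,class" layout, one row per micrograph.
rows = ["name,class"] + [f"{n},{lab}" for n, lab in zip(names, labels)]
print("\n".join(rows[:4]))
```

The number of groups `k` has to be chosen by the user; on real data the groups are rarely this cleanly separated, so it is worth checking the result before using it.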
I wanted to do the same in cryoSPARC, and it involves:
1. Running a Python script for K-means clustering (kmeans_groups.py)
2. Adding class identifier numbers to the filenames of symlinked micrographs, then importing the micrographs with the new names (add_class.sh)
3. Reassigning particles to the imported micrographs (without re-extracting particles)
4. Running Exposure Group Utilities to split particles by their location/micrograph_path
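The renaming step (adding class identifiers to symlinked micrographs) might look roughly like this in Python. This is only a sketch and not a port of add_class.sh: the `classNNN_` prefix, the directory names, and the function name `link_with_class` are hypothetical, and the CSV is assumed to have a `name,class` header row.

```python
import csv
import os
import tempfile


def link_with_class(csv_path, mic_dir, out_dir):
    """Symlink each listed micrograph into out_dir with its class number in the name."""
    os.makedirs(out_dir, exist_ok=True)
    linked, missing = [], []
    with open(csv_path, newline="") as fh:
        # DictReader consumes the "name,class" header row, so the header is
        # never mistaken for a micrograph.
        for row in csv.DictReader(fh):
            src = os.path.join(mic_dir, row["name"])
            if not os.path.exists(src):
                missing.append(row["name"])
                continue
            # Hypothetical naming scheme: zero-padded class number as a prefix.
            new_name = f"class{int(row['class']):03d}_{row['name']}"
            dst = os.path.join(out_dir, new_name)
            if not os.path.lexists(dst):
                os.symlink(os.path.abspath(src), dst)
            linked.append(new_name)
    return linked, missing


# Small self-contained demo with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    mic_dir = os.path.join(tmp, "mics")
    os.makedirs(mic_dir)
    for name in ("mic_a.mrc", "mic_b.mrc"):
        open(os.path.join(mic_dir, name), "w").close()
    csv_path = os.path.join(tmp, "km_groups_01.csv")
    with open(csv_path, "w", newline="") as fh:
        fh.write("name,class\nmic_a.mrc,0\nmic_b.mrc,3\nmic_c.mrc,1\n")
    linked, missing = link_with_class(csv_path, mic_dir, os.path.join(tmp, "mics_grouped"))
    print(linked)
    print(missing)
```

Because the class number ends up in the file name, the group can later be recovered from the micrograph path alone, which is what splitting by location/micrograph_path relies on.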
Building on the initial k-means script that Bill Rice at NYU kindly provided, I wrote Python and bash scripts for steps 1 and 2 (GitHub: kookjookeem/kmeans-beamtilt), and the page describing the steps can be found here.
I hope you find these scripts useful! Please try them and let me know if you have any questions.
This works great! I tried it with one of my datasets and the resolution improved from 3.1 Å to 2.8 Å. Thanks for sharing the scripts.
There is a small error on your instruction page:
When removing the UIDs, I think you meant “${file:22}”.
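For anyone unfamiliar with the syntax: in bash, `${file:22}` expands to the value of `file` with its first 22 characters dropped. A rough Python equivalent is a slice, shown here on a made-up filename that assumes the prefix is exactly 22 characters long (a 21-digit UID plus an underscore); check the prefix length on your own files before relying on it.

```python
def strip_uid(filename, offset=22):
    """Drop the first `offset` characters, like bash's ${file:22}."""
    return filename[offset:]


# Hypothetical example name: a 21-digit UID, an underscore, then the original name.
example = "012345678901234567890_mic_0001.mrc"
print(strip_uid(example))
```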
For the add_class script, when I run it as is, it shows an “ambiguous redirect” error. I changed csvfile="" to csvfile="km_groups_01.csv" to make it work.
Also, because the first line of the input csv file is “name,class”, the add_class script prints “name does not exist in mics” when it runs. I made a small change so that it skips the first line and the output is cleaner:
{
read
while…
That’s great that my script helped. Although it is distributed with Leginon, I realized my Tiltgroup_wrangler script is in a bit of an obscure location. Here is the GitHub link for the program and instructions:
You just need to download the CTF information from the Leginon website, and load the cryoSPARC particle set and passthrough particles .cs files. It outputs a new .cs file, and you can then replace either the particle set or the passthrough file with this file. If you then re-refine, cryoSPARC will divide the set into the number of groups specified. I recently updated it to be compatible with cryoSPARC 4.
Hope you and other Leginon users find this useful.
I had no idea this existed and it looks super helpful, thanks Bill - we used your kmeans clustering script (thanks!), but I didn’t realize that tiltgroup_wrangler existed until now! (I also didn’t realize until now that Appion was able to directly output beam tilt groups!)
Re the built-in clustering in Leginon, is that using kmeans as well? Often we find we need to do a bit of tweaking to the raw output of kmeans: sometimes two clusters will be merged into one, or one cluster split in two, and having graphical feedback on the cluster center locations is handy for making sure everything looks good.
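When no plot is at hand, a plain text summary can give a rough version of the same check. The sketch below is only a heuristic with made-up numbers, not something taken from Leginon or the scripts discussed here: it reports the size, center, and radius of each cluster, on the assumption that a cluster with a much larger radius than the rest may be two groups merged into one.

```python
from collections import defaultdict
from math import hypot


def summarize_clusters(points, labels):
    """Return {label: {"n", "center", "radius"}} for 2-D points and their labels."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    summary = {}
    for label, members in sorted(groups.items()):
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        # Radius = distance from the center to the farthest member.
        radius = max(hypot(x - cx, y - cy) for x, y in members)
        summary[label] = {"n": len(members), "center": (cx, cy), "radius": radius}
    return summary


# Made-up example: label 0 wrongly covers two separate image shift groups.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0), (0.0, 5.0), (0.1, 5.0)]
labels = [0, 0, 0, 0, 1, 1]
summary = summarize_clusters(points, labels)
for label, info in summary.items():
    print(label, info["n"], info["center"], round(info["radius"], 3))
```

A plot is still the better check when one is available; this only flags clusters worth a closer look.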
Hi Oli,
The website version uses the same kmeans algorithm so it will give similar results. There is no plot but you get the clustering directly. I usually target in such a way that there are no obvious clusters by eye, so I just choose a large number like 50-100.