Adjust per-particle defocus of subparticles in volume alignment tools?

olibclarke · August 6, 2024, 6:41pm

That is great Rich I will try that! Is there a CS-tools way to modify the per particle defocus values by adjusting for the Z-displacement of the sub-particle center?

rposert · August 6, 2024, 6:51pm

Probably! I haven’t done much (read: any) work with subparticle extraction, though, so I’ll have to ask a few simple questions. Could you give me a brief overview of the process? When you extract the particle, are you re-centering on a mask? Or are you actually using scipion as in your linked paper, and if so, what type/format of data does it make available?

olibclarke · August 6, 2024, 7:04pm

Ah yes of course!

So basically the workflow is, after a global refinement, to first perform symmetry expansion, then use volume alignment tools to recenter on the desired position (of e.g. a domain or a protomer). No need for any external tools; All this is done in cryosparc.

Then one can use the output of volume alignment tools (with recentering using aligned shifts) to extract subparticles in a smaller box size around each symmetry-related position, and then one can proceed with local refinement or local classification (sometimes with local symmetry) with a mask around the subparticle. This can be helpful to improve local resolution by dealing with the mobility of the subparticle, and to untangle local symmetry/pseudosymmetry and/or heterogeneity.

So far so good. But because, in the case of something like a virus, the subparticle is actually quite far from the center of the particle, I would like to adjust the defocus of each subparticle by the Z-displacement component of the aligned shifts. This can make quite a difference to resolution for subparticles in large assemblies at high resolution. Does this make sense?

So basically I need a way, using CS-tools, to read in the symmetry expanded particle set after vol alignment tools (and I guess maybe the particle set before recentering?), and calculate the signed difference in the height (Z-displacement) of each subparticle, then subtract that from the original defocus values.

This can make quite a difference - e.g. see supp 2 of the localrec paper linked in the first post:

…so it would be great to have a way to do it within CS, particularly as more and more often we are reaching resolutions or processing assemblies of a size where it can make a big difference

rposert · August 6, 2024, 9:45pm

Thanks for those details, Oli! I’ll take a look into this (no guarantee I’ll be able to come up with something that works though).

My general plan is to calculate the center of mass (in the reference pose) of the mask used during recentering, then rotate that vector into each subparticle’s pose (by applying the pose). We should then be able to use that rotated vector’s Z component to pull out the necessary defocus change.

I’ll keep you posted!

olibclarke · August 6, 2024, 11:53pm

Sounds good! In this case I wasn’t using a mask for recentering - just directly specifying the coordinates on the reference volume - but either way works

I have a couple of cases where I can easily test if it works (where I know defocus gradient is a limiting factor and local CTF refinement of the subparticle improves matters)

rbs_sci · August 7, 2024, 1:11am

Haha, thanks - it’s funny, one of the attractive points of CryoSPARC was that it didn’t need the “scary command line” (to paraphrase several trainees we’ve had) but now several things which were/are fairly easy to do with a little bit of work using standard Linux system utilities with RELION need Python to do in CryoSPARC… and people who complain about the command line also tend to complain about Python… …and people who don’t complain about the command line do complain about Python.

rposert · August 7, 2024, 2:16pm

[off topic]

Yes, I know python’s a sticking point for sure, and I totally understand having been there myself! I think the idea with cs-tools is to let “power users” work with features the minute they think of them, but unfortunately the Venn diagram of advanced cryoEM practitioners and python programmers is not a circle!

On my list is a guide page along the lines of “Python for cryoEM”. If you have collected specific sticking points from trainees (i.e., beyond “the command line is frightening”, which I don’t in any way mean to downplay) I’d be very interested in hearing about them (although perhaps in their own forum topic )!

olibclarke · August 7, 2024, 2:56pm

I think there are a couple of issues that inhibit adoption a bit:

CS-tools is incredibly powerful, but it isn’t integrated into the GUI really, and requires some separate installation steps. If I was able to save scripts and then have folks use them in a click-button manner from within the GUI, perhaps as a blueprint, I think that would reduce the activation energy a lot for folks who are not so used to python scripting.
I think the python for cryoEM idea is a great one! I would also collect & document some of the great scripts you’ve already posted on the forum, and add them to the examples
For convenience, it would be useful to have a simple cryosparc command line shortcut for users to create or print the “instance-info.json” file - if one doesn’t use cs-tools frequently, it always takes a minute to find and format this information correctly.

rposert · August 7, 2024, 3:17pm

EDIT: just realized we actually want to apply the pose without reversing it! the script has been edited to reflect this

Hi @olibclarke! Below is a script which I think will properly adjust the defocus parameters. I was going to test it, but since I’m not 100% sure I did the other steps (which are largely unrelated to the script) right I figured I’d just let you be the guinea pig

Beyond the usual cs-tools stuff, you’ll need scipy. You’ll also have to provide this script with the Volume Alignment Tools job you used to recenter the volume, then use the low-level interface to replace the CTF of the sub-particles. I did it this way so that you could run this while the extract job is running, rather than having to wait, and so that there is a lower possibility of user error when entering the re-centering coordinates.

Please let me know if that works, or if it seems backwards or messed up in some other way!

from cryosparc.tools import CryoSPARC
import json
import numpy as np
from pathlib import Path
import re
from scipy.spatial.transform import Rotation as R

with open(Path('~/instance-info.json').expanduser(), 'r') as f:
    instance_info = json.load(f)

cs = CryoSPARC(**instance_info)
assert cs.test_connection()

project_number = "P345"
workspace_number = "W3"
vol_align_uid = "J1175"

project = cs.find_project(project_number)
job = project.find_job(vol_align_uid)
particles = job.load_output("particles")
volume = job.load_output("volume")

apix = volume["map/psize_A"][0]
map_size_px = volume["map/shape"][0]
map_size_A = map_size_px * apix

recenter_pattern = r"([\.\d]+), ?([\.\d]+), ?([\.\d]+) ?(px|A)?"
recenter_string = job.doc["params_spec"]["recenter_shift"]["value"]
recenter_regex = re.match(recenter_pattern, recenter_string)

new_center_A = np.array([
    float(x) for x in [
        recenter_regex.group(1),
        recenter_regex.group(2),
        recenter_regex.group(3)
    ]
])
if len(recenter_regex.groups()) == 3 or recenter_regex.group(4) == "px":
    new_center_A *= apix

# assume volume was well-centered
recenter_vector_A = map_size_A / 2 - new_center_A

# recenter_vector_A is the vector pointing from the old center to the
# new center in the reference pose. Now we rotate this vector for each
# recentered particle by applying the pose

# iterating over rows is slow...
for row in particles.rows():
    pose = R.from_rotvec(row["alignments3D/pose"])
    pose_recenter_vec_A = pose.apply(recenter_vector_A)
    row["ctf/df1_A"] -= pose_recenter_vec_A[2]
    row["ctf/df2_A"] -= pose_recenter_vec_A[2]

out_job = project.create_external_job(
    workspace_uid = workspace_number,
    title = "Defocus-adjusted particles"
)
out_job.add_input(
    type = "particle",
    name = "input_particles"
)
out_job.connect(
    target_input = "input_particles",
    source_job_uid = vol_align_uid,
    source_output = "particles"
)
out_job.add_output(
    type = "particle",
    name = "particles",
    passthrough = "input_particles",
    slots = ["ctf"],
    alloc = particles
)
out_job.save_output("particles", particles)
out_job.log("Job created with adjust-defocus.ipynb")
out_job.stop()

olibclarke · August 7, 2024, 5:27pm

Hi @rposert,

Thanks so much!! Tested this and it works like a charm!! Resolution improved for the subparticle reconstruction from 3.14 Å without adjusting defocus, to 2.86 Å after applying adjust_defocus.py, which is better than using local CTF refinement with a subparticle mask!

Would be great to have this incorporated directly as an option in Vol Alignment Tools, I think it would get a lot of use in advanced workflows…

Cheers
Oli

rposert · August 7, 2024, 5:38pm

Great! Glad it worked! I’ve recorded your feature request to incorporate this into VAT, too.

olibclarke · August 7, 2024, 5:53pm

Semi-unrelated, but pertaining to one of the earlier posts in the thread - in cases like this:

It would be useful to have an option in local CTF refinement to leave the defocus parameters of particles that “fail” the search (hit the search boundaries without finding a clear minimum) unchanged, rather than setting them to the maximum value.

Perhaps this should even be the default, if no clear minimum as found? In the case above, the local CTF refinement still improves the resolution of the reconstruction, but I suspect this is due to the adjustments in the bimodal, central parts of the distribution - I would like to avoid the outliers changing defocus by physically unreasonable values…

Thanks again for the script @rposert, it will be immensely helpful!

Cheers
Oli

olibclarke · August 7, 2024, 9:23pm

Hi @rposert, just some more testing on this script - it works if I directly specify the center in volume alignment tools, but if I use the option in VAT to recenter on the mask center of mass, it fails with this error:

Traceback (most recent call last):
  File "adjust_defocus_subparticles.py", line 28, in <module>
    recenter_string = job.doc["params_spec"]["recenter_shift"]["value"]
KeyError: 'recenter_shift'

rposert · August 7, 2024, 10:12pm

Yes, I directly read the recenter string there, so if you use a mask it won’t know the new center. Replacing that whole bit with the below if/else it should work. Let me know!

import re
if "recenter_shift" in job.doc["params_spec"]:
    recenter_pattern = r"([\.\d]+), ?([\.\d]+), ?([\.\d]+) ?(px|A)?"
    recenter_string = job.doc["params_spec"]["recenter_shift"]["value"]
    recenter_regex = re.match(recenter_pattern, recenter_string)
    new_center_A = np.array([
        float(x) for x in [
            recenter_regex.group(1),
            recenter_regex.group(2),
            recenter_regex.group(3)
        ]
    ])
    if len(recenter_regex.groups()) == 3 or recenter_regex.group(4) == "px":
        new_center_A *= apix
else:
    # rather than re-calculate the center of mass, just get it from the job log
    # NOTE: cs.cli.get_job_streamlog() is hidden and experimental --
    #       it may change without warning
    vol_align_log = cs.cli.get_job_streamlog(project_number, vol_align_uid)
    centering_line = [x["text"] for x in vol_align_log if "New center will be" in x["text"]]
    new_center_vox = re.search(r"([\.\d]+), ([\.\d]+), ([\.\d]+)", centering_line[0])
    new_center_vox = np.array([
        float(x) for x in [
            new_center_vox.group(1),
            new_center_vox.group(2),
            new_center_vox.group(3)
        ]
    ])
    new_center_A = new_center_vox * apix

olibclarke · August 8, 2024, 2:56am

Thanks Rich this does the trick for recentering with a mask!

However, I’ve now tried the script on two samples, both high resolution cases where I know the hand is correct. In the first case, adjusting the defocus worked, improving resolution considerably.

In the second case, the resolution actually got worse than without adjustment, which was surprising. I wonder if there is a bug somewhere? In the case that worked, the subparticle was displaced along the z-axis of the volume, while in the case that didn’t, it was displaced off axis (with little displacement in z).

What we want to do (I think) is to subtract the Z component of the recentering vector from the defocus, where Z is the axis corresponding to the projection direction of the particle (not neccessarily the z axis of the volume). Is that what is happening in this part?

for row in particles.rows():
    pose = R.from_rotvec(row["alignments3D/pose"])
    pose_recenter_vec_A = pose.apply(recenter_vector_A)
    row["ctf/df1_A"] -= pose_recenter_vec_A[2]
    row["ctf/df2_A"] -= pose_recenter_vec_A[2]

Is the third ([2]) element of pose_recenter_vec_A the component along the projection direction? I guess this will be the case after pose.apply? Apologies for the likely silly questions!

EDIT:

Or maybe it is an issue of the particle source - these particles were polished & extracted in Relion - while the volume is the correct (biological) hand, perhaps if micrographs were flipped/mirrored for extraction that would cause inversion of the “true” hand to be considered when adjusting defocus…? Will try with the inverted hand…

rposert · August 8, 2024, 3:22pm

Yes, what we’re doing in that chunk is rotating the recentering operation from the reference frame to the particle’s frame, then subtracting the Z component of the rotated recentering vector from the defocus. I think this should do what we want – if the opposite hand doesn’t work right, please let me know and I’ll think more about it!

olibclarke · August 8, 2024, 4:02pm

Opposite hand doesn’t work either - strange. I will try in a couple more test cases and see if I can figure it out, thanks!

rposert · August 8, 2024, 4:09pm

My test case didn’t improve in either hand either. I assumed I’d done something wrong, but it’s very possible there’s a flaw in my reasoning. I’ll think more about it!

olibclarke · August 8, 2024, 5:17pm

I tried on a second case and same thing - both the “correct” and inverse adjustments made resolution worse (where I would expect one to make it better and one to make it worse, compared to no adjustment).

In the case that worked, the subparticle transformation corresponded to a direct translation along the z-axis - could that somehow make the difference?

olibclarke · August 8, 2024, 6:01pm

So I’m not sure if this is the issue but it is an issue:

Here, if in VAT one doesn’t specify a unit (defaults to px), this loop is never run - because the length of recenter_regex.groups() is 4 regardless whether the unit is explicitly specified. If a unit is not specified, it has the format ('x', 'y', 'z', None). The consequence of this is that if the initial center is entered without units, it remains in pixels - it is never converted to Å.