Adjust per-particle defocus of subparticles in volume alignment tools?

Whoops! Sorry about that - I forgot it’s an optional group at the end there (“…so I used regex. Now I have two problems”). Replacing that line with

if recenter_regex.group(4) != "A":

ought to fix that issue at least. Still thinking about why it’s not working…
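
For reference, a minimal standalone sketch of why that comparison is needed: with the optional unit group, group(4) comes back as None when no unit is given, which is != "A" and so still triggers the px→Å conversion.

import re

recenter_pattern = r"([\.\d]+), ?([\.\d]+), ?([\.\d]+) ?(px|A)?"

for s in ["10, 20, 30", "10, 20, 30 px", "10, 20, 30 A"]:
    m = re.match(recenter_pattern, s)
    # group(4) is None when no unit is given, "px" or "A" otherwise;
    # only an explicit "A" skips the pixel-to-Angstrom conversion
    print(repr(s), "->", m.group(4), "| convert to A:", m.group(4) != "A")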

1 Like

Yes, that fixes that issue, but the main issue (resolution getting worse after adjusting defocus) still remains…

The weird thing is it works really well in one case (where the recentering vector is directly along z), but not in other cases - I wonder if that somehow has something to do with it…

1 Like

I think it might do; I had a case a while ago (tangentially defocus related) where modifications along Z were OK but the offset was all messed up - I had screwed up the ZYZ handling in my rotation calculations. :weary:

1 Like

I guess it depends on the format of alignments3D/pose … docs for the scipy transformation are here: from_rotvec — SciPy v1.14.0 Manual

This is also a case where, in the docs for CS tools, it would be useful to have a detailed description of all the data structures in CS in addition to the description of the modules - for example, how pose is specified (maybe it is there already, I just couldn’t find it at a glance).

From here: It's time I admit I don't know what Rodrigues vectors are - #3 by DanielAsarnow

Pose is specified in axis-angle representation, which is what from_rotvec expects by default, and printing alignments3D/pose during the script looks consistent with that. What I don’t quite understand is whether something is going wrong when applying that transformation to the recentering vector - that is all I can think of as the issue, but I can’t figure out why that would be the case…
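
As a quick sanity check of that representation (a toy example, not from the script): the rotation vector is the axis scaled by the angle in radians, so a pose of [0, 0, π/2] should rotate +x onto +y.

import numpy as np
from scipy.spatial.transform import Rotation as R

# axis-angle / rotation vector: direction = rotation axis, length = angle (rad)
pose = R.from_rotvec([0.0, 0.0, np.pi / 2])  # 90 degrees about +z

v = np.array([1.0, 0.0, 0.0])
print(pose.apply(v))  # ~[0, 1, 0]: +x rotated onto +y, as expected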

1 Like

After a bit of time away from the problem, this:

# assume volume was well-centered
recenter_vector_A = map_size_A / 2 - new_center_A

# recenter_vector_A is the vector pointing from the old center to the
# new center in the reference pose. Now we rotate this vector for each
# recentered particle by applying the pose

is looking incorrect to me. This vector points from the new center to the old box center. @olibclarke could you try swapping the order of those operands, i.e.,

recenter_vector_A = new_center_A - map_size_A / 2

and testing that? Asking you to do it to avoid the confounding variable of my slapdash EMPIAR virus map being in the wrong hand or otherwise bad :sweat_smile:.

If this vector is pointing the wrong way, Z would be the correct length but flipped in the Z-only case (which worked), so you may also need to try adding pose_recenter_vec_A to the defocus rather than subtracting it in the row iteration.
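
To make the sign question concrete, here is a tiny sketch with made-up numbers: swapping the operands just negates the vector, so the rotated z-component (and hence the defocus adjustment) flips sign with it.

import numpy as np

map_size_A = 300.0                               # hypothetical box size (A)
new_center_A = np.array([180.0, 150.0, 120.0])   # hypothetical new center (A)

v_old = map_size_A / 2 - new_center_A   # current ordering
v_new = new_center_A - map_size_A / 2   # swapped ordering
print(v_old, v_new)  # v_new == -v_old, so the z-component used for the
                     # defocus shift flips sign along with it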

2 Likes

Hi Rich,

I tried that, but unfortunately it still makes things slightly worse than without adjustment… regardless of whether I add or subtract to adjust the defocus…

Oli

1 Like

Thanks for taking a swing at it for me. I’ll keep thinking!

1 Like

@olibclarke I had to think a little bit, but I believe I have now corrected for different micrograph and particle pixel sizes from cryosparc. I think the defocus transformation should be correct if you don’t use --inverty and do the re-extraction in Relion. In the cryosparc case the Y axis might not be right. Also, it’s not strictly necessary to recenter if the subparticles don’t need a smaller box - if the same box is kept, all recentering does is make the particle origins normally distributed around the origin.

1 Like

@olibclarke Also the function was changing the defocus angle based on a very old misunderstanding I had at the beginning of grad school! Now it will only change the defocus values.

1 Like

Thanks Dan, I will give this a try!! One thing - why is re-extraction in Relion necessary in this case?

It shouldn’t be, I’m just sure it will behave as expected that way. The coordinate flip, either on export or when importing back to CS, might make the origin shift in the wrong direction - I have to check/think about it.

1 Like

Hi @rwaldo,

This script is still very handy! However, it fails using input from volume alignment tools as of CS v4.6.2, as VAT no longer has a particles.ctf slot in the output:

Traceback (most recent call last):
  File "adjust_defocus_subparticles.py", line 81, in <module>
    row["ctf/df1_A"] -= pose_recenter_vec_A[2]
  File "/home/user/.local/lib/python3.8/site-packages/cryosparc/row.py", line 36, in __getitem__
    return self.cols[key][self.idx]
KeyError: 'ctf/df1_A'

VAT output (v4.2):

VAT output (job cloned and run in 4.6.2):

By the way, did you ever figure out why it is working for translations along Z, but not other defocus adjustments?

Cheers
Oli

EDIT:

This error seems to be due to the way metadata is handled in v4.6.2. If I run local refine (v4.6.2) → VAT (v4.6.2) → adjust defocus, everything is fine. If I run local refine (old, v4.2) → VAT (v4.6.2), I get the problem above. Looking at the local refine jobs:

Input from local refine (v4.6.2):

Input from local refine (v4.2):

So if an old local refinement job (or particle set in general) is passed to a new VAT job, the particles.ctf slot is lost. This also causes failure of subsequent reconstruction jobs:
[screenshot: failed reconstruction job]
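
As a quick diagnostic (a sketch assuming the cryosparc-tools Dataset API, with the same placeholder project/job UIDs as the script in this thread), one can check whether the ctf fields survived the VAT job before running the adjustment:

import json
from pathlib import Path
from cryosparc.tools import CryoSPARC

with open(Path("~/instance_info.json").expanduser()) as f:
    cs = CryoSPARC(**json.load(f))

job = cs.find_project("P35").find_job("J1081")   # placeholder UIDs
particles = job.load_output("particles")

fields = particles.fields()
print("has ctf/df1_A:", "ctf/df1_A" in fields)
# field prefixes roughly correspond to the output slots (ctf, alignments3D, ...)
print(sorted({f.split("/")[0] for f in fields if "/" in f}))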

Hi Oli!

This script is still very handy!

I’m happy to hear that!

However, it fails using input from volume alignment tools as of CS v4.6.2, as VAT no longer has a particles.ctf slot in the output […] This error seems to be due to the way metadata is handled in v4.6.2. If I run local refine (v4.6.2) → VAT (v4.6.2) → adjust defocus, everything is fine. If I run local refine (old, v4.2) → VAT (v4.6.2), I get the problem above.

Hmm…I’m having trouble reproducing this. If I run a modern VAT job on particles from an old Homogeneous Refinement my outputs look as expected (like your first image). Could you make a new topic for this problem?

By the way, did you ever figure out why it is working for translations along Z, but not other defocus adjustments?

No, I never found a good dataset to test it on locally…but looking back at the code, could you try changing this line:
pose_recenter_vec_A = pose.apply(recenter_vector_A)
to
pose_recenter_vec_A = pose.inv().apply(recenter_vector_A)

As-is, I think I was moving the particle in the camera coordinates and then reading its shift in its local coordinates, when we want to do the opposite. Let me know if this change helps!
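
For intuition, a toy example (not from the script): whichever frame convention the pose uses, apply() and inv().apply() express the vector in opposite frames, and the z-component used for the defocus shift generally differs between the two.

import numpy as np
from scipy.spatial.transform import Rotation as R

pose = R.from_rotvec([0.4, -0.2, 1.1])   # arbitrary example pose
v = np.array([30.0, 0.0, 0.0])           # hypothetical recentering vector (A)

forward = pose.apply(v)          # rotate v by the pose
inverse = pose.inv().apply(v)    # rotate v by the inverse pose
print(forward[2], inverse[2])    # the z-components differ in general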

1 Like

Hi @rwaldo,

Hmm…I’m having trouble reproducing this. If I run a modern VAT job on particles from an old Homogeneous Refinement my outputs look as expected (like your first image). Could you make a new topic for this problem?

Split out as a separate thread here, thanks!

No, I never found a good dataset to test it on locally…but looking back at the code, could you try changing this line:
pose_recenter_vec_A = pose.apply(recenter_vector_A)
to
pose_recenter_vec_A = pose.inv().apply(recenter_vector_A)

This change unfortunately makes the resolution worse. I have a dataset I can provide for you to test with if useful - one where using the script to modify the defocus of a subparticle on the Z axis improves resolution substantially, but degrades resolution for an off-axis subparticle in the X-Y plane.

EDIT:

I added some debugging, and it seems like after combining a couple of the alterations in this thread, it is now giving max and min z-displacement values which match the measured distance in the structure. Running a test now on a case which should see improvement from this - will report back once the test is complete.

EDIT2:

That did the trick finally!! It is working now, even for off-axis subparticles! Improved resolution by 0.2 Å for an off-axis subparticle! Here is the script I am using - thanks @rwaldo so much for your patience & perseverance!

from cryosparc.tools import CryoSPARC
import json
import numpy as np
from pathlib import Path
import re
from scipy.spatial.transform import Rotation as R

with open(Path('~/instance_info.json').expanduser(), 'r') as f:
    instance_info = json.load(f)

cs = CryoSPARC(**instance_info)
assert cs.test_connection()

project_number = "P35"
workspace_number = "W1"
vol_align_uid = "J1081"

project = cs.find_project(project_number)
job = project.find_job(vol_align_uid)
particles = job.load_output("particles")
volume = job.load_output("volume")

apix = volume["map/psize_A"][0]
print("apix=",apix)
map_size_px = volume["map/shape"][0]
print("map_size_px=",map_size_px)
map_size_A = map_size_px * apix

if "recenter_shift" in job.doc["params_spec"]:
    recenter_pattern = r"([\.\d]+), ?([\.\d]+), ?([\.\d]+) ?(px|A)?"
    recenter_string = job.doc["params_spec"]["recenter_shift"]["value"]
    print("recenter_string=",recenter_string)
    recenter_regex = re.match(recenter_pattern, recenter_string)
    print("len(recenter_regex.groups())=",len(recenter_regex.groups()))
    print("group 4=",recenter_regex.group(4))
    new_center_A = np.array([
        float(x) for x in [
            recenter_regex.group(1),
            recenter_regex.group(2),
            recenter_regex.group(3)
        ]
    ])
    if recenter_regex.group(4) != "A":
        new_center_A *= apix
        print("new_center_if=",new_center_A)
    print("new_center_A=",new_center_A)
else:
    # rather than re-calculate the center of mass, just get it from the job log
    # NOTE: cs.cli.get_job_streamlog() is hidden and experimental --
    #       it may change without warning
    vol_align_log = cs.cli.get_job_streamlog(project_number, vol_align_uid)
    centering_line = [x["text"] for x in vol_align_log if "New center will be" in x["text"]]
    print("centering_line=",centering_line)
    new_center_vox = re.search(r"([\.\d]+), ([\.\d]+), ([\.\d]+)", centering_line[0])
    new_center_vox = np.array([
        float(x) for x in [
            new_center_vox.group(1),
            new_center_vox.group(2),
            new_center_vox.group(3)
        ]
    ])
    new_center_A = new_center_vox * apix

# assume volume was well-centered
recenter_vector_A = new_center_A - map_size_A / 2
print("recenter_vector_A=",recenter_vector_A)

# recenter_vector_A is the vector pointing from the old center to the
# new center in the reference pose. Now we rotate this vector for each
# recentered particle by applying the pose

# iterating over rows is slow...

z_displacements = []

for row in particles.rows():
    pose = R.from_rotvec(row["alignments3D/pose"])
    pose_recenter_vec_A = pose.inv().apply(recenter_vector_A)
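    # Shift both defocus values by the z-component of the rotated recentering
    # vector; the astigmatism angle is left unchanged.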
    row["ctf/df1_A"] -= pose_recenter_vec_A[2]
    row["ctf/df2_A"] -= pose_recenter_vec_A[2]
    z_displacements.append(pose_recenter_vec_A[2])

max_z = np.max(z_displacements)
min_z = np.min(z_displacements)
mean_z = np.mean(z_displacements)

print(f"Max Z displacement: {max_z:.2f} Å")
print(f"Min Z displacement: {min_z:.2f} Å")
print(f"Mean Z displacement: {mean_z:.2f} Å")


out_job = project.create_external_job(
    workspace_uid = workspace_number,
    title = "Defocus-adjusted particles"
)
out_job.add_input(
    type = "particle",
    name = "input_particles"
)
out_job.connect(
    target_input = "input_particles",
    source_job_uid = vol_align_uid,
    source_output = "particles"
)
out_job.add_output(
    type = "particle",
    name = "particles",
    passthrough = "input_particles",
    slots = ["ctf"],
    alloc = particles
)
out_job.save_output("particles", particles)
out_job.log("Job created with adjust-defocus.ipynb")
out_job.stop()

For now I have left a few print statements etc. in there for debugging purposes. @rwaldo it would be super helpful if this could be incorporated as an option in VAT!

EDIT3:
… although now the modified script no longer works for subparticles from the same assembly located on the Z axis, which did work with the original script, so there is still something wrong. Hmm. So now I have one script that works perfectly for subparticles in the X-Y plane, and one that works perfectly for subparticles that are on the Z-axis, but nothing that works for both. Clearly missing something…

EDIT4:
Ah I may have figured out what was going on. The subparticle in the X-Y plane was generated from a volume that had a different handedness from the one that generated the subparticle on the Z-axis. Testing to confirm that is the issue now and will report back.

1 Like

Glad we got it working - thanks as much to your patience as to mine! Depending on the result of the final test, you may need to add pose_recenter_vec_A instead of subtracting it; I didn’t pay too much attention to the sign of the defocus shifts when writing this.

1 Like

Ok, confirmed - this works for both subparticles; you just may need to flip the sign of the adjustment depending on whether the initial hand was correct (subtraction works for a volume with the correct hand):

from cryosparc.tools import CryoSPARC
import json
import numpy as np
from pathlib import Path
import re
import os
from scipy.spatial.transform import Rotation as R

with open(Path('~/instance_info.json').expanduser(), 'r') as f:
    instance_info = json.load(f)

#Example instance_info.json in comment below;
#(license can be found in config.sh, within cryosparc2_master directory):
#{
#        "license": "xxx",
#        "email": "email@email.com",
#        "password": "password",
#        "base_port": 39000,
#        "host": "156.xxx.xxx.xxx"
#}

cs = CryoSPARC(**instance_info)
assert cs.test_connection()

project_number = "P35"
workspace_number = "W1"
vol_align_uid = "J1000"

project = cs.find_project(project_number)
job = project.find_job(vol_align_uid)
particles = job.load_output("particles")
volume = job.load_output("volume")

apix = volume["map/psize_A"][0]
print("apix=",apix)
map_size_px = volume["map/shape"][0]
print("map_size_px=",map_size_px)
map_size_A = map_size_px * apix

if "recenter_shift" in job.doc["params_spec"]:
    recenter_pattern = r"([\.\d]+), ?([\.\d]+), ?([\.\d]+) ?(px|A)?"
    recenter_string = job.doc["params_spec"]["recenter_shift"]["value"]
    print("recenter_string=",recenter_string)
    recenter_regex = re.match(recenter_pattern, recenter_string)
    print("len(recenter_regex.groups())=",len(recenter_regex.groups()))
    print("group 4=",recenter_regex.group(4))
    new_center_A = np.array([
        float(x) for x in [
            recenter_regex.group(1),
            recenter_regex.group(2),
            recenter_regex.group(3)
        ]
    ])
    if recenter_regex.group(4) != "A":
        new_center_A *= apix
        print("new_center_if=",new_center_A)
    print("new_center_A=",new_center_A)
else:
    # rather than re-calculate the center of mass, just get it from the job log
    # NOTE: cs.cli.get_job_streamlog() is hidden and experimental --
    #       it may change without warning
    vol_align_log = cs.cli.get_job_streamlog(project_number, vol_align_uid)
    centering_line = [x["text"] for x in vol_align_log if "New center will be" in x["text"]]
    print("centering_line=",centering_line)
    new_center_vox = re.search(r"([\.\d]+), ([\.\d]+), ([\.\d]+)", centering_line[0])
    new_center_vox = np.array([
        float(x) for x in [
            new_center_vox.group(1),
            new_center_vox.group(2),
            new_center_vox.group(3)
        ]
    ])
    new_center_A = new_center_vox * apix

# assume volume was well-centered
#recenter_vector_A = new_center_A - map_size_A / 2
recenter_vector_A = map_size_A / 2 - new_center_A

print("recenter_vector_A=",recenter_vector_A)

# recenter_vector_A is the vector pointing from the old center to the
# new center in the reference pose. Now we rotate this vector for each
# recentered particle by applying the pose

# iterating over rows is slow...

z_displacements = []

for row in particles.rows():
    pose = R.from_rotvec(row["alignments3D/pose"])
    #pose_recenter_vec_A = pose.apply(recenter_vector_A)
    pose_recenter_vec_A = pose.inv().apply(recenter_vector_A)
    row["ctf/df1_A"] -= pose_recenter_vec_A[2]
    row["ctf/df2_A"] -= pose_recenter_vec_A[2]
    z_displacements.append(pose_recenter_vec_A[2])

max_z = np.max(z_displacements)
min_z = np.min(z_displacements)
mean_z = np.mean(z_displacements)

print(f"Max Z displacement: {max_z:.2f} Å")
print(f"Min Z displacement: {min_z:.2f} Å")
print(f"Mean Z displacement: {mean_z:.2f} Å")


out_job = project.create_external_job(
    workspace_uid = workspace_number,
    title = "Defocus-adjusted particles"
)
out_job.add_input(
    type = "particle",
    name = "input_particles"
)
out_job.connect(
    target_input = "input_particles",
    source_job_uid = vol_align_uid,
    source_output = "particles"
)
out_job.add_output(
    type = "particle",
    name = "particles",
    passthrough = "input_particles",
    slots = ["ctf"],
    alloc = particles
)
out_job.save_output("particles", particles)
full_path = os.path.abspath(__file__)
out_job.log(f"Job created using: {full_path}")
with out_job.run():
    out_job.log("Trying to log script now...")
    with open(__file__, "r") as script:
        out_job.log("".join(script))
out_job.stop()
2 Likes

Hi @rwaldo, we used this script here:

And it really helped! It improved the resolution of the cap subparticle from 2.5 Å (the best we could do with per-subparticle defocus refinement) to 2.25 Å (with per-subparticle defocus adjustment using this script).

It would be great to have this integrated into volume alignment tools, to facilitate more streamlined block-based reconstruction approaches! :pray:

Also, for scripts like this, it would be good to have a permanent, citable home - e.g. GitHub or Zenodo, or perhaps a custom solution - to allow citing & reuse of scripts, with proper versioning?

Cheers
Oli

2 Likes

Just revisiting this - I am still a bit puzzled. I have a single script that works, and I have it set up to make two jobs - one with a positive and one with a negative adjustment to defocus:

from pathlib import Path
import json
import re
import argparse
import numpy as np
from scipy.spatial.transform import Rotation as R
from cryosparc.tools import CryoSPARC

# Parse command-line arguments
parser = argparse.ArgumentParser(
    description="Apply analytic defocus adjustments to CryoSPARC particle sets"
)
parser.add_argument(
    "--project", "-p",
    required=True,
    help="CryoSPARC project UID (e.g., P10)"
)
parser.add_argument(
    "--workspace", "-w",
    required=True,
    help="CryoSPARC workspace UID for new jobs (e.g., W3)"
)
parser.add_argument(
    "--job", "-j",
    required=True,
    help="CryoSPARC job UID for the volume alignment job (e.g., J244)"
)
parser.add_argument(
    "--instance-info", "-i",
    default=str(Path.home() / 'instance_info.json'),
    help="Path to CryoSPARC instance_info.json"
)
args = parser.parse_args()

# Load connection info
info_path = Path(args.instance_info)
with open(info_path, 'r') as f:
    instance_info = json.load(f)

# Connect to CryoSPARC
cs = CryoSPARC(**instance_info)
if not cs.test_connection():
    raise RuntimeError("Could not connect to CryoSPARC server")

# UIDs from arguments
PROJECT_UID = args.project
WORKSPACE_UID = args.workspace
VOL_ALIGN_UID = args.job

# Load project, job, volume, and particles
project = cs.find_project(PROJECT_UID)
job = project.find_job(VOL_ALIGN_UID)
volume = job.load_output("volume")
particles = job.load_output("particles")

# Map center (Å)
apix = volume['map/psize_A'][0]
map_size_px = volume['map/shape'][0]
map_center = (map_size_px * apix) / 2

# Determine new center shift (Å)
if 'recenter_shift' in job.doc.get('params_spec', {}):
    s = job.doc['params_spec']['recenter_shift']['value']
    m = re.match(r"([\d.]+),\s*([\d.]+),\s*([\d.]+)\s*(px|A)?", s)
    if not m:
        raise ValueError(f"Cannot parse recenter_shift: {s}")
    center_vals = np.array([float(m.group(i)) for i in (1,2,3)])
    new_center_A = center_vals * (apix if m.group(4) != 'A' else 1)
    print("Center calculated from params.")
else:
    log_entries = cs.cli.get_job_streamlog(PROJECT_UID, VOL_ALIGN_UID)
    text = next(x['text'] for x in log_entries if 'New center' in x['text'])
    m = re.search(r"([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    new_center_A = np.array([float(m.group(i)) for i in (1,2,3)]) * apix
    print("Center extracted from log.")

delta = new_center_A - map_center

# Compute per-particle defocus shift
dz_vals = []
for row in particles.rows():
    Rmat = R.from_rotvec(row['alignments3D/pose']).as_matrix()
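    # Rmat[:, 2] is the third column of the rotation matrix (the rotated z-axis);
    # its dot product with delta picks out the component of the recentering
    # offset along that axis, used later as the per-particle defocus change.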
    beam_dir = Rmat[:, 2]
    dz_vals.append(float(np.dot(beam_dir, delta)))

dz_array = np.array(dz_vals)
mean_dz = dz_array.mean()
std_dz = dz_array.std()

# Create adjustment jobs
for sign, label in [(1, 'plus'), (-1, 'minus')]:
    parts = job.load_output('particles')
    for row, dz in zip(parts.rows(), dz_array):
        row['ctf/df1_A'] -= sign * dz
        row['ctf/df2_A'] -= sign * dz

    title = f"Defocus-adjusted_{label}"
    out_job = project.create_external_job(
        workspace_uid=WORKSPACE_UID,
        title=title
    )
    out_job.add_input(type='particle', name='input_particles')
    out_job.connect(
        target_input='input_particles',
        source_job_uid=VOL_ALIGN_UID,
        source_output='particles'
    )
    out_job.add_output(
        type='particle', name='particles', passthrough='input_particles',
        slots=['ctf'], alloc=parts
    )
    out_job.save_output('particles', parts)
    out_job.log(f"Applied analytic Δz {label}: mean={mean_dz * sign:.2f} Å, std={std_dz:.2f} Å")
    out_job.stop()
    print(f"Created job '{title}'")

print("All adjustment jobs created.")

I expected that flipping the sign would only be necessary if the handedness of the reconstruction was inverted - but that doesn’t seem to be the case.

For example, for the vault particle, a negative adjustment is required for subparticles at the ends of the vault in order to improve resolution, while a positive adjustment is required for subparticles on the side of the vault (displaced from the Cn axis).

Similarly, a positive adjustment is required when adjusting the defocus of subparticles at the corner of the ryanodine receptor. So I feel there must be a bug here somewhere, but I can’t see it for the life of me - the math should be the same no matter what particle/symmetry one is dealing with, and it should certainly be the same for different subparticles in the same assembly…

In practice maybe this doesn’t matter - I try both and see which one gives a resolution improvement after reconstruction - but I worry that perhaps it is applying the correct adjustments for some subparticles in a set, but not for others. Perhaps it is something to do with how the poses of symmetry-expanded particles are specified?

Vault cap:

RyR corner:

Cheers
Oli

Hey @olibclarke,

This is a great observation - I believe it is related to the Ewald sphere sign in the dataset! CryoSPARC treats particle image data such that the image lies in the xy plane and the +z direction (using a right-handed coordinate system) points upwards towards the electron source. But since it’s possible that the particle images have been reflected relative to their physical orientation (presumably at some point during imaging), it’s possible that the +z direction actually points away from the electron source, in which case a negative Ewald sign is needed to correct this.

So any time the global z coordinate in the frame of the microscope is needed, the sign ambiguity needs to be considered, the same as in the Ewald case. If you have time to spare, could you try the following for each dataset:

  • Run Homo reconstruct with Ewald sphere correction enabled, once with a positive and once with a negative Ewald sphere sign (without adjusting defocus for now), and compare this to Homo reconstruct with no Ewald sphere correction applied (if possible using the same mask on all three jobs).
  • If a resolution improvement or worsening is observed in the negative or positive case relative to the uncorrected case:
    • Does the sign of the defocus correction that worsens results agree with the sign of the Ewald sphere correction that worsens resolution, or do they disagree? Is the pattern consistent between both particle datasets?
    • Likewise for the sign of the defocus correction that improves results?

These would be super helpful to know, as they would confirm whether this is indeed the same discrepancy as the Ewald sign ambiguity.

Thanks and best regards,
Michael

Hi @mmclean - I will run these tests!

I had also wondered about this possibility; however, the thing I find puzzling is that for the vault, the sign change is different within the same dataset - the subparticles on the particle Z-axis (the caps) require a different sign change from the subparticles on the vault walls (displaced orthogonally to the Cn axis).

This doesn’t at first seem consistent with the Ewald sign hypothesis, unless CS is somehow treating D39 symmetry-expanded particles (as is the case for the “waist” subparticles) differently? In any case this is helpful - I will run some tests and see what we see!

Cheers
Oli

EDIT:

Here is what I mean - same dataset, different subparticle, and the required sign change is different. Should the Ewald ambiguity be different for different subparticles? If so, where does the additional reflection during image processing come from?