Experimental support for Relion's Bayesian polishing in csparc2star.py

My code comment is wrong actually - it is cryoSPARC that doesn’t flip the TIFs.

I think that differences in local motion patch interpolation between different motion correction is somehow scrambling the relationship with the refined particle shifts. So far I’ve only got a resolution enhancement from BPP if I re-extract and re-refine the particles based on the Relion or MotionCor2 motion correction. Going directly from csparc should work in principle but it seems something isn’t right. I’m reasonably sure the conversion convention I’m using is right, but none of the other options worked either.

Edit to clarify:
I’ve done Scenario 1 and 3 successfully, 3 also works if you add a re-extract and re-refine from the Relion MotionCorr job.

2 Likes

This makes sense. Insightful point. Thanks for clarifying your own experience as well.

Cheers,
Yang

Hi @DanielAsarnow

Wondering if there’s a quick way of getting around an apparent bug in the CS MC2 wrapper.

Looking in the .cs file, while 'movie_blob/psize_A' reflects 0.83, 'rigid_motion/psize_A' appears to be truncated to 0.. At least I think that’s why csparc2star.py --movie is returning divide_by_zero errors. micrograph_blob/vmin and micrograph_blob/vmax seem to be similarly truncated, so I think it’s a general parsing issue.

/home/ylee/software/pyem/pyem/metadata/cryosparc2.py:263: RuntimeWarning: divide by zero encountered in true_divide
  traj /= cs['rigid_motion/psize_A'][i]  # Looks right.
/home/ylee/software/pyem/pyem/metadata/cryosparc2.py:263: RuntimeWarning: invalid value encountered in true_divide
  traj /= cs['rigid_motion/psize_A'][i]  # Looks right.

Do you normally run MC2 outside of CS? The wrapper doesn’t support the -LogDir option in ≥1.4.6 leaving no convenient way of issuing -OutStar 1 from within CS.

EDIT:
Never mind. Easier to edit the .cs file itself.

EDIT2:
I see. csparc2star.py --movie attempts to correct for the axis inversion, which isn’t appropriate for MC2 inputs.

Cheers,
Yang

Hi @DanielAsarnow

Some further investigations. I repeated some tests with EMPIAR-11350 to see where things may be failing and which workflows are possible.

Scenario 1: RELION to CS to RELION (baseline)
  1. tif movies motion-corrected with RelionCor.
    → rlnMicrographs (blob)-picked and processed in CS to obtain particle stack (3.1-3.2Å).
    csparc2star.py --inverty particle conversion for polishing with RelionCor outputs.
    half_maps and mask_fsc_auto used as is for postprocessing.
    shiny particles yield better results (~2.8-2.9Å; better FSC at medium resolution).
Scenario 2: MC2(wrapper)/CS to RELION (sanity check)
  1. tif movies motion-corrected with MotionCor2 within CS.
    → pre-shiny particle coordinates from S1 used to reextract from mc2Micrographs (Recenter using aligned shifts disabled).
    → Particles refined against original reference (3.1-3.2Å).
    csparc2star.py --movie to generate mc2Micrograph and motion star files*.
    cpsarc2star.py --inverty particle conversion for polishing.
    half_maps and mask_fsc_auto used as is for postprocessing.
    shiny particles yield better results (~2.8-2.9Å; better FSC at medium resolution).

*polishing attempted with positive and negative orientations of _rlnMicrographShiftX, both giving the same outcome, i.e. modelled particle trajectories and refinement FSC. Result seems insensitive to global_shift priors when 30-40 particles/mic to track?

Scenario 3: CS to RELION
  1. a) tif movies Patch Motion-corrected in CS.
    → pre-shiny particle coordinates from S1 used to extract from csMicrographs (Flip mic in Y before extract enabled; Recenter using aligned shifts disabled).
    → Particles refined against original reference (3.1-3.2Å).
    csparc2star.py --movie to generate csMicrograph and motion star files.
    cpsarc2star.py --inverty particle conversion for polishing.
    half_maps and mask_fsc_auto used as is for postprocessing.
    shiny particles yield better results (~2.8-2.9Å; better FSC at medium resolution; maybe slightly worse than S1 and S2, which may or may not be stochastic, but at least it didn’t fail).
  • b) tif movies Patch Motion-corrected in CS.
    → pre-shiny particle coordinates from S1 Y-inverted and used to extract from csMicrographs (Recenter using aligned shifts disabled).
    → Particles refined against original reference (3.1-3.2Å).
    csparc2star.py --movie to generate csMicrograph and motion star files.
    cpsarc2star.py --flipy particle conversion for polishing.
    half_maps and mask_fsc_auto ±(flipZ+Y) for postprocessing.
    shiny particles, both ±(flipZ+Y) polish_reference, yield worse results (~4.5Å).

At least with this dataset, BPP seems largely insensitive to motion initialization inputs (see S2).

S3a gave an improvement from BPP, suggesting that Patch Motion in CS isn’t inherently the primary determinant when things blow up.

However, the differing S3a/b outcomes, depending namely on whether the mics are pre-flipped (a) or the coordinates/transformations are flipped later (b), suggests something isn’t quite right either with the way Relion is perceiving the --flipy-transformed particles, and/or the treatment of the reference?

Cheers,
Yang

@leetleyang Thanks, this is great. Can you test 3b again with line 95 of csparc2star.py commented out? Like that:

95: # df[star.Relion.ORIGINY] = -df[star.Relion.ORIGINY]

Hi @DanielAsarnow

Sure, I’ll give that a go.

In the mean time, I’ve tried to dig a bit into the discrepancy. I took the pre-shiny particle images from S3b, flipped them in Y (relion_image_handler --flipY), replaced the original stacks and refined against the original reference, in order to simulate the necessary shifts/transformations required if particles were sourced from Y-flipped mics.

Averaged across a few thousand particle images, the differences in rlnAngleRot, rlnAngleTilt, rlnAnglePsi, rlnOriginXAngst and rlnOriginYAngst, compared to right-side up particle images seemed best described by…

  • phi+180, 180-theta, -psi, XAngst (as is), -YAngst

Note, it’s similar to the transformation suggested by @sunch here, sans Y-shift inversion. That additional element is incorporated in --flipy. That one difference is perhaps why @olibclarke didn’t quite have success with just the Euler angle modifications at the time.

Scenario 3: CS to RELION
  1. c) tif movies Patch Motion-corrected in CS.
    → pre-shiny particle coordinates from S1 Y-inverted and used to extract from csMicrographs (Recenter using aligned shifts disabled).
    → Particles refined against original reference (3.1-3.2Å).
    csparc2star.py --movie to generate csMicrograph and motion star files.
    csparc2star.py particle conversion for polishing.
    application of the above .star file modifications using awk.
    half_maps and mask_fsc_auto used as is for postprocessing.
    shiny particles now yield comparable, if not identical results to S3a.

I’ve also been wondering about the slight difference in polishing outcomes when sourcing from RC/MC2 vs PM in CS.

The FSC is improved vs un-polished particles in both cases, but RC/MC2 has been giving slightly better FSCs. I wondered earlier if it was stochastic, but I’m beginning to think that it’s due to the fact that RC/MC2 and PM do not share the same reference frame–PM has no reference frame. Meaning if particles are extracted from csMicrographs, the particles won’t be exactly where they should be when Relion goes back to the movies. That is, unless csparc2star.py corrects for that?

Perhaps another reason to stick to RC/MC2, for consistency.

Cheers,
Yang

Hi @DanielAsarnow

I gave this a go.

# Reference particle (Rot, Tilt, Psi, OriginX, OriginY)
-69.794205 65.145355 171.596817 -22.796064 12.710671

# Original csparc2star.py --flipy
69.794205 65.145354 171.596810 -22.796064 -12.710671

# Modified csparc2star.py --flipy
69.794205 65.145354 171.596810 -22.796064 12.710671

Polished against ±(flipZ+Y) maps/masks, unfortunately, both of which produced the same outcome as S3b.

Cheers,
Yang

Which step is the “reference” particle orientation from? What are the parameters for that particle in the other two scenarios?

I guess

[[1, 0, 0],
 [0, -1, 0],
 [0, 0, -1]]

might be the wrong transformation…or the idea about irreconcilable local motion differences is right (which it shouldn’t be, we should only have to care about the particle in the raw movie).

S3b particles following refinement against original reference (i.e. 3.1-3.2Å map), then subject to csparc2star.py conversion without any additional flags.

And to confirm that the change to csparc2star.py has had the desired effect, the following two lines are from the same input particles but converted with a) the original --flipy, and b) --flipy with line 95 commented out.

Cheers,
Yang

OK, got it. Those changes are expected then. They’re just not right somehow.

Beginning from the reference particle data from S3b, a favourable polishing outcome (see S3c above) can be attained with the following star file modifications:

(AngleRot)+180, 180-(AngleTilt), -(AnglePsi), OriginX, -(OriginY)

Apologies. I don’t know what the proper mathematical notation for it is.

At least with EMPIAR 11350, BPP seems largely insensitive to the fidelity of the motion initialization. However, what is different are the coordinates for where the particles are, even with --inverty properly applied.

RC/MC2 averages motion relative to the first movie frame. PM does not do this, as far as I can tell. So PM-derived coordinates will be off, possibly by quite a fair bit, when referencing the original movie frames.

A re-extraction from RC/MC2-derived micrographs probably allows this to be compensated for by absorbing the difference into the Euler angles and X/Y-shift transformations.

Cheers,
Yang

It’s the same transformation. Maybe the first-frame issue is key. I need to look at the polishing code again or maybe ask Takanori.

Apologies. Is it meant to be the same transformation?

Comparing the outputs with and without --flipy, other than the change to OriginY, the flag appears to have the effect of 1) multiplying AngleRot by -1, which is quite different from +180°, 2) leaves AngleTilt untouched, which is also quite different from 180°-angletilt (except where the angle in question is 90°), and 3) leaves AnglePsi untouched as well.

Or perhaps is the transformation the same but relative to a reference that is flipped in Z and Y?

Cheers,
Yang

1 Like

Great questions. The transformation is applied to the rotation matrix derived from the particle Euler angles, and they’re not different, because of the degeneracy of the Euler angles. Here’s how I found the transformation (aside from the 1,-1,-1 signature “making sense” given what we are trying to do).

import numpy as np
from pyem import geom
r1 = np.array([-69.794205, 65.145355, 171.596817])
r2 = np.array([-69.794205 + 180, 180 - 65.145355, -171.596817])
m1 = geom.euler2rot(*np.deg2rad(r1))
m2 = geom.euler2rot(*np.deg2rad(r2))
tf = m1.dot(m2.T)
print(tf)
[[ 1.00000000e+00 -7.15192393e-17  1.10567260e-16]
 [-4.96956493e-17 -1.00000000e+00 -9.66247528e-17]
 [ 2.02288402e-16  5.73985227e-17 -1.00000000e+00]]

So we see tf has that [1, -1, -1] on the diagonal.

But now I realize that this transformation only does what we want if it’s used on the left of the particle matrix. My transformation code multiplies on the right - because that is what you need if you want the rotation to apply to the map. Here though we are considering the particle - and that is the inverse matrix!

Thus, the solution is to invert (transpose) the particle matrices before multiplying by our special 1,-1,-1 identity-ish matrix, then transpose them again afterwards.

Can you please update your pyem, and then repeat your test? We’ll need to check with and without that line 95 comment.

1 Like

This map vs particle perspective explanation makes sense, although I’m only able to follow in abstract terms. :slight_smile:

Sure. I git clone-d the updated version, but csparc2star.py --flipy is returning the following error:

Traceback (most recent call last):
  File "<redacted>/pyem/csparc2star.py", line 177, in <module>
    sys.exit(main(parser.parse_args()))
  File "/<redacted>/pyem/csparc2star.py", line 97, in main
    df = star.transform_star(df, np.array([[1, 0, 0], [0, -1, 0], [0, 0, -1]]), leftmult=True)
  File "/<redacted>/pyem/pyem/star.py", line 586, in transform_star
    angles = np.rad2deg(rot2euler(newrots))
  File "/<redacted>/software/miniconda3/envs/pyem/lib/python3.9/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/<redacted>/software/miniconda3/envs/pyem/lib/python3.9/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
- Resolution failure for literal arguments:
reshape() supports contiguous array only
- Resolution failure for non-literal arguments:
reshape() supports contiguous array only

During: resolving callee type: BoundFunction(array.reshape for array(float64, 3d, A))
During: typing of call at <redacted>/pyem/pyem/geom/convert_numba.py (29)


File "<redacted>/pyem/pyem/geom/convert_numba.py", line 29:
def rot2euler(r, out=None):
    <source elided>
    epsilon = 1e-16
    r = r.reshape(-1, 9).reshape(-1, 3, 3)
    ^

Does the conda environment/dependencies require an update as well?

Cheers,
Yang

My double transpose is causing the array to not be byte contiguous, which is interfering with some optimized numpy code. I need to add a np.require call I think. I’ll have time in a couple of hours.

Thanks again for testing so gamely, I really appreciate it!

No problem. Happy to help test it. We picked it up hoping to avoid having to redo motion correction for an active dataset, but then got stuck in and have since redone it six ways to Sunday.

Cheers,
Yang

Sorry for the delay. Should work now.

Hi @DanielAsarnow

pyem newly git clone-d today.

Examplar particle data
# Reference particle (AngleRot, AngleTilt, AnglePsi, OriginX, OriginY)
-69.794205 65.145355 171.596817 -22.796064 12.710671

# Original csparc2star.py --flipy
69.794205 65.145354 171.596810 -22.796064 -12.710671

# Manual awk modification from S3c
110.206 114.855 -171.597 -22.796064 -12.7107

# Updated csparc2star.py --flipy 
110.205795 114.854646 -171.596810 -22.796064 -12.710671

# Updated csparc2star.py --flipy (# line 95)
110.205795 114.854646 -171.596810 -22.796064 12.710671

bfactor.eps from BPP looks good now. With --flipy as coded, refinement of shiny particles is working as well as in S3a/c. Relative to that, commenting out line 95 led to a slightly worse outcome–still better than unpolished.

I think the inversion of OriginY is the correct transformation. The fact that BPP didn’t blow up completely with the wrong Y-shift may be down to the translation errors being interpreted as, and subsequently absorbed into particle motion to a degree during tracking.

For reference, below is the rlnOriginYAngst distribution in this particular input particle data (pre-shiny).

Y-shift distribution

There still remains a disrepancy between PM- vs RC/MC2-derived data, with the latter producing better results. This could be down to the first-frame issue. It’s probably worth comparing with different datasets in order to corroborate.

Cheers,
Yang

Have you tested if recentering the coordinates has any impact? With star.py --recenter for example…BPP itself doesn’t do any recentering.

I think you’re right about some shift errors being correctable during tracking. If you imagine bringing each frame of a particle into register with a reference projection, then it could be that this represents a physical motion trajectory from frame to frame, or random shift errors on each frame. I think the extent to which it is feasible to align individual frames of a given particle determines the importance of a good initial trajectory. For example a ribosome could be alignable regardless of the initial trajectory while a smaller particle might require the restraints from other nearby particles or a very good initial trajectory.

Last question - if you run polishing iteratively, does the 2nd run based on PM catch up to the RC/MC2 results?

Thanks again for testing, it’s very satisfying to figure out what was wrong with my transformation before.