Experimental support for Relion's Bayesian polishing in csparc2star.py

I think --flipy is needed here (in addition to --inverty) because you changed between MC2 mics and Patch Motion mics, which are going to be physically flipped. Aside from the new defocus change, --flipy only affects alignment parameters that weren’t used by your new refinements.

So if you reconstruct the post-polishing particles without refinement, and do the same with a new file created by --flipy (you can just change the particle paths to the polished ones), then the latter should come out better. New refinements with the latter should also go directly to 2.2 Å and not be stuck at 2.9 Å.

This is the flipy code:

Hi Dan,

When you say --flipy, isn’t this the default behavior now, without explicitly adding a flag? In any case, it seems like it is fixed now, thanks!

I double-checked: the original csparc2star.py job was run without flags, which resulted in Y-flipped coords that matched the Patch Motion mics (which the original coords in cryoSPARC did not), but which gave the pseudo-tetrafoil pathology described above (and limited resolution to 2.95 Å).

Re-running the same command with the new version gives the correct results, and resolution goes straight to 2.2 Å, same as when I re-extract with force-extract CTFs - thanks so much for your help!

Cheers
Oli

Hi Oli and Dan,

Just to clarify, with the latest commit, when is the defocus angle triggered to flip? Skimming the changes to csparc2star.py suggests it’s implemented with --flipy. However, is --flipy active without an explicit flag? My understanding was that it’s not enabled without the flag.

I’m considering scenarios in which no transformations beyond re-inverting the --inverty are required, e.g. importing averaged micrographs and exporting particle data.

Cheers,
Yang

I don’t know, but something definitely changed without flags, because before it didn’t work (it gave the anomalous behavior), and now (with the new build) it works.

Nothing changed without flags*, just the defocus flip was added to the --flipy block. Default --flipy is false and default --inverty is true. --inverty only changes coordinates so 1) if it’s wrong you’re not going to get good particles (2DCA should be much worse) and 2) it won’t help if particles are upside down.

*I did unintentionally make --inverty a no-op, but it was fixed over a year ago I think.
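
To make the "default true, flag negates" semantics concrete, here is a minimal sketch using argparse. It is only an illustration of the defaults as described in this thread, not the actual csparc2star.py argument parser; the program name and the `invert_y` / `flip_y` destination names are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical stand-in parser; it only mirrors the defaults described above.
    parser = argparse.ArgumentParser(prog="flag_semantics_sketch")
    # store_false: the value defaults to True, and passing the flag sets it to False.
    parser.add_argument("--inverty", dest="invert_y", action="store_false",
                        help="switch OFF the default Y-coordinate inversion")
    # store_true: the value defaults to False, and passing the flag sets it to True.
    parser.add_argument("--flipy", dest="flip_y", action="store_true",
                        help="switch ON the flip of alignments and defocus angle")
    return parser


def describe(argv):
    """Return the effective settings for a given argument list."""
    args = build_parser().parse_args(argv)
    return {"invert_y": args.invert_y, "flip_y": args.flip_y}


print(describe([]))             # no flags: inversion on, flip off
print(describe(["--inverty"]))  # flag given: inversion off, flip off
print(describe(["--flipy"]))    # inversion still on, flip on
```

Under this reading, the no-argument case already inverts the y coordinates, and passing --inverty is what turns that inversion off.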

Hmm - this is confusing then (I must have stuffed something up somewhere I guess but cannot see where). I will investigate and report back. Sorry!

Or it was from a version with wrong/nonfunctional --inverty semantics?

OK, so I was doing something stupid during testing (running test refinements without re-extracting).

Once I tested with re-extraction, the new version no longer has the pseudo-tetrafoil pathology, all good I think.

Shouldn’t it be the same for inverty though? Won’t that also need the same operation (assuming we are re-extracting from flipped mics)?

If you have particles in CS and just want to export to Relion, then import and re-extract in CS for further processing, you need --inverty alone (--inverty keeps the same coords across repeated export/import cycles).

In the polishing or changed-motion-correction case, you also re-extract from the physically flipped micrographs, hence the added --flipy.

Right - but both --inverty (no flag) and --flipy are giving visually identical re-extraction results; it is just that with --inverty the pseudo-tetrafoil artefact remains, while with --flipy it does not. I feel like I am missing something basic…

If I compare the x,y coords for --flipy and --inverty, the coords are identical, it is only the orientations that have changed:

header:
data_optics

loop_
_rlnVoltage #1
_rlnImagePixelSize #2
_rlnSphericalAberration #3
_rlnAmplitudeContrast #4
_rlnOpticsGroup #5
_rlnImageSize #6
_rlnImageDimensionality #7
_rlnOpticsGroupName #8
300.000000 0.846000 0.001000 0.100000 9 384 2 opticsGroup9

data_particles

loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnOpticsGroup #15
_rlnRandomSubset #16
_rlnClassNumber #17

--flipy
000001@J699/extract/000002012550007131087_m23jan02c_protein_gr3_00001gr_00060sq940_v02_00004hln_00012enn.frames_patch_aligned_doseweighted_particles.mrc J689/motioncorrected/000002012550007131087_m23jan02c_protein_gr3_00001gr_00060sq940_v02_00004hln_00012enn.frames_patch_aligned_doseweighted.mrc 5498 1820 59.355159 133.586372 -35.855327 0.270295 0.362194 15679.165039 13611.887695 -268.947632 0.000000 0.000000 9 2 1
000001@J699/extract/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted_particles.mrc J689/motioncorrected/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted.mrc 1559 1517 82.271126 123.746246 125.727255 -0.244724 0.371624 15213.296875 14574.656250 -29.939199 0.000000 0.000000 9 2 1
000002@J699/extract/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted_particles.mrc J689/motioncorrected/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted.mrc 3703 2280 -36.667250 120.965685 -150.369977 0.140119 0.171844 15233.381836 14594.741211 -29.939199 0.000000 0.000000 9 1 1


--inverty
000001@J699/extract/000002012550007131087_m23jan02c_protein_gr3_00001gr_00060sq940_v02_00004hln_00012enn.frames_patch_aligned_doseweighted_particles.mrc J689/motioncorrected/000002012550007131087_m23jan02c_protein_gr3_00001gr_00060sq940_v02_00004hln_00012enn.frames_patch_aligned_doseweighted.mrc 5498 1820 -120.644836 46.413628 35.855328 0.270295 -0.362194 15679.165039 13611.887695 268.947632 0.000000 0.000000 9 2 1
000001@J699/extract/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted_particles.mrc J689/motioncorrected/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted.mrc 1559 1517 -97.728874 56.253754 -125.727249 -0.244724 -0.371624 15213.296875 14574.656250 29.939199 0.000000 0.000000 9 2 1
000002@J699/extract/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted_particles.mrc J689/motioncorrected/000005093389571592472_m23jan02c_protein_gr3_00001gr_00058sq940_v03_00009hln_00022enn.frames_patch_aligned_doseweighted.mrc 3703 2280 143.332748 59.034313 150.369980 0.140119 -0.171844 15233.381836 14594.741211 29.939199 0.000000 0.000000 9 1 1

It would seem that in both cases the defocus angle negation would be required when re-extracting from flipped mics? Sorry if I am being dim.

--inverty vs. no --inverty causes the y-coord to be inverted (subtracted from the y size).
--flipy doesn’t do anything to the coordinates (but the alignment parameters, and now the defocus angle, that follow the y-coordinate are different).
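
For reference, the relationship between the two sets of rows posted above can be written down as a small sketch. This is inferred purely from those three example particles, not taken from the pyem source: relative to the no-flag output, the --flipy output has rot shifted by 180° (wrapped), tilt replaced by 180° − tilt, and psi, the Y origin and the defocus angle negated. The function names are hypothetical.

```python
def wrap180(angle):
    """Wrap an angle in degrees into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def flip_y_params(rot, tilt, psi, origin_y, defocus_angle):
    """Map no-flag parameters to the values seen in the --flipy rows (inferred)."""
    return (wrap180(rot + 180.0), 180.0 - tilt, -psi, -origin_y, -defocus_angle)


# (rot, tilt, psi, originY, defocus angle) copied from the rows posted above.
no_flag_rows = [
    (-120.644836, 46.413628, 35.855328, -0.362194, 268.947632),
    (-97.728874, 56.253754, -125.727249, -0.371624, 29.939199),
    (143.332748, 59.034313, 150.369980, -0.171844, 29.939199),
]
flipy_rows = [
    (59.355159, 133.586372, -35.855327, 0.362194, -268.947632),
    (82.271126, 123.746246, 125.727255, 0.371624, -29.939199),
    (-36.667250, 120.965685, -150.369977, 0.171844, -29.939199),
]

for base, expected in zip(no_flag_rows, flipy_rows):
    got = flip_y_params(*base)
    # Values were written as single-precision floats, so compare with a tolerance.
    assert all(abs(g - e) < 1e-4 for g, e in zip(got, expected))
```

The X/Y coordinates, X origin and defocus values are identical in both outputs, which is why they are left out of the sketch.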

Right - so if --inverty is the default with no flags, and --flipy does nothing to the coordinates, then I think both should have negated defocus angle? Or maybe just have it as a separate flag (because whether the defocus angle needs to be changed depends on if you are re-extracting or not)

Because in both cases if you extract, you are extracting from flipped mics, where the defocus angle is changed, whereas if you are just using the already extracted particle stack (from the original mics) you will want to keep the defocus angle constant.

If you have just done a cryoSPARC workflow, and then do a classification in Relion, and then continue in cryoSPARC including future re-extractions then you need --inverty but the micrographs are never physically inverted so the alignments and defocus don’t need adjustment. I believe that is the most common use case. When cryoSPARC originally shipped you would always import MC2 corrected micrographs and external coordinates, so that’s still the “no arguments” case.

If you have physically inverted micrographs - because motion correction has been rerun in another program - then the alignment parameters and defocus angles need to be flipped (no matter what coordinates are used).

I think that’s all totally fair, but I still think it may be a little confusing for users, because both options are compatible with extraction from flipped mics (the coordinates are the same, and we don’t care about the orientations if we are re-doing refinement anyway).

Maybe worth printing an info message at runtime indicating whether or not the defocus angle is being modified for compatibility with extraction from flipped mics? Maybe I’m overthinking it though. Anyway, --flipy works great!

Cheers
Oli

If I may offer an observation on convention related to this point.

I’ve noticed that our users are often confused by the inverty in-the-background operation (i.e. sans explicit flag), especially, but not only, when working on non-TIFF movies. It’s a not-insignificant effort explaining to them why, in practice, they have to inverty (i.e. with flag) in order to not inverty. In effect, they have to understand 1) the original basis for the inversion to begin with, 2) what csparc2star.py does behind the scenes, and 3) how that convolves with their specific circumstance.

Are there scenarios in which the inverty and flipy operations (not flags) have to be applied singularly? It seems to me that flipy always necessitates inverty. And situations that require inverty are, at worst, neutral to the flipy operation. Perhaps it would make some sense to wrap both operations into one, and have the no-argument case be otherwise passive? Unless I’ve missed something.

(EDIT: or to maintain modularity, have the --inverty flag trigger the inversion rather than negate a background process.)

As an aside, this seems to be becoming more frequent with the growing popularity of EER, due to TFS pushing the F4EC and core facilities deciding it’s easier to deal with one vendor (microscope+detector) than face having TFS and Gatan busy blaming each other when issues arise with a K3 installation.

Ultimately, we’re very grateful to even have the pyem repository to call upon.

Cheers,
Yang

EDIT: for formatting. Lacking the appropriate punctuation marks on my phone previously.

Yes, again, if you have a pure cryoSPARC workflow except for e.g. Class3D in Relion, then you must use only the --inverty argument. Clearly this is the most common use case (as of cryoSPARC 2.5+ or something), but changing the semantics of the argument would break every past script or protocol, which IMO is the worse consequence.

CryoSPARC and Relion use inverted y-axis conventions, and MC2 and RelionCorr output flipped micrographs. All we can do is make them work together despite these differences. Now, a real possibility is to make --flipy imply --inverty (but not the reverse). Another is to add new, clearer aliases for the arguments, perhaps “--invertycoords” and “--flipparticles”.

Thanks for implementing this functionality, this will be very helpful.
I am encountering a strange error when I try polishing. When I try searching for this error the only hit is from the Relion source code, so I am not sure how to troubleshoot this. I am attaching the run.err and an example movie metadata star file. Any ideas?

MetaDataTable::setValueFromString BUG: offset should not be negative here....

Thanks,
Alan

Hm, I want to say that is actually caused by a space or extra line or something like that causing an extra field / misaligned field. Can you delete _ucsfUid?
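
If editing the file by hand is impractical, one possible way to delete a column such as _ucsfUid is sketched below. It is a hypothetical helper, not part of pyem or Relion, and it assumes simple whitespace-separated loop_ blocks with no quoted values containing spaces. Note that it also normalizes the whitespace of every data row it rewrites.

```python
def drop_star_column(lines, label):
    """Return STAR-file lines with `label` removed from every loop_ block."""
    out = []
    i, n = 0, len(lines)
    while i < n:
        line = lines[i]
        out.append(line)
        i += 1
        if line.strip() != "loop_":
            continue
        # Collect the label lines (e.g. "_rlnCoordinateX #3") that follow loop_.
        labels = []
        while i < n and lines[i].lstrip().startswith("_"):
            labels.append(lines[i].split()[0])
            i += 1
        drop = labels.index(label) if label in labels else None
        kept = [name for name in labels if name != label]
        # Re-emit the remaining labels with renumbered "#N" suffixes.
        out.extend("%s #%d" % (name, k) for k, name in enumerate(kept, 1))
        # Data rows run until a blank line or the next data_ block.
        while i < n and lines[i].strip() and not lines[i].startswith("data_"):
            fields = lines[i].split()
            # Only touch rows whose field count matches the header.
            if drop is not None and len(fields) == len(labels):
                del fields[drop]
            out.append(" ".join(fields))
            i += 1
    return out


# Tiny made-up example; the values are dummies, not real metadata.
example = [
    "data_",
    "",
    "loop_",
    "_rlnMicrographName #1",
    "_ucsfUid #2",
    "_rlnCoordinateX #3",
    "a.mrc 123 10.0",
    "b.mrc 456 20.0",
    "",
]
print("\n".join(drop_star_column(example, "_ucsfUid")))
```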

That fixed it! Thanks.

Hi Daniel, (@DanielAsarnow)

Having some trouble of my own with csparc2star for movie output/Bayesian polishing prep. Would appreciate you pointing out where I’ve messed up with it.

First problem is(/was?) that the exported *.cs file direct from CryoSPARC has a > before the path to the micrographs (i.e.: CS-projectName/exports/jobs/J4_export_live_exposures/>S1/motioncorrected/*traj.npy) which makes it fail with “file not found”. If I fix the path by renaming or symlinking S1 to >S1 the same error occurs (not surprising, Linux doesn’t like > in the path), but if I fix the path(s) in the *.cs file, a different error occurs. Editing all occurrences of >S1 to just be S1, a numpy error occurs - ValueError: cannot reshape array of size 12930 into shape (13041,). 13041 is the total micrographs in the dataset, 12930 are the total accepted micrographs. The same thing happens whether I feed csparc2star the all_exposures or accepted_exposures .cs file.

This is with the latest CryoSPARC (4.3.1) although I always remember having the > from any csparc2star output, I’ve just always replaced it with sed because I’ve never tried this polishing trick before.
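
For the path part only, the sed-style cleanup can be sketched as a small function that drops a leading > from each path component. The function name and the example file name are hypothetical, and this does nothing about the reshape error.

```python
def strip_gt_prefix(path):
    """Remove a leading '>' from every component of a '/'-separated path."""
    return "/".join(part[1:] if part.startswith(">") else part
                    for part in path.split("/"))


# Hypothetical file name, following the layout quoted above.
print(strip_gt_prefix("J4_export_live_exposures/>S1/motioncorrected/example_traj.npy"))
```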

Thanks in advance for your help.