Bayesian Polishing of cryosparc-processed particles

stefan · July 23, 2019, 8:49pm

Hi!

I have run into a roadblock in switching between cryosparc and relion for the purpose of performing different processing steps on the same dataset. In particular, I am struggling to produce the correct star files as inputs for Bayesian Polishing. Please see my case below; I would really appreciate it if anyone has suggestions about good workflows or scripts to make the transition workable.

Example scenario:
I run import, patch motion correction, patch ctf estimation, particle picking and extraction, 2D classification, ab initio reconstruction and homogeneous refinement in cryosparc. I find it to be a fast and robust way of getting a first-pass reconstruction and a good particle set.

Then, I use csparc2star.py (thanks @DanielAsarnow !) to export the stack from the .cs file produced by the refinement and, with minor adjustments, import the particles into relion to harness the power of its 3D classifier, 3D auto-refine, beam tilt correction / ctf-refine.

Now, I would like to perform Bayesian Polishing, which requires (nominally, please correct me if I am wrong):
(1) a micrograph star file that points to motion-corrected mrc files with associated metadata files (is it possible to generate these for the cryosparc patch motion correction outputs?)
(2) a particle star file (e.g. from CtfRefine or 3DRefine)
(3) a PostProcess star file

Is it possible to use the cryosparc motion corrected micrographs in any way? Or is it easier to re-run the motion correction through the relion GUI, and then edit star files to point to the relion MotionCorr outputs rather than to the cryosparc ones? If so, will particle coordinates be compatible between the two outputs?

Thank you for reading and thank you for all past/future advice!

Stefan

olibclarke · July 23, 2019, 9:32pm

I think you will need to re-run motion correction through the relion GUI, as you say - the alignment data used in Patch Motion is not saved anywhere AFAIK, at least not in a format that relion can read. Re-extraction of particles should work fine, and it is easy enough to test: just run a class2D straight after re-extraction to confirm all is good.

Cheers
Oli

stefan · July 23, 2019, 9:48pm

Thanks Oli! Will give it a shot asap.
I figured that if there was a solution, you’d be the one to have it…

stefan · August 4, 2019, 12:02am

I’ve tried everything that comes to mind, with no luck…

I’m dealing with energy-filtered K3-acquired tiffs. If I compare the mrcs output from motion correction in cryosparc and relion, they are in the same orientation (no rotation or mirroring).

I’ve noted that csparc2star.py converts the cryosparc metadata (either from homogeneous refinement or class select) to a star file that reports X,Y coordinates that do not fit onto the image dimensions unless they are inverted (@DanielAsarnow, I’d be happy to share some images if you have interest in replicating the issue). With the axes inverted, I have tried extracting with all combinations of flipped x and y axes (none, x-only, y-only and both x and y axes flipped), with no success.
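A quick way to tell whether a set of coordinates "fits" the micrograph, before spending time on extraction, is to count how many picks fall inside the image. This is only a minimal sketch: the function name, the micrograph size (5760 x 4092) and the picks are made up for illustration.

```python
def fraction_in_bounds(coords, nx, ny):
    """Fraction of (x, y) pixel coordinates that fall inside an nx-by-ny micrograph."""
    inside = sum(1 for x, y in coords if 0 <= x < nx and 0 <= y < ny)
    return inside / len(coords)

# Hypothetical micrograph size and picks, for illustration only
nx, ny = 5760, 4092
picks = [(5000.0, 1200.0), (300.0, 4000.0), (5700.0, 100.0)]
swapped = [(y, x) for x, y in picks]   # same picks with the two columns exchanged

print(fraction_in_bounds(picks, nx, ny))     # 1.0
print(fraction_in_bounds(swapped, nx, ny))   # lower: some picks now fall outside
```

Note that a check like this can only catch a wrong x/y assignment on non-square micrographs. A wrongly flipped axis keeps every coordinate in bounds, so that case still needs a test extraction and 2D classification.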

I am wondering if this is an issue with how cryosparc or pyem handles energy-filtered K3 tiffs?

DanielAsarnow · August 5, 2019, 4:44am

Use --swapxy . Cheers.

stefan · August 7, 2019, 6:03am

Thank you for the advice; this is equivalent to swapping _rlnCoordinateX with _rlnCoordinateY in the star file, right?

DanielAsarnow · August 7, 2019, 7:05am

No, it’s not (and as you already found that produces incorrect results). Feel free to look inside if you want to know how it works.

apunjani · August 13, 2019, 2:43pm

Hi all,
Thanks @DanielAsarnow and @olibclarke for clarifications. CryoSPARC Patch Motion does in fact write out motion trajectories (as plain numpy array files in the job output dir) but Relion or other programs will not know how to read these.

Also the reason for the --swapxy (I believe - looking through pyem code) is that in cryoSPARC, we use the convention that 2D/3D arrays are always stored in C-order, with the fastest axis being the “x” axis, and we use a right hand coordinate system. That means that 2D arrays are stored with their slow (first) axis being Y, and second (fast) axis being X. Therefore the “shape” of a 2D array will be [ny,nx] and a 3D array shape will be [nz,ny,nx].
Vectors that describe a position (in 2D or 3D) also follow the right-hand convention, meaning that they are stored in standard geometry order, (x,y,z). So particle locations are stored as (x,y) pairs, while the shape of the micrograph is stored as [ny,nx] to maintain the conventions.
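To make the convention concrete, here is a small pure-Python sketch (the sizes and names are hypothetical, and this is not cryoSPARC code). It lays out a tiny "micrograph" in C-order, so the shape reads [ny, nx] while a position is still written as an (x, y) pair.

```python
nx, ny = 6, 4                  # hypothetical tiny micrograph: 6 pixels wide, 4 tall
flat = [0.0] * (nx * ny)       # pixels as laid out in memory, one row after another

def offset(x, y):
    """Memory offset of pixel (x, y) in C-order: x is the fast axis, y the slow one."""
    return y * nx + x

x, y = 5, 1                    # a position, stored in geometry order (x, y)
flat[offset(x, y)] = 1.0

# Viewed as a 2D array, the same buffer has shape [ny, nx] and is indexed [y][x]
rows = [flat[r * nx:(r + 1) * nx] for r in range(ny)]
shape = (len(rows), len(rows[0]))

print(shape)        # (4, 6)
print(rows[y][x])   # 1.0
```

A numpy array created with shape (ny, nx) behaves the same way, which is why the shape and the position vector appear to list the axes in opposite orders.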

DanielAsarnow · August 13, 2019, 5:28pm

@apunjani That’s all as I expected, but folks reported swapped coordinates sometimes and sometimes not (and at a certain point I reversed whether setting --swapxy turns the swap on or off). It seems the difference is whether movies or micrographs were originally imported, but I felt it was easier to add the option and try both than to figure it out.

user123 · January 8, 2020, 3:41pm

Hi Dan,
It seems like the Y axis needs to be inverted when taking particle coordinates from cryosparc to relion for re-extraction (particles were picked and extracted on imported mics in cryosparc 2.12.4, now I want to extract the same locations in Relion on a new set of mics). I used swapxy to make the star file (otherwise “particle lies completely outside micrograph”) and the newly extracted particles were junk. Upon viewing a single micrograph in relion_display vs. cryosparc, the y axis is flipped. Does that seem correct, and is there a quick command to flip Y for extraction?
Thanks!

user123 · January 8, 2020, 6:37pm

For anyone coming across the same problem, you can use awk as follows to invert the y axis:
awk '{$4=4092-$4; print $0}' in.star > out.star
Here $4 is the _rlnCoordinateY column and 4092 is the y axis boundary. Because the command is applied to every line, including the header, you’ll then need to manually delete the “4092” added to the header.
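If you would rather not clean up the header by hand, the same flip can be sketched in Python so that only data rows are touched. This is a rough sketch, not part of pyem or Relion: flip_y is a made-up helper, it assumes header labels of the form "_rlnCoordinateY #N", and it follows the ny - Y arithmetic of the awk command above.

```python
def flip_y(lines, ny, label="_rlnCoordinateY"):
    """Replace Y with ny - Y in data rows, leaving header lines untouched."""
    col = None
    out = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith(("_", "#", "data_", "loop_")):
            if fields and fields[0].startswith("data_"):
                col = None                               # numbering restarts per data block
            elif fields and fields[0] == label:
                col = int(fields[1].lstrip("#")) - 1     # "#4" means the 4th column
            out.append(line)                             # header or blank line: keep as-is
        elif col is not None and len(fields) > col:
            fields[col] = f"{ny - float(fields[col]):.6f}"
            out.append(" ".join(fields))
        else:
            out.append(line)
    return out

# Hypothetical three-column star file, for illustration only
star = """data_

loop_
_rlnMicrographName #1
_rlnCoordinateX #2
_rlnCoordinateY #3
mic001.mrc 100.0 92.0
mic001.mrc 250.5 4000.0
""".splitlines()

for line in flip_y(star, ny=4092):
    print(line)
```

Like the awk version, this rewrites each data row with single spaces between fields. The column is looked up from the header instead of being hard-coded as $4.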

stefan · April 4, 2020, 12:54am

Hi All!

I’ve finally gotten around to bringing this issue back from the grave; I am still having a hard time keeping particle coordinates straight in switching from cryoSPARC to Relion. I would sleep better at night if the wonderful community on here can help me crack this one…

I imported MotionCor2 processed micrographs into cryoSPARC and ran CTF estimation, particle picking and multiple steps of classification. Now I want to re-extract the particles in Relion and have resorted to csparc2star.py, feeding it the particles and passthrough files from a homogeneous refinement.

As @user123 pointed out, it is necessary to run csparc2star.py with the --swapxy flag (otherwise you end up with coordinates that are out of bounds), and the y-axis needs to be flipped as well, as illustrated by the two images below.

The left image is from the MotionCor2 output, and the right image is the same micrograph imported and picked on in cryoSPARC. Ignoring the global differences stemming from the grayscale range and filtering, the two contaminant features nicely show that the y-axis needs to be flipped. In my case, the images are 4x binned and the y-axis length is 2046, so I use awk to replace the rlnCoordinateY column with 2046 - rlnCoordinateY values.

To test whether the extraction targeted the intended particles, I re-import the extracted particle stack into cryoSPARC and run a 2D classification, which I compare with a 2D classification of the original particle set. Much to my disappointment, what comes back is not the clean particle set that I thought I was extracting in Relion. I have tried resetting the refined offsets, as well as re-centering the particles to (0,0,0), but to no avail.

What else could be going wrong?

apunjani · April 9, 2020, 5:08pm

Hi @stefan,

Thanks for your detailed post.
One simple question to start: did you already try performing the relion extraction without flipping the y-axis values, but only with --swapxy ? I ask because it’s likely that the Motioncor2 image display draws the y axis increasing downwards, while cryoSPARC draws x increasing to the right and y increasing upwards (i.e. right handed coordinate system). In this case the image looks flipped in the image display, but in terms of raw-on-disk order, it’s actually the same, and so the y coordinates of picks should not be flipped.

In general, the convention for x, y, z that cryoSPARC uses is the same as what Relion uses for particle picking (and also for shifts, poses, etc).
The one case where there is a difference (and this does not seem to be applicable in your case) is with .tiff files.

Motioncor2 and Relion read the raw-on-disk .tiff file, and then flip the order of “rows” (y-axis) of pixels.
CryoSPARC reads .tiff files from raw-on-disk order and does not change the order of rows.
Thus, when motion correction is applied to .tiff files in Motioncor2, the output micrographs are raw-on-disk reverse of when motion correction is applied to .tiff files in cryoSPARC. But in your case, you imported the motion corrected micrographs into cryoSPARC, so this is not applicable - in your case the cryoSPARC x and y axes are the same as what Relion uses during extraction (to my knowledge).
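As a toy illustration of the .tiff case (pure Python, made-up sizes, not code from either program): reversing the order of rows moves a feature at row y to row ny - 1 - y, while its x position stays the same.

```python
def flip_rows(image):
    """Reverse the order of rows (the y axis) of an image stored as a list of rows."""
    return image[::-1]

ny, nx = 4, 6                           # hypothetical tiny image, shape [ny, nx]
image = [[0] * nx for _ in range(ny)]
x, y = 2, 0
image[y][x] = 1                         # mark one "particle" pixel

flipped = flip_rows(image)
new_y = ny - 1 - y                      # row index of the same pixel after the flip

print(flipped[new_y][x])                # 1
print(flip_rows(flipped) == image)      # True: flipping twice restores the original
```

For integer pixel indices the flipped row is ny - 1 - y. Whether a coordinate file should use ny - y or ny - 1 - y depends on where each program places the pixel origin, which this sketch does not settle.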

stefan · April 10, 2020, 8:07am

Hi @apunjani,

Thank you for the fast and detailed response!

In brief, you are right. The apparent flip in the ordinate is just due to how relion_display and cryoSPARC define its direction.

I was prompted to consider flipping the axis because my initial “extraction” of particles was not refining to sensible 2D classes. I was using the current sbgrid version of pyem, which turns out to be quite out of date.

I have not (yet) investigated the underlying reason, i.e. don’t quote me, but since updating pyem, the particle extraction seems to be working properly. Sorry for stirring the pot when I had not done due diligence on my end, and thank you once again for looking into it.

Cheers,
Stefan

lalmagor · November 19, 2020, 3:37am

Hi,
I would like to use Relion’s Bayesian Polishing feature on Particles extracted from a Cryosparc Ab-initio reconstruction using csparc2star.py. After reading the above conversations I am a bit confused and I am not sure if the mentioned required steps are relevant to the current versions of the programs.

If there is anyone experienced in following a similar path, I would appreciate it if you could let me know whether the following strategy I am planning to use will work:

(1) run motion correction in Relion with the original .tiff movies
(2) use csparc2star.py on the Cryosparc .cs particle file with the --swapxy option to create a particle .star file for Relion.
(3) generate an ab-initio model, refine it, and post-process in Relion using the converted .star file.
(4) Run Bayesian Polishing using the Relion motion correction .star, the imported particles .star, and the post-process .star files.

Thanks,
Lior

user123 · November 19, 2020, 4:16am

Hi Lior,
Yes, this looks like a good pathway. You may want to test your particle coordinates before running the polish job to make sure they hit the actual particles. In some instances you will need to invert the y axis. I usually import the new star file and make a small subset of 1k particles, extract and run 2D classification to make sure these are my real particles. If the classes are junk, then you need to reassess the coordinates and whether or not the y axis should be inverted.
Also, I believe bayesian polishing will give best results if you use relion_reconstruct to make half maps for post-process, rather than use the cryosparc half maps. So the pathway would be csparc2star, subset select into 2 random subsets, relion reconstruct for each subset, then postprocess and polish.
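The two selections mentioned above (a small random test subset, and a split into two random halves) can be sketched in plain Python as follows. This is just an illustration of the idea with hypothetical helper names and placeholder rows; it does not read or write star files, and Relion's own subset tools are the more usual route.

```python
import random

def random_subset(rows, n, seed=0):
    """Return n randomly chosen particle rows, kept in their original order."""
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(rows)), min(n, len(rows))))
    return [rows[i] for i in keep]

def split_in_two(rows, seed=0):
    """Randomly split particle rows into two disjoint halves."""
    rng = random.Random(seed)
    order = list(range(len(rows)))
    rng.shuffle(order)
    half = len(rows) // 2
    first, second = sorted(order[:half]), sorted(order[half:])
    return [rows[i] for i in first], [rows[i] for i in second]

# Placeholder rows standing in for the data lines of a particle star file
rows = [f"particle_{i:05d}" for i in range(5000)]

test_set = random_subset(rows, 1000)
half1, half2 = split_in_two(rows)
print(len(test_set), len(half1), len(half2))   # 1000 2500 2500
```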
Best,
Aaron

lalmagor · November 19, 2020, 5:34pm

Hi Aaron,
Thanks for your answer.
I usually use relion_star_handler to remove the rlnRandomSubset column from my Cryosparc converted particle .star file and let Relion regenerate it. I assume that should work for the bayesian polishing as well?
Best,
Lior

user123 · December 1, 2020, 6:35pm

I think that should be fine. Did you get it to work?

jaremko · February 6, 2021, 9:23pm

Hi @apunjani,

I have an issue with case 2 here. Is there an easy way to flip the y axis?

Thanks,
Matt

DanielAsarnow · February 7, 2021, 2:09am

csparc2star.py has --flipx and --flipy in addition to --swapxy. Note these arguments act on the normalized coordinates used by cryoSPARC; it’s not the same as swapping X and Y after converting to pixel coordinates.
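To see why the two are not the same, here is a simplified sketch. It is not the actual pyem code, and to_pixels with its swapxy and flipy parameters is made up for illustration. It assumes particle centers are stored as fractions of the micrograph size alongside a [ny, nx] shape, and the real implementation may apply the options differently.

```python
def to_pixels(frac_x, frac_y, shape, swapxy=False, flipy=False):
    """Convert fractional (0..1) centers to pixels for a micrograph of shape [ny, nx].

    Hypothetical helper: the options are applied while coordinates are still normalized.
    """
    ny, nx = shape
    if swapxy:
        frac_x, frac_y = frac_y, frac_x
    if flipy:
        frac_y = 1.0 - frac_y
    return frac_x * nx, frac_y * ny

shape = (4092, 5760)        # hypothetical [ny, nx] of a non-square micrograph
fx, fy = 0.25, 0.5          # hypothetical normalized particle center

# Swap applied to the normalized values, then scaled to pixels
swapped_before = to_pixels(fx, fy, shape, swapxy=True)

# Scale to pixels first, then exchange the two pixel columns
x, y = to_pixels(fx, fy, shape)
swapped_after = (y, x)

print(swapped_before)       # (2880.0, 1023.0)
print(swapped_after)        # (2046.0, 1440.0)
```

On a non-square micrograph the two results differ, because each normalized value ends up scaled by a different dimension. On a square micrograph they would coincide, which may be part of why the problem is easy to miss.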