Re-extract particles in Relion from CryoSPARC

Hi all,

I apologize for creating yet another thread about this topic, but even after reading all the other threads I’m lost. I’ve managed to use csparc2star.py to reliably export particles for use in classification/refinement. Daniel’s guide for this on the pyem GitHub page is clear. That being said, I would very much like to re-extract my particles in Relion to re-center and normalize them. In some of the other threads, different flags are casually tossed out as things to try, but with little to no explanation.

I have done motion correction, patch CTF, picking, particle extraction, 2D classification, ab initio reconstruction, and various attempts at refinement in CryoSPARC.

I have run both motion correction (Relion’s implementation) and CTF correction in Relion. I am hoping someone can explain which flags I need to use with csparc2star.py (and ideally why) in order to generate a .star file that will allow for re-extraction in Relion.

Once again, I apologize if this is retreading old ground but I feel like an up to date explanation for this would be helpful to many.

Thanks!

Hi,

The reason there are so many hoops to jump through is that cryoSPARC treats TIFF-formatted data differently from most other 3DEM packages, e.g. RELION. In short, TIFF inputs experience an inversion along the y-axis when read by the latter but not the former.

Consider the scenario where motion correction of a set of TIFF movie stacks is performed separately in cryoSPARC and in RELION. The two sets of averaged micrographs will be mirror images of each other (mirrored about the x-axis).

The implications of this difference in TIFF handling are two-fold. Firstly, in order for particle locations (i.e. coordinates) to be valid when moving between software packages, their y-component has to be inverted. Secondly, because all re-extracted particle images inherit the y-inversion of the micrograph, some CTF parameters (e.g. astigmatism angle), and the Euler angles and translations that describe the relationship between a reference volume and the constituent particle image, have to be flipped appropriately.

The --inverty flag addresses the first of these two implications. The --flipy flag addresses the second. They operate independently. --swapxy was hard-coded into the metadata-handling some time in the past as it’s always required, so the flag is not really discussed these days.
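As a rough illustration of the coordinate part only, the y-inversion amounts to reflecting each pick about the micrograph height. The sketch below is not pyem's code: the function name `invert_y` and the 4092-pixel height are assumptions made for illustration, and the exact pixel convention (height - y versus height - 1 - y) may differ between packages. For what it's worth, two exports of the same particle posted later in this thread list Y as 399 and 3693, which sum to 4092.

```python
def invert_y(y, height):
    """Reflect a y-coordinate (in pixels) about the micrograph height."""
    return height - y


HEIGHT = 4092  # assumed micrograph height in pixels (illustrative)

x, y = 745, 399
print((x, invert_y(y, HEIGHT)))

# Applying the inversion twice returns the original coordinate, which is
# why an unwanted inversion can be undone by inverting again.
print(invert_y(invert_y(y, HEIGHT), HEIGHT))
```

Note that x is untouched; only the y-component changes.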

Which flags to use with csparc2star.py depends on the use case. Moving between the two packages doesn’t always necessitate transformation. For instance, if you’ve decided to import the stack of micrographs generated in RELION for particle-picking in cryoSPARC (rather than repeat MC in cryoSPARC), then there is no y-inversion or y-flip to account for when moving back to RELION. Or, if you’re going to repeat global alignment search following re-extraction, then strictly speaking, only coordinate treatment has to be considered.

On the other hand, if you’re working with e.g. EER or mrcs data, then both software packages treat them the same and no transformation is required.

There is an additional complication regarding the --inverty flag. csparc2star.py’s metadata-handling is coded to automatically apply a y-inversion to the coordinates. So you have to pass the flag explicitly if you do not want the coordinates to be inverted, i.e. to un-invert the inversion. I believe this convention came about because K3 input data, and therefore LZW-TIFF movie stacks, were the most common at the time.

##cryosparc2.py, line 73 
def cryosparc_2_cs_particle_locations(cs, df=None, swapxy=True, invertx=False, inverty=True):
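
To make the "un-invert the inversion" point concrete, here is a small sketch of how a default-on inversion with an opt-out flag can behave. It is an illustration only and not pyem's actual argument handling: the `export_y` helper, the fractional-coordinate input and the `store_false` wiring are all assumptions.

```python
import argparse


def export_y(y_frac, height, inverty=True):
    """Convert a fractional y-position to pixels, inverting by default."""
    if inverty:
        y_frac = 1.0 - y_frac
    return round(y_frac * height)


parser = argparse.ArgumentParser()
# Assumed wiring: passing the flag switches the default inversion OFF.
parser.add_argument("--inverty", dest="inverty", action="store_false")

default_run = parser.parse_args([])
flagged_run = parser.parse_args(["--inverty"])

# Without the flag the coordinate is inverted; with it, it is kept as stored.
print(export_y(0.25, 4000, default_run.inverty))
print(export_y(0.25, 4000, flagged_run.inverty))
```

The point is only that, with this kind of default, omitting the flag and passing the flag give different coordinates, so it is worth checking a few picks visually after export.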

This is my understanding of the current state of play as a fellow end-user. YMMV.

Cheers,
Yang

See here: Export from cryoSPARC v2 and later · asarnow/pyem Wiki · GitHub. You will want to use --inverty and --flipy in your case if you are re-extracting from the micrographs in RELION. This will preserve your coordinates, CTF information and alignments.

Hi Yang,

Thanks for this comprehensive response! I am indeed starting with EER data. If I am aiming to re-extract in Relion, taking locations from one of my refine jobs in CS, does the fact that I am starting from EER matter, or do I still need to apply transformations? I worry I’m overthinking things here, as Daniel does provide instructions, but I haven’t managed to get it to work.

I appreciate this, but I am absolutely aware of the GitHub wiki that Daniel provides. It is possible, and likely, that I am being a bit thick here, but I haven’t gotten this to work yet. I am going to restart from a clean slate and try to be a bit more meticulous with my steps.

Hi,

If I’m understanding your workflow correctly, your first action may be to consolidate a particle star file using an explicit --inverty flag. Remember, an explicit --inverty is necessary to maintain the coordinates rather than invert them. --flipy is not necessary as micrograph and particle images from both workflows should share a common orientation.

Once associated with your RelionCor-generated micrographs, re-extraction will maintain the CTF parameters inherited from the star file. I don’t think there’s a way to override this in a way akin to Force re-extract CTFs from micrograph in cryoSPARC, but it shouldn’t matter too much. Or if it does, then subsequent CTF refinement should be able to correct for it.

Also, to correct my previous comment, this isn’t technically true. Appropriate accounting of twofold astigmatism can influence outcomes in certain scenarios. However, that should also be correctable post hoc with CTF refinement.

Cheers,
Yang

You’re looking for the --keep_ctfs_micrographs flag in RELION preprocessing I think. It can be added as an additional argument on the Running tab. :slight_smile:

I see the instructions on the GitHub, but have had some trouble following all of them. There are two input files in the star.py command, and I don’t understand which file to use for the original_particles.star. Which file other than the output from csparc2star.py is needed?

In your case, there should not be an original particles.star file, you just need the particles.cs and the passthrough.cs from your job in cryosparc. The command should look like this:

csparc2star particles.cs passthrough.cs exported-particles.star

If you plan to re-extract in RELION from the MotionCor2 mics then you need --inverty. If you are planning to use CTF parameters from cryoSPARC on the MotionCor2 mics, then you need --flipy.

I had no issues with the csparc2star command and can import my particles into RELION and run 2D classification; it’s just the re-extraction to move forward with Bayesian polishing that I am struggling with.

I did not include the --inverty or --flipy flag when I ran the original command, but the 2D classes look correct. Does running the re-extraction change that?

I edited the resulting star file to have the correct paths (included below) but still get that there are no particles when I run the extraction job (I have run motion and CTF correction in RELION and am using those micrographs)

data_optics

loop_
_rlnVoltage #1
_rlnImagePixelSize #2
_rlnSphericalAberration #3
_rlnAmplitudeContrast #4
_rlnOpticsGroup #5
_rlnImageSize #6
_rlnImageDimensionality #7
_rlnOpticsGroupName #8
300.000000 0.899000 0.010000 0.100000 4 400 2 opticsGroup4

data_particles

loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnRandomSubset #15
_rlnClassNumber #16
_rlnOpticsGroup #17
000001@J458/extract/000004238318968594446_FoilHole_28021190_Data_27992682_27992684_20230928_105436_fractions_patch_aligned_doseweighted_particles.mrcs MotionCorr/job002/Micrographs/000004238318968594446_FoilHole_28021190_Data_27992682_27992684_20230928_105436_fractions_patch_aligned_doseweighted.mrc 745 399 -95.782959 30.050669 69.062653 0.205800 0.175261 18375.367188 18296.585938 -41.837124 0.000000 0.000000 2 1 4

This is my issue as well. I am being told there are no particles when I try to extract in Relion. I noticed the _rlnMicrographName was pointing to files that didn’t actually exist, so I tried altering the entries with sed to reflect the names of the actual files in the Movies directory, but Relion still says there are no particles.
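
For anyone attempting the same kind of renaming, here is a hedged Python sketch of the idea, as an alternative to sed. It assumes the cryoSPARC decorations are a leading numeric UID plus underscore and a `_patch_aligned_doseweighted` suffix, as in the example path posted earlier in this thread; the helper name `to_relion_name` and the target directory are hypothetical and would need adapting to your own project.

```python
import re
from pathlib import PurePosixPath

# Assumed cryoSPARC decorations: a leading numeric UID plus underscore,
# and a "_patch_aligned_doseweighted" suffix before the extension.
UID_PREFIX = re.compile(r"^\d{10,}_")
CS_SUFFIX = "_patch_aligned_doseweighted"


def to_relion_name(cs_path, relion_dir):
    """Map a cryoSPARC-style micrograph path to a RELION-style one."""
    p = PurePosixPath(cs_path)
    stem = UID_PREFIX.sub("", p.stem)
    if stem.endswith(CS_SUFFIX):
        stem = stem[: -len(CS_SUFFIX)]
    return str(PurePosixPath(relion_dir) / (stem + p.suffix))


example = (
    "MotionCorr/job002/Micrographs/000004238318968594446_FoilHole_28021190"
    "_Data_27992682_27992684_20230928_105436_fractions"
    "_patch_aligned_doseweighted.mrc"
)
print(to_relion_name(example, "MotionCorr/job002/Micrographs"))
```

Whatever tool does the renaming, the resulting names should be compared against the micrograph STAR file that the Extract job actually reads, not just against the files on disk.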

EDIT: I’m wondering if this could be down to mismatches in opticsGroup? When I used csparc2star.py to obtain my particle metadata, I saw that the opticsGroup was set to 2 in the CS data. So when I imported my data and ran motion correction in Relion, I set the opticsGroupName to 2, but it appears that only altered the name, not the actual value. I tried using awk to change the actual values to 2 in order to match my particle metadata, but Relion still just ran quickly and output:

“Joining metadata of all particles from 15001 micrographs in one STAR file…
The pixel size of the extracted particles in optics group 1 is 0.59 Angstrom/pixel.
Written out STAR file with 0 particles in Extract/job005/particles.star
Done preprocessing!”

So it seems to have ignored that.

For completeness, the headers and first entry of my .star file are below. I used the --inverty --flipy --strip-uid --micrograph-path flags.

data_optics

loop_

_rlnVoltage #1
_rlnImagePixelSize #2
_rlnSphericalAberration #3
_rlnAmplitudeContrast #4
_rlnOpticsGroup #5
_rlnImageSize #6
_rlnImageDimensionality #7
_rlnOpticsGroupName #8
300.000000 0.590000 2.700000 0.100000 2 416 2 opticsGroup2

data_particles

loop_

_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnOpticsGroup #15
_rlnRandomSubset #16
_rlnClassNumber #17

000001@J1024/extract/FoilHole_3299146_Data_3297369_0_20251003_120448_EER_patch_aligned_doseweighted_particles.mrcs MotionCorr/job002/Movies/FoilHole_3299146_Data_3297369_0_20251003_120448_EER.mrc 2161 2014 173.733813 66.147917 90.224647 4.550140 -2.605511 24830.396484 24251.833984 -34.124832 0.000000 0.000000 2 1 1

Mine is optics group 4, but RELION does seem to recognize that. The final error message I get is “optics group “opticsGroup4” will be removed because no extracted particle belong to it.”

At the very least I’m happy to know I’m not the only person running into this. The GitHub makes it seem like this part of it should be very simple.

I think the issue is that the optics group in your relion micrograph star files does not match the optics group exported from cryosparc. Change them so that they match and see if that fixes it.

I did actually try that, but for some reason even with the corrected_micrographs.star edited to be opticsGroup 2 (in order to match my CS files), Relion seems to be reading it as opticsGroup 1. Not sure if there is another spot or another file I might need to edit to solve this.

It must be different in one of the columns or headers. Check the header in data_optics and check your optics group column in both the micrograph star file and the particles star file. It must be different in one of them.

I have also tried updating the optics group to match. I will put both down below:

The micrographs file from RELION’s motion correction:

version 50001

data_optics

loop_
_rlnOpticsGroupName #1
_rlnOpticsGroup #2
_rlnMtfFileName #3
_rlnMicrographOriginalPixelSize #4
_rlnVoltage #5
_rlnSphericalAberration #6
_rlnAmplitudeContrast #7
_rlnMicrographPixelSize #8
opticsGroup1 1 ../HK2_OSU_KRIOS_SEPT2023/mtf_K3_300kV.star 0.449500 300.000000 0.010000 0.100000 0.899000

version 50001

data_micrographs

loop_
_rlnCtfPowerSpectrum #1
_rlnMicrographName #2
_rlnMicrographMetadata #3
_rlnOpticsGroup #4
_rlnAccumMotionTotal #5
_rlnAccumMotionEarly #6
_rlnAccumMotionLate #7
MotionCorr/job002/Micrographs/FoilHole_27993640_Data_27992682_27992684_20230927_152839_fractions_PS.mrc MotionCorr/job002/Micrographs/FoilHole_27993640_Data_27992682_27992684_20230927_152839_fractions.mrc MotionCorr/job002/Micrographs/FoilHole_27993640_Data_27992682_27992684_20230927_152839_fractions.star 1 33.519103 0.000000 33.519103

My particles.star file:

data_optics

loop_
_rlnVoltage #1
_rlnImagePixelSize #2
_rlnSphericalAberration #3
_rlnAmplitudeContrast #4
_rlnOpticsGroup #5
_rlnImageSize #6
_rlnImageDimensionality #7
_rlnOpticsGroupName #8
300.000000 0.899000 0.010000 0.100000 1 400 2 opticsGroup1

data_particles

loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnRandomSubset #15
_rlnClassNumber #16
_rlnOpticsGroup #17
000001@J458/extract/000004238318968594446_FoilHole_28021190_Data_27992682_27992684_20230928_105436_fractions_patch_aligned_doseweighted_particles.mrcs MotionCorr/job002/Micrographs/000004238318968594446_FoilHole_28021190_Data_27992682_27992684_20230928_105436_fractions_patch_aligned_doseweighted.mrc 745 3693 -95.782959 30.050669 69.062653 0.205800 0.175261 18375.367188 18296.585938 -41.837124 0.000000 0.000000 2 1 1
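
With both files side by side, one check that may be worth running is whether the values in the particle rows actually occur in the micrograph rows, both for _rlnOpticsGroup and for _rlnMicrographName. In the two listings above the optics group is 1 in both, but the micrograph names follow different patterns. The sketch below uses a deliberately minimal STAR reader (it assumes one loop_ per data block and no spaces inside values) and shortened, hypothetical file contents; `star_column` and `missing_from_micrographs` are names invented for this example, not part of RELION or pyem.

```python
def star_column(text, block, label):
    """Return one column from the loop_ of a named STAR data block."""
    labels, values, active = [], [], False
    for line in (raw.strip() for raw in text.splitlines()):
        if line.startswith("data_"):
            active = line == "data_" + block
        elif not active or not line or line == "loop_" or line.startswith("#"):
            continue
        elif line.startswith("_"):
            labels.append(line.split()[0])
        else:
            values.append(line.split()[labels.index(label)])
    return values


def missing_from_micrographs(particles, micrographs, label):
    """Values of `label` used by particles but absent from micrograph rows."""
    have = set(star_column(micrographs, "micrographs", label))
    want = set(star_column(particles, "particles", label))
    return sorted(want - have)


# Trimmed, hypothetical stand-ins for the two files (names shortened).
PARTICLES = """
data_particles

loop_
_rlnMicrographName #1
_rlnOpticsGroup #2
MotionCorr/job002/Micrographs/0001_FoilHole_A_fractions_patch_aligned_doseweighted.mrc 1
"""

MICROGRAPHS = """
data_micrographs

loop_
_rlnMicrographName #1
_rlnOpticsGroup #2
MotionCorr/job002/Micrographs/FoilHole_A_fractions.mrc 1
"""

for label in ("_rlnMicrographName", "_rlnOpticsGroup"):
    print(label, missing_from_micrographs(PARTICLES, MICROGRAPHS, label))
```

An empty list for a label means every particle value was found among the micrographs; a non-empty list shows the entries that have no counterpart, which may be one reason an Extract job ends with 0 particles.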

So I did in fact manage to get my particles extracted after re-running the awk/sed commands that Claude cooked up to alter my files. My question now is - is this right!?

The refinement job I took the particle metadata from was using particles extracted from CS patch motion and CS patch CTF refined micrographs. I was looking at the example for Bayesian polishing, but I am re-extracting in order to try 2D/3D classification with re-centered/normalized particle images. I am only going to continue on to Bayesian polishing if the 3D class/refine with Blush pushes the model to where it needs to be.