OK, just confirmed we have your latest version of pyem, pyem/20210729. I did the export some time ago; maybe that was with an older version back then. Will try the new one; hopefully that will be enough for Relion. Will ask for more if there are still problems. Thanks Daniel
Hi Daniel,
I tried your latest version of pyem, it does produce files with helical tube ID. That is good, but I am getting completely different orientations with the new version compared to the older one (20210407). Rot, Tilt, and Psi are 6.466606, 62.377716, 167.797440 in the version of 20210729, and the previous version gave me -100.316521, 89.302757, 61.014317. Those angles were compatible with Relion, the new ones are not. Do you apply a transformation now and if yes, what kind? Is it possible to go back to the previous version with angles?
Forgot to tell you that when I tried to use csparc2star.py with exported cs file, it crashed:
csparc2star.py …/cryosparc/P59/exports/groups/P59_J214_particles/P59_J214_particles_exported.cs jnk_ltst.star
/opt/apps/PyEM/20210729/pyem/pyem/metadata.py:334: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  df[model[k]] = pd.DataFrame(np.array(
Columns must be same length as key
Traceback (most recent call last):
  File "/opt/apps/PyEM/20210729/pyem/csparc2star.py", line 42, in main
    df = metadata.parse_cryosparc_2_cs(cs, passthroughs=args.input[1:], minphic=args.minphic,
  File "/opt/apps/PyEM/20210729/pyem/pyem/metadata.py", line 415, in parse_cryosparc_2_cs
    df = cryosparc_2_cs_model_parameters(cs, df, minphic=minphic)
  File "/opt/apps/PyEM/20210729/pyem/pyem/metadata.py", line 334, in cryosparc_2_cs_model_parameters
    df[model[k]] = pd.DataFrame(np.array(
  File "/opt/apps/PyEM/20210729/lib/python3.9/site-packages/pandas/core/frame.py", line 3597, in __setitem__
    self._setitem_array(key, value)
  File "/opt/apps/PyEM/20210729/lib/python3.9/site-packages/pandas/core/frame.py", line 3634, in _setitem_array
    check_key_length(self.columns, key, value)
  File "/opt/apps/PyEM/20210729/lib/python3.9/site-packages/pandas/core/indexers.py", line 428, in check_key_length
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
Required fields could not be mapped. Are you using the right input file(s)?
Without exporting everything works well.
Thanks,
Michael
Are you comfortable commenting out line 432 from pyem/pyem/metadata.py and giving it a test?
That function is currently copying the filament_pose field to rlnAnglePsi. If that’s the issue we’ll need to determine a condition for preferring the regular model pose and the one angle from filament_pose. (I suspect it is the Psi angle prior from filament tracing).
I will try that tonight if I can. I did not install pyem here myself (our sysadmin did), but I can hopefully copy the code to my area and try it. The angles are one issue, but the major one is still the filament parameters that are outside of the poses.
@mmclean, could you comment on equivalents of rlnHelicalTrackLengthAngst, rlnAnglePsiFlipRatio, rlnAngleRotFlipRatio, and rlnAnglePsiFlip in helical part of cryosparc?
Thanks,
Michael
Hi Daniel,
Done. It looks like I still get the same angles as before commenting that out.
New output:
data_optics
loop_
_rlnVoltage #1
_rlnSphericalAberration #2
_rlnAmplitudeContrast #3
_rlnOpticsGroup #4
_rlnImageSize #5
_rlnImagePixelSize #6
_rlnImageDimensionality #7
300.000000 2.700000 0.100000 7 440 1.100000 2
data_particles
loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnHelicalTubeID #15
_rlnOpticsGroup #16
_rlnRandomSubset #17
_rlnClassNumber #18
000001@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 235 2286 6.466606 62.377716 167.797440 -1.981132 -0.609071 26677.210938 26486.490234 240.004807 0.000000 0.000000 10863117171401233463 7 2 1
000013@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 989 1111 -4.214194 67.913658 179.628479 -1.975075 -2.031495 25757.007812 25566.287109 240.004807 0.000000 0.000000 6855374272098710089 7 1 1
000018@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 1100 951 119.069916 74.376167 62.575123 1.428970 -5.032431 25782.390625 25591.669922 240.004807 0.000000 0.000000 6855374272098710089 7 1 1
000032@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 1826 1044 -159.805939 65.202377 -36.752586 -1.624891 3.559670 25652.882812 25462.162109 240.004807 0.000000 0.000000 3780499217752641802 7 1 1
The old one:
data_optics
loop_
_rlnVoltage #1
_rlnSphericalAberration #2
_rlnAmplitudeContrast #3
_rlnOpticsGroup #4
_rlnImageSize #5
_rlnImagePixelSize #6
_rlnImageDimensionality #7
300.000000 2.700000 0.100000 7 440 1.100000 2
data_particles
loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnHelicalTubeID #15
_rlnOpticsGroup #16
_rlnRandomSubset #17
_rlnClassNumber #18
000001@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 235 2286 6.466606 62.377716 167.797440 -1.981132 -0.609071 26677.210938 26486.490234 240.004807 0.000000 0.000000 10863117171401233463 7 2 1
000013@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 989 1111 -4.214194 67.913658 179.628479 -1.975075 -2.031495 25757.007812 25566.287109 240.004807 0.000000 0.000000 6855374272098710089 7 1 1
000018@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 1100 951 119.069916 74.376167 62.575123 1.428970 -5.032431 25782.390625 25591.669922 240.004807 0.000000 0.000000 6855374272098710089 7 1 1
000032@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 1826 1044 -159.805939 65.202377 -36.752586 -1.624891 3.559670 25652.882812 25462.162109 240.004807 0.000000 0.000000 3780499217752641802 7 1 1
And the pyem version of 20210407 gives me:
data_optics
loop_
_rlnVoltage #1
_rlnSphericalAberration #2
_rlnAmplitudeContrast #3
_rlnOpticsGroup #4
_rlnImageSize #5
_rlnImagePixelSize #6
_rlnImageDimensionality #7
300.000000 2.700000 0.100000 7 440 1.100000 2
data_particles
loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnOpticsGroup #15
_rlnRandomSubset #16
_rlnClassNumber #17
000001@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 235 2286 -100.316521 89.302757 61.014317 -1.981132 -0.609071 26677.210938 26486.490234 240.004807 0.000000 0.000000 7 2 1
000013@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 989 1111 -127.449982 93.506439 56.392681 -1.975075 -2.031495 25757.007812 25566.287109 240.004807 0.000000 0.000000 7 1 1
000018@J170/extract/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted_particles.mrc J12/motioncorrected/5362876772985698_FoilHole_10032476_Data_10008448_10008450_20200812_183916_Fractions_patch_aligned_doseweighted.mrc 1100 951 110.384628 80.877838 53.889832 1.428970 -5.032431 25782.390625 25591.669922 240.004807 0.000000 0.000000 7 1 1
Michael
Hi @mbs,
Apologies for the delay – I can comment on what fields would be equivalent. Please note that this is based on the descriptions available here; I have not (yet) worked through this workflow of exporting segments to RELION and performing subsequent refinements in it. Also see my comment here for a description of all other fields.
- filament/position_A: the position of the particle along the filament contour, in Angstroms; should be equivalent to rlnHelicalTrackLengthAngst.
- filament/filament_uid: This associates each particle with an ID of a unique filament, similar to rlnHelicalTubeID. To the best of my knowledge, this should map one-to-one to rlnHelicalTubeID. The main reason for this parameter is for filament-based half-set splitting during refinement, which I believe is also the case in RELION.
- filament/filament_pose: Approximate in-plane rotation angle (radians) between helical axis and x-axis. It should be possible to convert this to rlnAnglePsiPrior, but there may be convention differences. Likely, testing is needed to ensure the conversion is correct.
The orientation parameters are the same as in alignments3D/{pose,shift}. CryoSPARC doesn’t use priors over any orientation parameters except for possibly tilt; thus there is no equivalent to rlnAnglePsiFlipRatio, rlnAngleRotFlipRatio, and rlnAnglePsiFlip. Does RELION prescribe a default value for these fields that generates an effectively uniform prior? And if not, I’m curious if RELION allows for running helical refinements without priors over psi, rot, or tilt…
Best,
Michael
Hi @DanielAsarnow,
Thanks for the work on adding filament support to csparc2star. The filament_pose field is indeed obtained from tracing. It should probably be used to define the center of the psi prior (rlnAnglePsiPrior, but see my above comment), rather than the actual psi angle. The orientation parameters themselves (rlnAnglePsi, rlnAngleRot & rlnAngleTilt) should always be converted from the axis-angles in alignments3D/pose.
Best,
Michael
Thank you Michael, so we have at least one more variable available for conversion to star. That is great; I could try to modify star.py in Daniel’s script and maybe get rlnHelicalTrackLengthAngst incorporated into the conversion. Daniel is already putting rlnHelicalTubeID into the star file, and also rlnAnglePsi, which as you said may need to go to the Prior instead of just Psi.
Daniel, it looks like you are adding 90 deg to the angles during conversion; is there a reason for that?
It looks like Relion wants at least PSI_PRIOR_FLIP_RATIO to be present in star file:
if ( ( (is_3D_data) && (!MD.containsLabel(EMDL_ORIENT_ROT)) )
|| (!MD.containsLabel(EMDL_ORIENT_TILT))
|| (!MD.containsLabel(EMDL_ORIENT_PSI))
|| (!MD.containsLabel(EMDL_ORIENT_ORIGIN_X_ANGSTROM))
|| (!MD.containsLabel(EMDL_ORIENT_ORIGIN_Y_ANGSTROM))
|| ( (is_3D_data) && (!MD.containsLabel(EMDL_ORIENT_ORIGIN_Z_ANGSTROM)) )
|| (!MD.containsLabel(EMDL_ORIENT_TILT_PRIOR))
|| (!MD.containsLabel(EMDL_ORIENT_PSI_PRIOR))
|| (!MD.containsLabel(EMDL_PARTICLE_HELICAL_TUBE_ID))
|| (!MD.containsLabel(EMDL_PARTICLE_HELICAL_TRACK_LENGTH_ANGSTROM))
|| (!MD.containsLabel(EMDL_ORIENT_PSI_PRIOR_FLIP_RATIO))
|| ( (do_auto_refine) && (!MD.containsLabel(EMDL_PARTICLE_RANDOM_SUBSET)) ) )
REPORT_ERROR("helix.cpp::updatePriorsForHelicalReconstruction: Labels of helical prior information are missing!");
I can do some testing with my cs file.
Thanks,
Michael
Hi Daniel and Michael,
I added two columns to the output, rlnAnglePsiPrior, and rlnHelicalTrackLengthAngst as discussed. And generated rlnAnglePsiFlipRatio column with all 0.5 values. I saw these values in a star file generated by Relion while corresponding data was not used. Without all three additional columns Relion failed to refine, even with Rot, Tilt, and Psi columns renamed to "Prior"s. Got another error message “helix.cpp::updatePriorsForOneHelicalTube(): Helical segments do not come from the same subset!”
Not sure why since cryosparc should generate subsets accordingly. It looks like Relion people are reluctant to comment on the conversion, my next attempt would be to use cryosparc coordinates to box out segments in relion and try to refine after that. I might lose helical parameters in relion that way as well, but that is probably the only way to test if anything works with the export.
Thanks,
Michael
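One way to chase the "Helical segments do not come from the same subset!" message is to check the converted table directly. The sketch below is only an illustration, not Relion's own logic: it assumes a tube is identified by micrograph name plus rlnHelicalTubeID, takes the rows as plain dicts keyed by STAR labels, and uses made-up values. The helper name tubes_with_mixed_subsets is hypothetical.

```python
from collections import defaultdict


def tubes_with_mixed_subsets(rows):
    """Return {(micrograph, tube_id): sorted subsets} for tubes in >1 subset.

    rows: iterable of dicts with the keys rlnMicrographName,
    rlnHelicalTubeID and rlnRandomSubset.
    """
    subsets = defaultdict(set)
    for row in rows:
        # Assumption: a tube is unique per (micrograph, tube ID) pair.
        key = (row["rlnMicrographName"], row["rlnHelicalTubeID"])
        subsets[key].add(row["rlnRandomSubset"])
    return {key: sorted(vals) for key, vals in subsets.items() if len(vals) > 1}


# Made-up rows: tube 1 on mic_a.mrc is split across both half-sets.
rows = [
    {"rlnMicrographName": "mic_a.mrc", "rlnHelicalTubeID": 1, "rlnRandomSubset": 1},
    {"rlnMicrographName": "mic_a.mrc", "rlnHelicalTubeID": 1, "rlnRandomSubset": 2},
    {"rlnMicrographName": "mic_a.mrc", "rlnHelicalTubeID": 2, "rlnRandomSubset": 2},
    {"rlnMicrographName": "mic_b.mrc", "rlnHelicalTubeID": 1, "rlnRandomSubset": 2},
]
mixed = tubes_with_mixed_subsets(rows)
print(mixed)
```

An empty result would suggest the half-set assignment is already consistent per tube, in which case the cause is probably elsewhere.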
Hi Daniel,
What is the progress on exporting particles extracted from helical segments in cryosparc?
I tried the following:
Particles picked in relion were exported to cryosparc and extracted there from patch motion corrected and ctf corrected images, and the structure was solved in cryosparc. I would now like to use the particles and the model from cryosparc for postprocessing and refinement in relion.
I used csparc2star.py on the particles from the final helical refinement and imported them into relion. Running 3D refinement produces a blobby map. To check whether the particles are correctly aligned, I tried to reconstruct the model from the imported particles:
relion_reconstruct --i models/from_P8_J183.star --ctf --nr_helical_asu 14 --helical_twist 179.51 --helical_rise 2.35 --o ~/Desktop/reconstruct2.mrc
but that produces dust.
The exported star file looks like this:
loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnAngleRot #5
_rlnAngleTilt #6
_rlnAnglePsi #7
_rlnOriginXAngst #8
_rlnOriginYAngst #9
_rlnDefocusU #10
_rlnDefocusV #11
_rlnDefocusAngle #12
_rlnPhaseShift #13
_rlnCtfBfactor #14
_rlnHelicalTubeID #15
_rlnOpticsGroup #16
_rlnRandomSubset #17
_rlnClassNumber #18
000002@J143/extract/000001633912468430048_FoilHole_13486221_Data_13286457_13286459_20210814_010439_fractions_patch_aligned_doseweighted_particles.mrcs J124/motioncorrected/000001633912468430048_FoilHole_13486221_Data_13286457_13286459_20210814_010439_fractions_patch_aligned_doseweighted.mrc 1270 4281 -149.390411 70.092552 -62.249847 -9.528861 -11.377374 27736.617188 27611.583984 50.509022 -1.759584 0.000000 1 2 1 1
Sjors pointed out that _rlnPhaseShift should be 0 if the data are from K3. So something may be wrong here.
Anyway, was anyone successful in exporting helical particles from cryosparc to relion, and what was the trick?
Best,
Witek
Hi @ketiwsim,
I did not get a useful export of helical segments, but my goal was different from yours: I was going to reprocess segments in Relion, and that did not work. Post-processing still might, but I did not try that.
Michael
I have never done any helical processing myself, but I previously consulted some experienced colleagues and implemented as much as I could. Just the priors and the helical track length are not converted at the moment; according to my friends it was enough to at least start reprocessing the same segments in Relion.
If you load the .cs files with numpy.load() you can look at the columns. Just let me know if you see any values that can be converted and I will add them!
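A minimal sketch of that inspection, assuming the .cs file is a NumPy structured array, as the numpy.load() suggestion implies. The file written here is a synthetic stand-in with zeroed values; only the filament/... field names come from this thread, and list_cs_fields is a hypothetical helper.

```python
import os
import tempfile

import numpy as np


def list_cs_fields(path, prefix=""):
    """Return the column (field) names of a .cs file, filtered by prefix."""
    data = np.load(path)
    return [name for name in data.dtype.names if name.startswith(prefix)]


# Synthetic stand-in for an exported .cs file (all values are zeros).
demo = np.zeros(2, dtype=[
    ("uid", "<u8"),
    ("filament/filament_uid", "<u8"),
    ("filament/filament_pose", "<f4"),
    ("filament/position_A", "<f4"),
])
demo_path = os.path.join(tempfile.mkdtemp(), "demo_particles.cs")
with open(demo_path, "wb") as fh:
    # Writing through a file handle keeps the .cs extension unchanged.
    np.save(fh, demo)

filament_fields = list_cs_fields(demo_path, prefix="filament/")
print(filament_fields)
```

On a real export, calling list_cs_fields(path) with no prefix lists every column, which is the quickest way to spot fields that are not yet converted.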
Hi all,
I also encountered the same problem when I was trying to use relion to process the cryosparc picked helical particles. But the output file doesn’t contain the tilt and psi priors. Here is a solution to it. Dr. Wen Jiang has a software jspr which can be downloaded here (jspr – The Jiang Lab). It includes a python script that can convert cryosparc cs file into relion star file while preserving all the tilt and psi priors as well as helical ID.
All these can be done with one single command:
images2star.py cryosparc.cs relion.star
It can automatically detect the passthrough file and incorporate the information.
Hope it helps.
Chen
Hi @Chen the jspr script just sets the tilt to 90 and the rotation to 0. Then psi prior is set to the negative of the filament pose (with an option to add 180˚).
The folks who tested csparc2star.py said the correct conversion of psi was the negative of the filament pose plus 90˚. Are you sure the jspr one is actually right?
Other than that the only real difference is rlnAnglePsi vs rlnAnglePsiPrior. I can add the fields mentioned by @mbs to save you the extra command to add them.
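To make the two conventions easy to compare, here is a small sketch of the conversion as described in this thread: the negative of filament_pose (radians, converted to degrees) plus an offset, 90˚ for the variant reported as correct for csparc2star.py, or 0˚/180˚ for the jspr variant. The wrap into [-180, 180) and the helper name filament_pose_to_psi_prior are assumptions made for illustration, not pyem's or jspr's code.

```python
import math


def filament_pose_to_psi_prior(pose_rad, offset_deg=90.0):
    """Convert filament_pose (radians) to a psi prior in degrees.

    Computes psi = -pose + offset and wraps the result into [-180, 180).
    The wrapping range is an assumption for this sketch.
    """
    psi = -math.degrees(pose_rad) + offset_deg
    return (psi + 180.0) % 360.0 - 180.0


# Same pose under the two conventions discussed above.
pose = math.pi / 4
print(filament_pose_to_psi_prior(pose))                  # negative pose plus 90 degrees
print(filament_pose_to_psi_prior(pose, offset_deg=0.0))  # negative pose only
```

Comparing both outputs against a psi angle from a segment that Relion refined well would be one way to settle which offset applies to a given dataset.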
I agree with Daniel, the actual problem was that Relion requires several other parameters to be able to run 2D classification on the imported segments, that info is missing in cryosparc. If you want to start from scratch, there is no problem there.
Hi all,
Just tried exporting some cryosparc picked filament coords into relion. It kind of works- although I’ve had to add a rlnAngleTiltPrior of 90 using relion_star_handler- and importantly for some applications, rlnHelicalTrackLengthAngst is missing. Is there any way to extract this currently? Perhaps via ‘filament/position_A’ as suggested by mmclean?
All the best,
Joe
Hi again- also- more troubling is that the rlnHelicalTubeIDs don’t seem to be quite right- for example sometimes there are more filaments on a micrograph than rlnHelicalTubeIDs… not sure why this is but it will cause issues in relion’s helical refinement…
Can you clarify? That wording sounds like some of the particles have been rejected at some stage, which is expected?
Hi Daniel-
This is just at the picking stage- and yes- particles have been rejected at the ‘inspect particles’ stage. The issue is- if there is a single long filament with excluded picks in the middle of it, giving two stretches of picks, what I’d need is for that filament to then be divided into two helical IDs, each for one stretch of picks (even if the same filament). This is needed in our case as we are working with relion scripts for processing pseudohelical microtubules where the phi angle is unique and we use information from neighbouring particles (segments) to help alignment- therefore we don’t want particles far away from each other (separated by rejected picks) on the same microtubule to influence each other. We used to just pick separate stretches manually, but want to move away from this with the large amounts of data that can be collected nowadays!
All the best,
Joe
You’re seeing the expected behavior as these particles really do come from the same filament pick.
In general the particle order might not correspond with their sequence along the filament, so it’s not that straightforward to reassign IDs in the way you want. I think the following algorithmic sketch is close.
For each micrograph:
- Calculate pairwise distance matrix between picks
- Convert distance matrix to graph adjacency matrix via an appropriate distance threshold (e.g. double the particle radius)
- Find all disjoint subgraphs using a flood fill
- Assign each disjoint subgraph a new unique helical tube ID
See this math exchange Q for example. The first answer is basically describing the row scan flood fill algorithm. The other name for disjoint subgraphs is “connected components” - you can find lots of other examples online. This library call could probably do it in one line:
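The link target is not preserved here, so as a sketch of the steps listed above, here is a plain-Python version with an explicit flood fill. The function name, the (micrograph, x, y) input layout and the coordinates are made up for illustration; the distance threshold has to be chosen for your own segment spacing. scipy.sparse.csgraph.connected_components is one library routine that computes the same components, though whether it is the call meant above is an assumption.

```python
import math
from collections import defaultdict


def split_tubes_by_gaps(picks, max_dist):
    """Assign a new tube ID to each connected group of picks per micrograph.

    picks: list of (micrograph, x, y) tuples, in any order.
    max_dist: picks at most this far apart count as neighbours.
    Returns one integer tube ID per pick, in input order.
    """
    by_mic = defaultdict(list)
    for idx, (mic, _x, _y) in enumerate(picks):
        by_mic[mic].append(idx)

    tube_ids = [0] * len(picks)
    next_id = 1
    for members in by_mic.values():
        # Adjacency lists from thresholded pairwise distances.
        neighbours = {i: [] for i in members}
        for a, i in enumerate(members):
            for j in members[a + 1:]:
                dist = math.hypot(picks[i][1] - picks[j][1],
                                  picks[i][2] - picks[j][2])
                if dist <= max_dist:
                    neighbours[i].append(j)
                    neighbours[j].append(i)
        # Flood fill: every unvisited pick seeds a new component.
        seen = set()
        for start in members:
            if start in seen:
                continue
            seen.add(start)
            stack = [start]
            while stack:
                cur = stack.pop()
                tube_ids[cur] = next_id
                for nb in neighbours[cur]:
                    if nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
            next_id += 1
    return tube_ids


# Made-up picks: one filament on "a" with a gap between x=20 and x=100.
picks = [("a", 0, 0), ("a", 100, 0), ("a", 20, 0),
         ("b", 0, 0), ("a", 10, 0), ("a", 110, 0)]
new_ids = split_tubes_by_gaps(picks, max_dist=15)
print(new_ids)
```

Note that this groups purely by distance within a micrograph, so two different filaments that pass within max_dist of each other would be merged; grouping by the original tube ID first would avoid that.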