How to convert cs format to star format in cryosparc v2?

Hi everyone,
I was wondering if there are any scripts, like “pyem-master” written by Daniel Asarnow, to convert a .cs format file back to a star format file while keeping the x, y coordinates?

Thanks a lot in advance!

Best wishes,

xing

Hi Xing, pyem is the name of the project, and master is the name of the branch in the version control system. The specific program you are looking for is a command-line utility called csparc2star.py, which is part of that project.

csparc2star.py is compatible with cryoSPARC 0.6.5 and cryoSPARC 2+. In cryoSPARC 2, the metadata files have the extension .cs and are located in the job directories within the project directory. Sometimes you need to pass both the final-iteration .cs file for the particles and a second file, usually called “passthrough”; the second file is supplied with the --passthrough argument. When you need the micrograph name and x, y coordinates for re-extraction, another argument, --copy-micrograph-coordinates, lets you provide the original star file so that these fields can be copied from it.

If you search these forums, you will find some extensive discussion about it. There are also some short tutorials in the wiki on the pyem GitHub page.

Hello @DanielAsarnow,
I seem to be having a little trouble with pyem. I used the --passthrough argument; however, the output still asks for a passthrough file (please see below). Could you please let me know what I am doing wrong?

Thank you for your time!
Kellie

$ /data/CRYOSPARC/pyem/csparc2star.py --passthrough /data/CRYOSPARC/USERS/KELLIE/CRYOSPARC_projects/P1/J771/passthrough_particles.cs /data/CRYOSPARC/USERS/KELLIE/CRYOSPARC_projects/P1/J771/extracted_particles.cs from_cryosparc.star --loglevel debug
Detected CryoSPARC 2+ .cs file
Reading passthrough file
Particle passthrough detected
Concatenating passthrough fields: location/micrograph_uid, location/micrograph_path, location/micrograph_shape, location/center_x_frac, location/center_y_frac, ctf/type, ctf/accel_kv, ctf/cs_mm, ctf/amp_contrast, ctf/df1_A, ctf/df2_A, ctf/df_angle_rad, ctf/phase_shift_rad, ctf/scale, ctf/scale_const, ctf/cross_corr_ctffind4, ctf/ctf_fit_to_A, ctf/fig_of_merit_gctf, alignments3D/split, alignments3D/shift, alignments3D/pose, alignments3D/psize_A, alignments3D/error, alignments3D/error_min, alignments3D/resid_pow, alignments3D/slice_pow, alignments3D/image_pow, alignments3D/cross_cor, alignments3D/alpha, alignments3D/weight, alignments3D/pose_ess, alignments3D/shift_ess, alignments3D/class_posterior, alignments3D/class, alignments3D/class_ess, alignments2D/split, alignments2D/shift, alignments2D/pose, alignments2D/psize_A, alignments2D/error, alignments2D/error_min, alignments2D/resid_pow, alignments2D/slice_pow, alignments2D/image_pow, alignments2D/cross_cor, alignments2D/alpha, alignments2D/weight, alignments2D/pose_ess, alignments2D/shift_ess, alignments2D/class_posterior, alignments2D/class, alignments2D/class_ess, motion/type, motion/path, motion/idx, motion/frame_start, motion/frame_end, motion/zero_shift_frame, pick_stats/ncc_score, pick_stats/power, pick_stats/template_idx, pick_stats/angle_rad
Creating particle DataFrame from recarray
Directly copied fields: rlnDefocusAngle, rlnDetectorPixelSize, rlnCtfFigureOfMerit, rlnSphericalAberration, rlnAmplitudeContrast, rlnMicrographName, rlnCtfMaxResolution, rlnVoltage, rlnDefocusU, rlnPhaseShift, rlnDefocusV, rlnImageName, rlnMagnification
Converting normalized particle coordinates to absolute
Converted particle coordinates from normalized to absolute
Converting DEFOCUSANGLE from degrees to radians
Converting PHASESHIFT from degrees to radians
Collecting particle parameters from most likely classes
Columns must be same length as key
A passthrough file may be required (check inside the cryoSPARC 2+ job directory)
Columns must be same length as key
Traceback (most recent call last):
File "/data/CRYOSPARC/pyem/csparc2star.py", line 42, in main
df = metadata.parse_cryosparc_2_cs(cs, passthrough=args.passthrough, minphic=args.minphic)
File "/data/CRYOSPARC/pyem/pyem/metadata.py", line 349, in parse_cryosparc_2_cs
[cs[names[c]][i] for i, c in enumerate(cls)]))
File "/data/CRYOSPARC/miniconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 3116, in __setitem__
self._setitem_array(key, value)
File "/data/CRYOSPARC/miniconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 3138, in _setitem_array
raise ValueError('Columns must be same length as key')
ValueError: Columns must be same length as key

Hi @Kellie, what kind of job is it? Is there another passthrough file in the job directory you can try?

The error is caused by the passthrough and particles files having a different number of particles. That step is done via a straight concatenation so it’s not robust to the difference. Maybe I will change this to a real merge to avoid the problem.
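To illustrate the difference between a straight concatenation and a real merge, here is a minimal sketch in plain Python. It is not the pyem code. The records, the field values, and the use of a "uid" key to match rows are assumptions made for the example.

```python
# Hypothetical rows standing in for a particles .cs file and its passthrough
# file. All values are made up; matching on "uid" is an assumption.
particles = [
    {"uid": 101, "blob/idx": 0},
    {"uid": 102, "blob/idx": 1},
    {"uid": 104, "blob/idx": 2},
]
passthrough = [
    {"uid": 101, "location/center_x_frac": 0.25},
    {"uid": 102, "location/center_x_frac": 0.50},
    {"uid": 103, "location/center_x_frac": 0.75},
    {"uid": 104, "location/center_x_frac": 0.10},
]


def concat_by_position(left, right):
    """Pair rows by index; only valid when both tables have the same length and order."""
    if len(left) != len(right):
        raise ValueError("row counts differ: %d vs %d" % (len(left), len(right)))
    return [{**l, **r} for l, r in zip(left, right)]


def merge_by_uid(left, right, key="uid"):
    """Inner join: keep only the rows whose key appears in both tables."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


try:
    concat_by_position(particles, passthrough)
except ValueError as err:
    print("positional concatenation failed:", err)

merged = merge_by_uid(particles, passthrough)
print([row["uid"] for row in merged])
```

The positional version fails as soon as the two tables differ in length, while the key-based version simply drops the passthrough row that has no matching particle.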

Thank you @DanielAsarnow for your fast reply!
That job is an extraction job. The directory also contains a passthrough_micrographs.cs file; however, using that one ended with a length mismatch error (let me know if you would like to see the debug output for it).

I have also tried csparc2star.py on a different job (a local motion correction job) and I get exactly the same error.

Thank you for your time!
Kellie

First, make sure you’re tracking the release branch of pyem, as it is stable.

Second, the fastest way to deal with the issue for the extraction job is to convert the passthrough file itself to a star file, and then convert the particles file with --copy-micrograph-coordinates pointing at that star file instead of using --passthrough.
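The two-step workaround can be sketched as a small Python helper that only builds and prints the two command lines, without running them. The input and final output file names are taken from the command earlier in this thread. The intermediate name passthrough_particles.star is hypothetical, and the positional order (input, then output) is assumed from that same command, so check it against your installed pyem version.

```python
import shlex

# Path to the script; adjust to wherever pyem is installed (assumption).
SCRIPT = "csparc2star.py"


def passthrough_to_star(passthrough_cs, coords_star):
    """Step 1: convert the passthrough .cs file on its own to a star file."""
    return [SCRIPT, passthrough_cs, coords_star]


def particles_to_star(particles_cs, out_star, coords_star):
    """Step 2: convert the particles .cs file, copying the micrograph name
    and coordinates from the star file written in step 1."""
    return [SCRIPT, particles_cs, out_star,
            "--copy-micrograph-coordinates", coords_star]


steps = [
    # "passthrough_particles.star" is a hypothetical intermediate file name.
    passthrough_to_star("passthrough_particles.cs", "passthrough_particles.star"),
    particles_to_star("extracted_particles.cs", "from_cryosparc.star",
                      "passthrough_particles.star"),
]

for cmd in steps:
    # Print shell-quoted commands so they can be reviewed before running.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first makes it easy to confirm the paths before running anything inside the job directory.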