Csparc2star.py final update

Hi csparc2star.py users. I made a major change to csparc2star.py in what is very likely to be the last update given the upcoming built-in import/export tools.

The new version supports any number of .cs files as input. The syntax is csparc2star.py base.cs [aux1.cs] ... [auxN.cs] output.star. All previous command line options still exist, except for --passthrough, which has been removed. Particles are merged by UID, micrographs by micrograph name (path). The first file in the input list in which a field occurs gets priority.
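
To make the priority rule concrete, here is a minimal sketch of a merge by UID in plain Python. It is not the actual csparc2star.py code: the function name merge_by_uid, the use of dicts instead of NumPy records, the example field names, and the choice to drop particles missing from the base file are all assumptions made for illustration.

```python
def merge_by_uid(tables):
    """Merge record tables on "uid"; earlier tables win for shared fields.

    tables: list of tables, each a list of dicts with a "uid" key.
    The first table plays the role of base.cs, the rest are aux files.
    """
    merged = {}
    for rec in tables[0]:
        merged[rec["uid"]] = dict(rec)
    for table in tables[1:]:
        for rec in table:
            row = merged.get(rec["uid"])
            if row is None:
                # Assumption: particles absent from the base file are skipped.
                continue
            for field, value in rec.items():
                # setdefault only fills fields that are still missing, so the
                # first file in which a field occurs keeps priority.
                row.setdefault(field, value)
    return list(merged.values())


# Hypothetical records standing in for two .cs files.
base = [{"uid": 1, "blob/path": "J85/a.mrc"},
        {"uid": 2, "blob/path": "J85/b.mrc"}]
aux = [{"uid": 2, "blob/path": "J84/b.mrc", "alignments3D/shift": (1.0, -2.0)},
       {"uid": 3, "blob/path": "J84/c.mrc"}]

for row in merge_by_uid([base, aux]):
    print(row)
```

In this toy case the particle with UID 2 keeps its path from the base table and only gains the shift from the second table.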

As an example I just created downsampled particles, and then merged the angles and shifts from a previous (non-downsampled) refinement:
csparc2star.py J85/downsampled_particles.cs J84/cryosparc_P2_J84_005_particles.cs J85/downsampled_particles.star --boxsize 216
(216 is the original box size).

The latest version is now available in the release branch. Let me know if it works or if you have any questions.

Hi @DanielAsarnow,

Thanks for this update, and thank you very much for your time and energy in making csparc2star for the community!
As you mentioned, import/export utilities are in the works. We will be in touch with you if star file questions arise.

Thanks again!
Ali

Hi, I just saw the release of v2.9.0. There are still no import/export utilities, correct?

I am getting this error:
“Columns must be same length as key”

when using csparc2star.py

Can anyone help me?
Thank you

Please use the most recent version of the release branch, and follow the instructions in the first post of this thread. If it still doesn’t work, post your whole command and the stack trace from the error and we’ll figure it out.

Oh thanks so much for answering so soon.
I will try.

Hi,

This script runs for me, but the output.star file doesn’t have the X, Y coordinates. Do you know where the X, Y coordinates are read from? Is it the particle .cs file or the passthrough file?

Thanks!

If you picked and extracted in cryosparc, then coordinates are in the passthrough.cs or your extraction or picking .cs files.

Here’s a snippet you can put in a script, say npinfo.py, and use as a command-line program to dump the names of the fields in a .cs file.

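#!/usr/bin/env python
# Print the number of records and the field names in a .cs file.
import sys
import numpy as np
a = np.load(sys.argv[1])  # np.load reads the .cs file as a NumPy array
print(a.shape)
print(a.dtype.names)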
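
Building on the same idea, a small variation can check several .cs files at once for fields with a given prefix, which helps when looking for the file that carries the coordinates. This is only a sketch: the script name csfields.py and the helper fields_with_prefix are made up here, and the "location/" prefix in the usage line is an assumption about how cryoSPARC names the coordinate fields, so check it against the actual output of npinfo.py.

```python
#!/usr/bin/env python
# Usage (hypothetical): csfields.py location/ particles.cs passthrough.cs
import sys


def fields_with_prefix(names, prefix):
    """Return the field names that start with prefix, in original order."""
    return [n for n in names if n.startswith(prefix)]


def main(argv):
    import numpy as np  # imported here so the helper above works without it
    prefix = argv[1]
    for path in argv[2:]:
        # dtype.names is None for arrays without named fields.
        names = np.load(path).dtype.names or ()
        hits = fields_with_prefix(names, prefix)
        print(path, hits if hits else "no matching fields")


if __name__ == "__main__" and len(sys.argv) > 2:
    main(sys.argv)
```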

Naduni seems to have identified an issue… “When you are running a refinement job (from a previous refinement or previous ab initio) it is important that you use (drag and drop) the particles from a particular “class” or classes and not “all classes”. For example, drag and drop JXXX_particles_classX and not JXXX_particles_all_classes, even if you have only one class. If your refinement job was already run with the JXXX_particles_all_classes, the conversion script might not work for you (you may get this error - “Columns must be same length as key”). If so, then re-run the refinement job with JXXX_particles_classX and the resulting .cs files should work.”

Hi Philip,
That information is out of date. You can provide any number of .cs files and csparc2star.py will match the particles based on their unique particle ID. The details are described in the first post of this thread.

@spunjani or @ali.h
Could we sticky this thread? I’ve been needing to refer people back to it frequently and it will make it easier for them to find. Thanks.

Hi Daniel,
thank you for your help. I have tried everything I could, including installing the latest pyem version and reading all instructions. However, I can’t get around this issue:
(pyem) [marino_j@merlin-l-01 J234]$ …/…/…/…/…/…/…/user/marino_j/pyem/csparc2star.py particles_selected.cs passthrough_particles_selected.cs particles_selected.star
Columns must be same length as key
A passthrough file may be required (check inside the cryoSPARC 2+ job directory)

and when I try only with one:
(pyem) [marino_j@merlin-l-01 J234]$ …/…/…/…/…/…/…/user/marino_j/pyem/csparc2star.py particles_selected.cs particles_selected.star
Defocus values not found
Angular alignment parameters not found

These are files from a selection job after 2D classification.
Any help is highly appreciated. Sorry to bother you!

If it is asking for a passthrough file, I think you don’t have the latest version of pyem (that flag is gone in the latest version). What does the output of csparc2star.py -h look like? This is what I get for the latest version:

Dear Oli, thank you. I am fairly sure the version of pyem is correct. What other problem could it be?

Colleague of @marino-j here. We’re using the master branch of pyem (04974f9). Looking at the code it looks like the message about passthrough files might be printed for lots of parse errors, so that might be a red herring?

Can we provide any other info to aid debugging?

Hmm - I tried with particles from select2d on my end and it works fine, using the same command that you used.

What workflow did you use prior to Class2D? Patch Motion, Patch CTF, etc?

Cheers
Oli

Yes, patch motion and patch CTF. These particles were extracted from coordinates and accepted micrographs that come from different workspaces. Could this be an issue? It should not matter, I think.
Thank you for your help

I wouldn’t have thought so, but maybe? Easy to test - do you have an example where the whole workflow is in a single workspace, and does the conversion work for that case?

Cheers
Oli

Indeed it works when the files come from a single workspace.

So if you have different datasets, a solution is to create .star files from the individual workspaces (e.g. from the individual Select 2D jobs), merge them in RELION, change the extension of the images in the respective JX/extract folders from .mrc to .mrcs, and change .mrc to .mrcs in the merged star file… correct?
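
If one does go this route, the star-file part of the last step could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe: the function names mrc_to_mrcs and convert_star are made up, and the code assumes particle references are written in the usual index@path form, so that micrograph paths (which have no "@") are left untouched.

```python
import re

# Matches a particle reference such as 000001@J12/extract/stack.mrc and
# captures everything before the ".mrc" suffix. The lookahead keeps names
# that already end in ".mrcs" from being changed again.
_PARTICLE_MRC = re.compile(r"(\S+@\S+)\.mrc(?=\s|$)")


def mrc_to_mrcs(line):
    """Rewrite index@path.mrc to index@path.mrcs within one star-file line."""
    return _PARTICLE_MRC.sub(r"\1.mrcs", line)


def convert_star(src, dst):
    """Copy the star file src to dst, renaming particle stack extensions."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(mrc_to_mrcs(line))


# Hypothetical data line with a particle reference and a micrograph path.
print(mrc_to_mrcs("000001@J12/extract/a_particles.mrc J3/a.mrc 1.0"))
```

Writing to a new file rather than editing in place keeps the merged star file intact in case the pattern does not fit your data.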