Latest instructions to convert cs file to star file

Hi - can someone comment on the latest instructions for using pyem to convert a .cs file (extracted particles in cryoSPARC) to a .star file for the latest Relion release?

A step-by-step procedure would be very helpful. Many thanks!

I’m sure someone will post a full step-by-step, there are some pros on here. In short, it’s pretty easy these days. Download the latest pyem from GitHub. Go to the “Outputs” tab in the cryoSPARC browser interface; there should be an option to export files from jobs. This writes a .cs and a .csg file that contain all the important information for a particle set (2D selection, 3D class, refinement, etc.), including the Euler angles. Mine is written to /cryosparc/PXX/groups/exports/JXXXXXX.cs or similar. Run csparc2star.py on the .cs file. The new particle .star file is ready, except that its particle paths still point to the old cryoSPARC location. Use vi or similar to edit the star file so that the paths resolve to your Relion particle locations. Relion will probably have to convert the header to the new 3.1 format, and you may want to set up optics groups or similar, but this star file should be able to generate a similar volume with a refinement that uses local searches only.
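For the "redirect the star file" step, here is a minimal Python sketch as an alternative to editing by hand. It assumes the image column holds tokens of the form index@path (as in _rlnImageName) and that the old and new prefixes, J1162/ and Particles/J1162/ in the usage below, are placeholders for your own locations. The function names are made up for illustration; this is not part of pyem.

```python
def _swap_prefix(token, old_prefix, new_prefix):
    """Replace old_prefix at the start of the path in an 'index@path' token."""
    if "@" not in token:
        return token
    index, path = token.split("@", 1)
    if path.startswith(old_prefix):
        path = new_prefix + path[len(old_prefix):]
    return index + "@" + path


def redirect_image_paths(star_text, old_prefix, new_prefix):
    """Return star_text with every 'index@path' token pointed at new_prefix."""
    out_lines = []
    for line in star_text.splitlines():
        tokens = line.split()
        # Label lines start with '_' and never carry particle paths.
        if not line.startswith("_") and any("@" in t for t in tokens):
            tokens = [_swap_prefix(t, old_prefix, new_prefix) for t in tokens]
            line = " ".join(tokens)
        out_lines.append(line)
    return "\n".join(out_lines) + "\n"
```

Typical use would be to read the converted star file into a string, call redirect_image_paths(text, "J1162/", "Particles/J1162/"), and write the result to a new file so the original stays untouched.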


Thank you, @CryoEM1 for your help !

Another question: how important is it to “dereference” a folder (an exported job from an extract particles job) before uploading a particle stack to EMPIAR?

Many thanks!
Jacopo

@CryoEM1 sorry, another question: let’s say that I do not change the .star file, so that the location of the mrc files remains as it is written in the .cs file. Then I start Relion from a different folder, but I import particles that have that cryoSPARC location. What will happen then? I guess the new Relion files will be written in the Relion folder, but Relion will read the particles from the cryoSPARC folder, correct? Thank you

I am not sure what you mean here - which files are you referring to that Relion is writing? Paths in a star file are relative to the Relion project directory. If you have a star file converted from cryoSPARC in the Relion directory, you will need to alter the paths in the file so they resolve, or simply symlink the cryoSPARC job directory into the Relion directory. There is one wrinkle if the extraction was performed in cryoSPARC: Relion expects all particle stacks to end in .mrcs, whereas cryoSPARC uses .mrc as the extension. You can just create renamed copies of the particle stacks to fix this (and edit the star file accordingly).
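A rough Python sketch of the extension fix, for illustration only. It uses symlinks rather than the renamed copies mentioned above (to avoid duplicating the stacks), and the directory names in the usage note are placeholders. Both helper names are hypothetical.

```python
import os
from pathlib import Path


def link_stacks_as_mrcs(cryosparc_dir, relion_dir):
    """Create <name>.mrcs symlinks in relion_dir for each <name>.mrc stack."""
    src = Path(cryosparc_dir).resolve()
    dst = Path(relion_dir)
    dst.mkdir(parents=True, exist_ok=True)
    created = []
    for stack in sorted(src.glob("*.mrc")):
        link = dst / (stack.stem + ".mrcs")
        # Skip links left over from a previous run instead of failing.
        if not (link.is_symlink() or link.exists()):
            os.symlink(stack, link)
        created.append(link.name)
    return created


def use_mrcs_extension(star_line):
    """Rewrite 'index@path.mrc' tokens on one star-file line to end in .mrcs."""
    tokens = star_line.split()
    fixed = [t + "s" if "@" in t and t.endswith(".mrc") else t for t in tokens]
    return " ".join(fixed)
```

For example, link_stacks_as_mrcs("/path/to/cryosparc/J1162/extract", "J1162/extract") run from the Relion project directory, followed by applying use_mrcs_extension to each data line of the star file. Only data lines should be passed to the second helper, since it collapses the whitespace between columns.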

Thank you, @olibclarke. I forgot about the story of the .mrcs files. This adds another layer of complication :-). Do you know of any particle stack that was uploaded to EMPIAR as a self-contained cryoSPARC export job? Do you know if this is a widespread practice? As a cryoSPARC user this makes a lot of sense to me, but I am not sure about the rest of the community.

Hey @marino-j,

“Dereferencing” a folder is used when copying or compressing it, so that symbolic links will be replaced with the actual files. If you’re uploading a job, it’s best to dereference the symbolic links so that the original files get uploaded instead of only the symbolic links.

Hi @stephan, thank you. In my case there are no symbolic links.
I have now exported an extract job that contains a “final” set of particles. I have saved the folder containing all the .mrc files of the particles and the corresponding .cs and .csg files.
What folder structure does one need to keep so that another user, say after downloading from EMPIAR, can import the particles directly through the .csg file? Sorry if I have missed this somewhere in the data management tutorial.

thank you for your help !

Hi @marino-j,

You can either package up an entire job or just the particles themselves (a single result group).
If you package up the job, it’ll allow users to have the exact job that you have completed in their cryoSPARC instance (by importing the job). For a refinement job, this will include the volume, mask and particles processed.
To do this, press the “Export” button on the job details panel. This will move the entire job’s contents into a single folder in the project (all thumbnails, symlinks to files, manifest files).

If you package up the particles only, downstream users will be able to use the “Import Result Group” job to import only the particles into their instance, which will come with CTF and alignments data.
To do this, press the “Export” button in the Outputs tab under the particles output group. This will put together a folder of exactly what one needs to re-import the particles into their instance.

At this point, you can upload a tar file of the job (passing the dereference option to the tar command, --dereference or -h in GNU tar), or upload the contents directly (you might have to use the cp command with -L to dereference the symbolic links first).
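As an illustration of what dereferencing does when packaging, here is a small sketch using Python's standard tarfile module instead of the tar command. The function name and the paths in the usage note are placeholders, not anything cryoSPARC provides.

```python
import os
import tarfile


def pack_dereferenced(folder, archive_path):
    """Write folder into a tar archive, storing link targets instead of links."""
    folder = os.fspath(folder)
    top_level = os.path.basename(os.path.normpath(folder))
    # dereference=True makes tarfile follow symlinks and archive the real files.
    with tarfile.open(archive_path, "w", dereference=True) as tar:
        tar.add(folder, arcname=top_level)
    return archive_path
```

Calling pack_dereferenced("exports/jobs/P1_J1074", "P1_J1074.tar") would produce an archive in which every symlinked particle stack or micrograph is stored as a regular file, so the archive stays usable after it leaves the original file system.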


@stephan Hi - now I understand the difference between the export in the job tab and the export in the output tab. The first creates a folder in export/groups, which can be used, for example, by another user on the same cluster to import that output, while the latter creates a self-contained folder in export/jobs. Is this correct?

So I exported from the particle output tab, and indeed there is now a folder with all the files. I will eventually use this folder for the upload to EMPIAR.

Many thanks !
Jacopo

@DanielAsarnow @olibclarke @CryoEM1
Hey guys, I was wondering whether you could point me in the right direction for this specific case: I have been asked to try to retrieve information from a flexible region by doing 3D classification without alignment in Relion. Since I obtained my volume in cryoSPARC, where would you start in Relion? Because my ab initio already shows that that part is flexible, would it make sense to start in Relion from the particle stack and attempt a new ab initio there? My guess is that the ab initio in Relion will be similar at best, but not better. Any input is greatly appreciated! Many thanks

Hi Jacopo,

If you have run refinement in cryosparc, why wouldn’t you start from the particle.cs file that is generated from that? I’m not sure why you’d run another ab initio if you already have good orientations from refinement in cryosparc - just take the output particle cs file and volume and you should be good to go

Cheers
Oli

I almost always take the 3D ab initio volume from cryoSPARC as a starting point in Relion. But I agree with Oli: if you have a good structure in cryoSPARC, then your outputs will contain angles which are good enough for Relion. I would perform a 3D refinement with local searches only in Relion (set both the initial angular sampling and the local-search sampling to 1.8 degrees) to refine Relion’s assignments, then run 3D classification without alignment. But you could certainly skip the 3D refinement and use the cryoSPARC angles exactly as priors. Try both :wink:


@DanielAsarnow I have installed this morning the latest version of pyem, to convert a .cs file that comes from an extract job, so that I can import the particle stack later in Relion.
The folder architecture from the exported cryosparc extract job looks like this:


J1162 contains the particle stack, while the other four folders contain micrographs.

The error is this one:
(pyem) marino_j@jm-t480:~$ pyem/csparc2star.py Desktop/files\ from\ cryosparc/P1_J1074_particles_exported.cs P1_J1074_particles_exported.star
/home/marino_j/pyem/pyem/metadata.py:334: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  df[model[k]] = pd.DataFrame(np.array(
Columns must be same length as key
Traceback (most recent call last):
  File "/home/marino_j/pyem/csparc2star.py", line 42, in main
    df = metadata.parse_cryosparc_2_cs(cs, passthroughs=args.input[1:], minphic=args.minphic,
  File "/home/marino_j/pyem/pyem/metadata.py", line 415, in parse_cryosparc_2_cs
    df = cryosparc_2_cs_model_parameters(cs, df, minphic=minphic)
  File "/home/marino_j/pyem/pyem/metadata.py", line 334, in cryosparc_2_cs_model_parameters
    df[model[k]] = pd.DataFrame(np.array(
  File "/home/marino_j/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/frame.py", line 3597, in __setitem__
    self._setitem_array(key, value)
  File "/home/marino_j/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/frame.py", line 3634, in _setitem_array
    check_key_length(self.columns, key, value)
  File "/home/marino_j/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/indexers.py", line 428, in check_key_length
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
Required fields could not be mapped. Are you using the right input file(s)?

Another user already reported this problem, and it was fixed by leaving out the passthrough option, which I am not using anyway. I also don’t understand which other file the column lengths are being compared against.
Other info: the .cs file is generated by cryosparc v 2.13.2, by using the export button in the output tab of the J1162 extracted job.
I have tried pyem with another .cs file I had, from a NU-refinement job, and it worked immediately. I have also re-exported that job, to generate a different .cs file, but got into the same problem.

If you have any suggestion, it’s highly appreciated !

many thanks !
Jacopo

@DanielAsarnow sorry to ask again. Is this a known problem with the latest version of pyem? I tried with another .cs file that comes from the same export job, P1_XXX_micrographs_exported.cs, and it worked. I can’t see why it does not work with the particles .cs file. Does it make sense to try a previous version? Thanks a lot!

@olibclarke sorry, oli. When you talk about the particle cs file, is it this one “particles.alignments.3D”

or this one “P1_J1074_particles_exported.cs” that I obtain after I export the entire job, and it’s located in the /exports folder inside the project folder ?

I’m asking because I can’t convert the particles_exported.cs file to a star file, while I can convert the .cs file that I download from the alignments 3D output (see picture above).
Many thanks for your kind help, I know you find these questions silly!

Hi Jacopo,

Not silly, it can be confusing! Have a look on the command line in the cryoSPARC job directory. You should see (in addition to maps and other things) particle .cs files for each iteration. The one corresponding to the last iteration of the refinement (as highlighted in this screenshot) is the one you want for this purpose. Hope that helps!

Cheers
Oli
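If you would rather pick the last-iteration file programmatically than by eye, something like the following sketch could work. It assumes the per-iteration files are named with a zero-padded iteration number directly before _particles.cs (for example cryosparc_P1_J1074_012_particles.cs); that naming pattern is an assumption, so check it against your own job directory before relying on it.

```python
import re
from pathlib import Path

# Assumed naming scheme: ..._<iteration>_particles.cs
ITERATION_PATTERN = re.compile(r"_(\d+)_particles\.cs$")


def latest_particles_cs(job_dir):
    """Return the particles .cs file with the highest iteration number, or None."""
    candidates = []
    for path in Path(job_dir).glob("*_particles.cs"):
        match = ITERATION_PATTERN.search(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Compare on the iteration number only.
    return max(candidates, key=lambda item: item[0])[1]
```

The returned path is the file you would then pass to csparc2star.py.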

Yes, so: download the .cs file from the output tab of a refinement job, run pyem, and import into Relion. That works. I still do not fully understand why .cs files ending in _exported.cs can’t be converted by pyem. At the moment I don’t need to solve this, it’s more of a curiosity. Thanks again! Jacopo
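I don't know the cause either, but one way to investigate is to compare which fields the two .cs files actually contain. The sketch below assumes .cs files are NumPy structured arrays saved in the .npy format and reads only the file header with the standard library, so it does not need the pyem environment. The function names are made up for illustration.

```python
import ast
import struct


def npy_field_names(path):
    """Return the field names declared in the .npy header of a .cs file."""
    with open(path, "rb") as handle:
        if handle.read(6) != b"\x93NUMPY":
            raise ValueError("not a .npy-format file")
        major, _minor = handle.read(2)
        # Format 1.x stores the header length in 2 bytes, later versions in 4.
        if major == 1:
            (length,) = struct.unpack("<H", handle.read(2))
        else:
            (length,) = struct.unpack("<I", handle.read(4))
        encoding = "utf-8" if major >= 3 else "latin1"
        header = ast.literal_eval(handle.read(length).decode(encoding))
    descr = header["descr"]
    # A structured array lists (name, dtype[, shape]) tuples; a plain one is a string.
    if isinstance(descr, str):
        return []
    return [field[0] for field in descr]


def compare_fields(path_a, path_b):
    """Return (fields only in path_a, fields only in path_b), each sorted."""
    names_a = set(npy_field_names(path_a))
    names_b = set(npy_field_names(path_b))
    return sorted(names_a - names_b), sorted(names_b - names_a)
```

Running compare_fields on the refinement .cs file that converts and on the _exported.cs file that fails would show which columns are missing or extra in the exported one. With NumPy available, np.load(path).dtype.names should give the same list of names.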