Csparc2star.py rlnMicrographName error

Hi,

we have an issue with csparc2star.py. We extracted particles in Relion, then imported them into Cryosparc, where we ran 2D classification. We would like to convert the selected particles back into a .star file, but we get an error (copied below).
We have the latest pyem installed.

csparc2star.py particles_selected.cs csparc.star
Defocus values not found
Traceback (most recent call last):
  File "/media/Data/software/pyem/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnMicrographName'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/Data/software/pyem/pyem/csparc2star.py", line 105, in <module>
    sys.exit(main(parser.parse_args()))
  File "/media/Data/software/pyem/pyem/csparc2star.py", line 79, in main
    star.write_star(args.output, df, resort_records=True)
  File "/media/Data/software/pyem/pyem/pyem/star.py", line 288, in write_star
    df = sort_records(df, inplace=True)
  File "/media/Data/software/pyem/pyem/pyem/star.py", line 420, in sort_records
    df = natsort_values(df, Relion.MICROGRAPH_NAME, inplace=True)
  File "/media/Data/software/pyem/pyem/pyem/util/util.py", line 132, in natsort_values
    idx = np.array(natsort.index_natsorted(df[col]))
  File "/media/Data/software/pyem/lib/python3.7/site-packages/pandas/core/frame.py", line 2995, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/media/Data/software/pyem/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2899, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnMicrographName'

Any ideas what could be wrong?
Thank you!

Seems like the same error as here.

The 2D classification files must drop the micrograph names. You should be able to add a .cs file from an earlier step, say your extraction job, that definitely has the names. I’ll double-check later whether it’s a missed field on my end.

Also, there was a bugfix last night so the error won’t occur, however the output .star file will be missing the micrograph names. Depending on what you needed to do, this might be OK.
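
To illustrate why the traceback ends in a KeyError, here is a minimal sketch (not pyem's actual code) that uses plain Python dictionaries in place of the pandas DataFrame. The function name sort_records_if_possible and the sample records are hypothetical. It sorts by rlnMicrographName in natural order only when every record carries that field, and otherwise leaves the order alone, which is roughly the behavior described for the bugfix.

```python
import re

MICROGRAPH_NAME = "rlnMicrographName"  # RELION label seen in the traceback


def natural_key(text):
    """Split text into digit and non-digit runs so 'mic10' sorts after 'mic2'."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", text)]


def sort_records_if_possible(records, field=MICROGRAPH_NAME):
    """Return records sorted by `field`, or unchanged if any record lacks it."""
    if not all(field in rec for rec in records):
        # Sorting on a missing column is what raised the KeyError.
        return list(records)
    return sorted(records, key=lambda rec: natural_key(rec[field]))


# Hypothetical sample data for illustration only.
with_names = [{MICROGRAPH_NAME: "mic10.mrc"}, {MICROGRAPH_NAME: "mic2.mrc"}]
without_names = [{"rlnDefocusU": 12000.0}, {"rlnDefocusU": 9000.0}]

sorted_names = [r[MICROGRAPH_NAME] for r in sort_records_if_possible(with_names)]
unchanged = sort_records_if_possible(without_names)
```

With the guard in place the conversion can finish, but the records still have no micrograph names, so anything downstream that needs them will have to get them from another file.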

Hi Daniel,

Yes, it looks like the same error. Sorry I didn’t notice it, and thanks for sharing.

But I still cannot resolve the error in my case.
As you suggested in the other topic, I used the flag --loglevel debug, the error is copied below.

  1. You suggested using a .cs file from an earlier step. In our case, we did particle extraction in Relion, so I used the imported_particles.cs file. I get the same error as with particles_selected.cs after 2D classification, so I assume the micrograph names are already missing in the first .cs file?

csparc2star.py imported_particles.cs csparc.star --loglevel debug
Detected CryoSPARC 2+ .cs file
Reading primary file
Classification parameters not found
Directly copied fields: rlnVoltage, rlnAmplitudeContrast, ucsfImageIndex, rlnSphericalAberration, rlnPhaseShift, ucsfImagePath, ucsfUid, rlnDetectorPixelSize, rlnDefocusAngle, rlnDefocusU, rlnDefocusV, rlnMagnification
Converting DEFOCUSANGLE from degrees to radians
Converting PHASESHIFT from degrees to radians
Traceback (most recent call last):
  File "/media/Data/software/pyem/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnMicrographName'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/Data/software/pyem/pyem/csparc2star.py", line 105, in <module>
    sys.exit(main(parser.parse_args()))
  File "/media/Data/software/pyem/pyem/csparc2star.py", line 79, in main
    star.write_star(args.output, df, resort_records=True)
  File "/media/Data/software/pyem/pyem/pyem/star.py", line 288, in write_star
    df = sort_records(df, inplace=True)
  File "/media/Data/software/pyem/pyem/pyem/star.py", line 420, in sort_records
    df = natsort_values(df, Relion.MICROGRAPH_NAME, inplace=True)
  File "/media/Data/software/pyem/pyem/pyem/util/util.py", line 132, in natsort_values
    idx = np.array(natsort.index_natsorted(df[col]))
  File "/media/Data/software/pyem/lib/python3.7/site-packages/pandas/core/frame.py", line 2995, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/media/Data/software/pyem/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2899, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnMicrographName'

  2. I also tried the command with the --copy-micrograph-coordinates flag together with the imported particles.star file, and this is the error in that case:

csparc2star.py --copy-micrograph-coordinates particles.star particles_selected.cs csparc.star --loglevel debug
Detected CryoSPARC 2+ .cs file
Reading primary file
Classification parameters not found
Directly copied fields: rlnDetectorPixelSize, ucsfImagePath, ucsfImageIndex, ucsfUid, rlnMagnification
Defocus values not found
Coordinates merge key: None
Traceback (most recent call last):
  File "/media/Data/software/pyem/pyem/csparc2star.py", line 105, in <module>
    sys.exit(main(parser.parse_args()))
  File "/media/Data/software/pyem/pyem/csparc2star.py", line 68, in main
    df = star.smart_merge(df, coord_star, fields=fields, key=key)
  File "/media/Data/software/pyem/pyem/pyem/star.py", line 101, in smart_merge
    s2 = s2.set_index(key, drop=False)
  File "/media/Data/software/pyem/lib/python3.7/site-packages/pandas/core/frame.py", line 4411, in set_index
    raise KeyError("None of {} are in the columns".format(missing))
KeyError: 'None of [None] are in the columns'

Thanks for the help!

This thread has a more searchable name, so I’m posting the answer here.

TL;DR
This issue, originating from imported particles using symbolic links in the job directory, has been resolved. The command:
csparc2star.py imported_particles.cs imported_particles.star --copy-micrograph-coordinates particles.star
will now behave as expected.

Full explanation
When you import particles into cryoSPARC, the stacks are symbolically linked in the import job's imported directory. Therefore, when converting these particles back to .star, the rlnImageName field points to those links, and not to the original paths in the imported .star file. Merging of the original and converted files is based on matching rlnImageName, so no rows match, which leads to the error. (Note also that a copy of the imported file, named particles.star, is kept in the import job; this can come in handy.)
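
The path mismatch can be reproduced with a small self-contained sketch. The directory names below are hypothetical and only imitate the layout described, and the sketch assumes the link keeps the same file name as its target. It creates a stack file, links it from a second directory, and compares the two paths as strings, as files on disk, and by basename.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical locations standing in for a Relion extraction job
    # and a cryoSPARC import job.
    original_dir = os.path.join(tmp, "Extract", "job001")
    import_dir = os.path.join(tmp, "J1", "imported")
    os.makedirs(original_dir)
    os.makedirs(import_dir)

    original = os.path.join(original_dir, "stack.mrcs")
    open(original, "wb").close()          # empty placeholder stack
    link = os.path.join(import_dir, "stack.mrcs")
    os.symlink(original, link)            # what the import step does, in spirit

    paths_equal = (link == original)                  # string comparison
    same_file = os.path.samefile(link, original)      # follows the symlink
    basenames_equal = (os.path.basename(link) == os.path.basename(original))
```

The two strings differ even though they refer to the same file, so a merge keyed on the full path finds nothing in common, while the basenames still agree.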

The solution is to match based on the basename of the stacks instead of the entire path. I actually implemented this a long time ago, but only via star.py --copy-micrograph-coordinates and only using the --augment argument to add additional fields to the parsed .star file. I took this conservative approach in case people had multiple stacks with the same names but in different directories. I’m switching this in favor of a more useful default for most (non-insane) people. Someone in that situation will now have to add --noaugment. This can additionally affect advanced merging using star.py --merge-key <a,b,c> --merge-fields <x,y,z> --merge-star <ptcls.star> type commands, but I assume folks using this feature can figure it out, if they really need full path merging at all times.
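
As a rough illustration of basename matching (a sketch on plain dictionaries, not the pyem implementation), the following copies fields from the original records into the converted ones using the particle index plus the stack's basename as the key. The helper names image_key and copy_fields_by_basename and the sample records are hypothetical. It also refuses to merge when two different stacks share a basename, which is the situation that motivated the earlier conservative default.

```python
import os

IMAGE_NAME = "rlnImageName"


def image_key(image_name):
    """Turn '000001@/some/dir/stack.mrcs' into ('000001', 'stack.mrcs')."""
    index, _, path = image_name.rpartition("@")
    return index, os.path.basename(path)


def copy_fields_by_basename(converted, original, fields):
    """Copy `fields` from original records into converted ones with the same key."""
    lookup = {}
    for rec in original:
        key = image_key(rec[IMAGE_NAME])
        if key in lookup:
            # Same index and basename seen twice: different directories collide.
            raise ValueError("stacks share a basename; match on full paths instead")
        lookup[key] = rec
    merged = []
    for rec in converted:
        source = lookup.get(image_key(rec[IMAGE_NAME]))
        if source is not None:  # inner join: drop particles with no match
            merged.append({**rec, **{f: source[f] for f in fields}})
    return merged


# Hypothetical records: the directories differ, the basename does not.
original = [
    {IMAGE_NAME: "000001@Extract/job001/stack.mrcs",
     "rlnMicrographName": "mic1.mrc", "rlnCoordinateX": 10.0, "rlnCoordinateY": 20.0},
    {IMAGE_NAME: "000002@Extract/job001/stack.mrcs",
     "rlnMicrographName": "mic1.mrc", "rlnCoordinateX": 30.0, "rlnCoordinateY": 40.0},
]
converted = [{IMAGE_NAME: "000002@J1/imported/stack.mrcs", "rlnClassNumber": 3}]

merged = copy_fields_by_basename(
    converted, original, ["rlnMicrographName", "rlnCoordinateX", "rlnCoordinateY"])
```

The converted record keeps its own rlnImageName and class assignment and gains the micrograph name and coordinates of the matching original particle.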


Hi Daniel,
Thank you for addressing this issue. I am new to this and I am still having a problem. I am able to convert the passthrough file to .star using csparc2star.py, but I’m not able to get the --copy-micrograph-coordinates command to work. Please let me know if you have any suggestions for this.
Best,
Amanda

(base) [drennan@ad.wisc.edu@BIOCWK-01037L pyem]$ ./csparc2star.py …/cryosparc191122/P5/J202/passthrough_particles_selected.cs …/cryosparc191122/P5/J202/passthrough_particles_selected.star --copy-micrograph-coordinates tilt0c.star
Traceback (most recent call last):
  File "./csparc2star.py", line 105, in <module>
    sys.exit(main(parser.parse_args()))
  File "./csparc2star.py", line 61, in main
    glob(args.copy_micrograph_coordinates)), join="inner")
  File "/home/drennan@ad.wisc.edu/miniconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 255, in concat
    sort=sort,
  File "/home/drennan@ad.wisc.edu/miniconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

Did your actual command also have paths starting with three dots? I don’t think that’s a legal path in bash - maybe that is the problem. Also, there’s no problem with just converting a passthrough file, but you probably want to use the job’s particles .cs file and the passthrough file together:

csparc2star.py particles.cs passthrough_particles.cs output.star --copy-micrograph-coordinates original.star

Thanks for getting back to me-
I did not have 3 dots in my path, only 2… I’m not sure why it copied that way. I put the files together as you suggested but I’m still getting the same error… I will let you know if I figure this out.

@acd I get this error if the --copy-micrograph-coordinates file doesn’t exist - can you double check the path to your tilt0c.star from where you are running the command? If the path is right, can you also double check that file is formatted normally and has particles in it?
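
The traceback shows the --copy-micrograph-coordinates value being passed through glob before concatenation, so a path that matches nothing yields an empty list and pandas reports "No objects to concatenate". A quick way to check a path before running the conversion is sketched below. The helper name expand_star_paths is hypothetical and is not part of pyem.

```python
import glob
import os
import tempfile


def expand_star_paths(pattern):
    """Expand a path or wildcard and fail early if nothing matches."""
    matches = sorted(glob.glob(pattern))
    if not matches:
        # Relative paths are resolved from the current working directory.
        raise FileNotFoundError(
            f"no file matches {pattern!r} from {os.getcwd()!r}")
    return matches


with tempfile.TemporaryDirectory() as tmp:
    # An existing file is found...
    star_path = os.path.join(tmp, "tilt0c.star")
    with open(star_path, "w") as handle:
        handle.write("data_\n")
    found = expand_star_paths(star_path)

    # ...while a wrong path gives a readable error instead of a pandas one.
    try:
        expand_star_paths(os.path.join(tmp, "missing.star"))
        missing_error = None
    except FileNotFoundError as err:
        missing_error = str(err)
```

Printing the working directory in the message helps with relative paths, since the file has to be reachable from wherever the command is run.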

Hi Daniel,

I am getting more or less the same error when I try to convert the .cs file to a .star file after 2D classification in cryoSPARC.

The idea is to generate an initial 3D model in Relion, but I keep getting this error:

./csparc2star.py cryosparc_P5_J15_020_particles.cs J15_020_particles.star --copy-micrograph-coordinates particles.star --loglevel debug
Detected CryoSPARC 2+ .cs file
Reading primary file
Assigning parameters 2D classes or single 3D class
Assigning skew angle from 2D classification
Directly copied fields: rlnDefocusU, rlnCtfBfactor, rlnDefocusAngle, ucsfImageIndex, rlnVoltage, rlnAmplitudeContrast, ucsfImagePath, rlnDefocusV, rlnSphericalAberration, rlnOpticsGroup, rlnDetectorPixelSize, rlnPhaseShift, ucsfUid, rlnRandomSubset, rlnOriginX, rlnOriginY, rlnAnglePsi, rlnClassNumber, rlnMagnification
Converting DEFOCUSANGLE from degrees to radians
Converting PHASESHIFT from degrees to radians
Changing RANDOMSUBSET to 1-based index
Changing CLASS to 1-based index
Converting ANGLEPSI from degrees to radians
Traceback (most recent call last):
  File "/home/takagilab/pyem/./csparc2star.py", line 110, in <module>
    sys.exit(main(parser.parse_args()))
  File "/home/takagilab/pyem/./csparc2star.py", line 62, in main
    glob(args.copy_micrograph_coordinates)), join="inner")
  File "/home/takagilab/miniconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 255, in concat
    sort=sort,
  File "/home/takagilab/miniconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

Thank you,
E.R

@eswar909 Can you double check that the particles.star is reachable by that path from your working directory? Thanks.

Hi Daniel,
Thanks for your reply,
Yeah, it was a path issue; I sorted it out and it is working now. I have another question. I converted the file to a .star file with the following command:
--copy-micrograph-coordinates par4relion.star particles_selected.cs passthrough_particles_selected.cs J16Par4.star --loglevel debug

(I am trying to convert the .cs file to a .star file using the output of the selected 2D class job.)

I used the output to try to generate a 3D model in Relion, but it gives me an error:

WARNING: input particles STAR file does not have a column for image dimensionality, assuming 2D images …
in: /home/software/cryoem/relion/src/jaz/obs_model.cpp, line 688
ERROR:
ObservationModel::getBoxSize: box sizes not available. Make sure particle images are available before converting/importing STAR files from earlier versions of RELION.

What might have gone wrong?

Best,
E.R

This may not matter here, but the positional arguments (the ones without a --, i.e. the input and output files) should be either at the end or the beginning of the argument list. The argument parser can get confused if there are --whatever arguments on either side.
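
The recommended ordering can be tried out with a small argparse sketch. The parser below is hypothetical: it only imitates the shape of the command (several inputs, one output, two options) and is not the real csparc2star.py parser. All options come first and the positional arguments are grouped together at the end.

```python
import argparse


def build_parser():
    """Hypothetical parser: one or more inputs, one output, two options."""
    parser = argparse.ArgumentParser(prog="convert_sketch")
    parser.add_argument("input", nargs="+", help="one or more .cs files")
    parser.add_argument("output", help="output .star file")
    parser.add_argument("--copy-micrograph-coordinates", metavar="STAR")
    parser.add_argument("--loglevel", default="warning")
    return parser


# Options first, then the input and output files as one contiguous group.
argv = ["--copy-micrograph-coordinates", "par4relion.star",
        "--loglevel", "debug",
        "particles_selected.cs", "passthrough_particles_selected.cs",
        "J16Par4.star"]
args = build_parser().parse_args(argv)
```

Keeping the files in one contiguous group lets the parser assign the last one to the output and the rest to the inputs without ambiguity.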

It sounds like you are using Relion 3.1, in which case you might need to import the .star file first. I haven’t yet added all of the new fields such as the image size, to make a natively compatible Relion 3.1 file.
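
One way to tell which flavor of .star file you have is to look for the optics table. The sketch below assumes that a Relion 3.1 style file carries a data_optics block with an _rlnImageSize column, while older files have a single unnamed data block. The function names and the two sample texts are hypothetical, and the check is a heuristic, not a validator.

```python
def star_summary(text):
    """Collect data block names and column labels from STAR-formatted text."""
    blocks, labels = [], set()
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("data_"):
            blocks.append(tokens[0])
        elif tokens[0].startswith("_rln"):
            labels.add(tokens[0])
    return blocks, labels


def looks_like_relion31(text):
    """Heuristic: an optics block that declares the image (box) size."""
    blocks, labels = star_summary(text)
    return "data_optics" in blocks and "_rlnImageSize" in labels


# Hypothetical minimal files for illustration only.
old_style = """data_

loop_
_rlnImageName #1
_rlnDefocusU #2
000001@stack.mrcs 12000.0
"""

new_style = """data_optics

loop_
_rlnOpticsGroup #1
_rlnImageSize #2
_rlnImageDimensionality #3
1 256 2

data_particles

loop_
_rlnImageName #1
_rlnOpticsGroup #2
000001@stack.mrcs 1
"""

old_is_31 = looks_like_relion31(old_style)
new_is_31 = looks_like_relion31(new_style)
```

A file that fails this check is a candidate for going through Relion's import step first, as suggested above.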