Converting to star file

I am very sorry to bring this up again. Has anything changed in the format of cs files in the last 3-4 weeks? I can no longer convert them back to RELION.

This used to work just fine:
csparc2star.py --copy-micrograph-coordinates particles.star cryosparc_P116_J670_001_particles.cs cryosparc_P116_J670_001_particles.star

But now it prints out a bunch of errors:

/Software/linux/64bit/pyem/pyem-20210614/pyem/pyem/star.py:536: FutureWarning: Columnar iteration over characters will be deprecated in future releases.
df[UCSF.IMAGE_INDEX], df[UCSF.IMAGE_PATH] =
/Software/linux/64bit/pyem/pyem-20210614/pyem/pyem/star.py:544: FutureWarning: Columnar iteration over characters will be deprecated in future releases.
df[UCSF.IMAGE_ORIGINAL_INDEX], df[UCSF.IMAGE_ORIGINAL_PATH] =
Traceback (most recent call last):
  File "/Software/linux/64bit/pyem/pyem-20210614/pyem/csparc2star.py", line 120, in <module>
    sys.exit(main(parser.parse_args()))
  File "/Software/linux/64bit/pyem/pyem-20210614/pyem/csparc2star.py", line 71, in main
    df = star.smart_merge(df, coord_star, fields=fields, key=key)
  File "/Software/linux/64bit/pyem/pyem-20210614/pyem/pyem/star.py", line 146, in smart_merge
    s2 = s2.set_index(key, drop=False)
  File "/Software/linux/64bit/pyem/pyem-20210614/Miniconda/envs/pyem-20210614/lib/python3.9/site-packages/pandas/core/frame.py", line 4727, in set_index
    raise KeyError(f"None of {missing} are in the columns")
KeyError: 'None of [None] are in the columns'

Omitting the --copy-micrograph-coordinates switch produces a star file, but the particle mrcs file names are modified with prefixes. I will try to remove them with a script, but there must be a better way…

Peter

Hi @peter.cherepanov,

What version of cryoSPARC (and patch) did this .cs file come from? Also, what job type did this .cs file come from?

@peter.cherepanov you probably need to include the passthrough file.

I tried several ways, with and without the passthrough file, using cs files from Refinement and Select 2D jobs. The only method that worked was to convert cs to star without any options:
csparc2star.py cryosparc_P116_J670_001_particles.cs cryosparc_P116_J670_001_particles.star

then fixing the mrcs file names in the resulting star file with sed:
sed 's/J664\/imported\/[0-9]*_FoilHole/Extract\/job025\/Micrographs\/FoilHole/g'

then extracting the particle lines with grep and copying the header from the original RELION extract star file.

I don’t know what changed. I think we are using the latest version of cryoSPARC. The dashboard says “Current version: v3.2.0+210713”. Pyem is fairly recent (updated on 20210614).

Peter

Since v3.2.0+210629, particle .mrcs that are imported into cryoSPARC are prepended with the UID of their dataset file in order to keep them unique. I’d be happy to help get the --copy-micrograph-coordinates option working in pyem again.

@stephan I think we just need a function that checks for and removes the leading UID from the base name. If the merge_key() call at csparc2star.py:65 returns None, then it can try to remove the UIDs and run merge_key() again. The UID function can be added to pyem.metadata.
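A minimal sketch of that idea, not the actual pyem code: it assumes the UID is a run of digits followed by an underscore at the start of the file name (the same pattern the sed workaround above matches). The names `strip_uid` and `match_basenames` are hypothetical and only illustrate the "try as-is, then strip the UID and try again" logic.

```python
import posixpath
import re

# Assumption: the prepended UID is '<digits>_' at the start of the base name.
UID_PREFIX = re.compile(r"^\d+_")


def strip_uid(path):
    """Return path with a leading '<digits>_' removed from its base name."""
    head, tail = posixpath.split(path)
    return posixpath.join(head, UID_PREFIX.sub("", tail, count=1))


def match_basenames(cs_paths, star_paths):
    """Map each cryoSPARC path to the STAR path sharing its base name.

    The names are compared as-is first; only if nothing matches are the
    cryoSPARC names compared again with the UID prefix stripped.
    """
    star_by_name = {posixpath.basename(p): p for p in star_paths}
    direct = {
        p: star_by_name[posixpath.basename(p)]
        for p in cs_paths
        if posixpath.basename(p) in star_by_name
    }
    if direct:
        return direct
    stripped = {}
    for p in cs_paths:
        name = posixpath.basename(strip_uid(p))
        if name in star_by_name:
            stripped[p] = star_by_name[name]
    return stripped
```

Note that a file name that legitimately starts with digits and an underscore (a date stamp, for example) would also be stripped by this pattern, which is why the fallback only runs when the direct comparison finds nothing.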

If you can write a PR for some or all of that, it would be very helpful. I’m tied up with my dissertation at the moment.

Is there an easy way to downgrade the patch version until there is a fix?

@donaldb, at the moment there isn’t a way to roll back a patch except for cryosparcm update --override. This will take you back to v3.2.0.

@spunjani I thought that was probably the case. I will downgrade as a fix for now.

In the future, could you please think carefully before changing the format of your files without warning? It makes jumping in and out of cryoSPARC that little bit more painful. The conversion back to RELION is very heavily used in most image processing pipelines, to make optimal use of the best features of both RELION and cryoSPARC.

The ideal would be if you could make a homegrown .star export job which you can tweak whenever necessary.

Hi @donaldb,

Thanks for the feedback - noted!

@apunjani @DanielAsarnow,

Will there be a quick fix for this problem? I got the same error:

Traceback (most recent call last):
  File "/data/donghua/pyem/csparc2star.py", line 120, in <module>
    sys.exit(main(parser.parse_args()))
  File "/data/donghua/pyem/csparc2star.py", line 71, in main
    df = star.smart_merge(df, coord_star, fields=fields, key=key)
  File "/data/donghua/pyem/pyem/star.py", line 148, in smart_merge
    s2 = s2.set_index(key, drop=False)
  File "/data/donghua/miniconda3/envs/pyem/lib/python3.7/site-packages/pandas/core/frame.py", line 4411, in set_index
    raise KeyError("None of {} are in the columns".format(missing))
KeyError: 'None of [None] are in the columns'

Hi @donghuachen, you will need to be more specific, otherwise we can’t tell what the problem is.

Hi @DanielAsarnow, I am converting from cs (cryoSPARC v3.2.0+210831) to star using your most recent pyem (csparc2star.py). This usually works. As other users discussed above, the cryoSPARC team made some changes to their software (changed the micrograph ID, etc.), so the following old command no longer works.

~/miniconda3/envs/pyem/bin/python ~/pyem/csparc2star.py cryosparc_P3_J56_class_02_00022_particles.cs P3_J56_passthrough_particles_class_2.cs P3_J56_class_02_00022_particles.star --copy-micrograph-coordinates …/J53/particles.star

Traceback (most recent call last):
  File "/data/donghua/pyem/csparc2star.py", line 120, in <module>
    sys.exit(main(parser.parse_args()))
  File "/data/donghua/pyem/csparc2star.py", line 71, in main
    df = star.smart_merge(df, coord_star, fields=fields, key=key)
  File "/data/donghua/pyem/pyem/star.py", line 148, in smart_merge
    s2 = s2.set_index(key, drop=False)
  File "/data/donghua/miniconda3/envs/pyem/lib/python3.7/site-packages/pandas/core/frame.py", line 4411, in set_index
    raise KeyError("None of {} are in the columns".format(missing))
KeyError: 'None of [None] are in the columns'

@DanielAsarnow,
It seemed to work in several steps when I used the approach you pointed out in another topic (copied below). Thanks!


You would need to convert the cryoSPARC CSV to a .star file first, and then use star.py --copy-micrograph-coordinates original.star fromcsparc.star withcoords.star (with correct file names of course).

@donghuachen I tried the following steps, but they failed. Could you please help me check them? Thank you!

~/pyem/csparc2star.py cryosparc_P74_J165_004_particles.cs P74_J165_passthrough_particles.cs cryosparc_P74_J165_004_particles.star --swapxy

~/pyem/star.py --copy-micrograph-coordinates particles.star cryosparc_P74_J165_004_particles.star particles.star cryosparc_P74_J165_004_particles2.star

@nyhui,
I don’t think the --swapxy option is needed for the first step.
As discussed previously, you need to remove the UIDs from the first star file you get before you use star.py. You also need to find the corresponding names (J1, SampleName, Extract/job025) in the first star file and in particles.star for step 2 below.

  1. ~/pyem/csparc2star.py cryosparc_P74_J165_004_particles.cs P74_J165_passthrough_particles.cs cryosparc_P74_J165_004_particles.star
  2. cat cryosparc_P74_J165_004_particles.star | sed 's/J1\/imported\/[0-9]*_SampleName/Extract\/job025\/Micrographs\/SampleName/g' > cryosparc_P74_J165_004_particles2.star
  3. ~/pyem/star.py --copy-micrograph-coordinates particles.star cryosparc_P74_J165_004_particles2.star cryosparc_P74_J165_004_particles3.star
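For anyone who prefers to avoid escaping slashes in sed, step 2 above can be sketched in Python. This is only an illustration of the same substitution: the function name `rewrite_image_paths` is made up, and the default values (J1, SampleName, Extract/job025/Micrographs) are the placeholders from the steps above, so replace them with the names from your own files.

```python
import re


def rewrite_image_paths(star_text, job="J1", sample="SampleName",
                        target="Extract/job025/Micrographs"):
    """Replace '<job>/imported/<digits>_<sample>' with '<target>/<sample>'.

    Mirrors the sed expression in step 2; every other part of the text,
    including the STAR header, is returned unchanged.
    """
    pattern = re.compile(
        re.escape(job) + r"/imported/[0-9]*_" + re.escape(sample)
    )
    replacement = target + "/" + sample
    # A function replacement avoids any backslash handling in the new path.
    return pattern.sub(lambda match: replacement, star_text)
```

To apply it, read the star file from step 1 as text, pass it through the function, and write the result to a new file that is then used as the input for step 3.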

Hope this helps.

@donghuachen Thank you for your help! What does UID mean? How can I remove them? Should I remove them after the 2nd step?

@nyhui,
See the above: “Since v3.2.0+210629, particle .mrcs that are imported into cryoSPARC are prepended with the UID of their dataset file in order to keep them unique.”

The second step is the one that removes the UIDs. You may have different names for “J1, SampleName, Extract/job025”. You should be able to find them in the first star file and in particles.star.

If you are using an older version of cryoSPARC and there is NO UID in your first star file, the following command (one step) should work:
~/pyem/csparc2star.py cryosparc_P74_J165_004_particles.cs P74_J165_passthrough_particles.cs cryosparc_P74_J165_004_particles.star --copy-micrograph-coordinates particles.star
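A quick way to tell which case applies is to look for the UID pattern in the star file from the first conversion. The check below is a rough heuristic, not part of pyem: it assumes imported stacks appear as `<job>/imported/<digits>_<name>.mrcs`, as in the examples in this thread, and `has_uid_prefix` is a hypothetical name.

```python
import re

# Assumption: imported particle stacks look like
# '<job>/imported/<digits>_<name>.mrcs' in the converted star file.
UID_PATH = re.compile(r"/imported/[0-9]+_[^/\s]*\.mrcs")


def has_uid_prefix(star_text):
    """Return True if any imported .mrcs path carries a '<digits>_' prefix."""
    return UID_PATH.search(star_text) is not None
```

If your original file names already start with digits and an underscore, this will report a UID even when none was added, so compare a few names against particles.star by eye before relying on it.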

Hey @DanielAsarnow,

Thanks for the tips. It’s ready for review: https://github.com/asarnow/pyem/pull/74
