I am very sorry to bring this up again. Has anything changed in the format of .cs files in the last 3-4 weeks? I cannot convert them back to RELION anymore.
This used to work just fine:
csparc2star.py --copy-micrograph-coordinates particles.star cryosparc_P116_J670_001_particles.cs cryosparc_P116_J670_001_particles.star
But now it prints out a bunch of errors:
/Software/linux/64bit/pyem/pyem-20210614/pyem/pyem/star.py:536: FutureWarning: Columnar iteration over characters will be deprecated in future releases.
df[UCSF.IMAGE_INDEX], df[UCSF.IMAGE_PATH] =
/Software/linux/64bit/pyem/pyem-20210614/pyem/pyem/star.py:544: FutureWarning: Columnar iteration over characters will be deprecated in future releases.
df[UCSF.IMAGE_ORIGINAL_INDEX], df[UCSF.IMAGE_ORIGINAL_PATH] =
Traceback (most recent call last):
  File "/Software/linux/64bit/pyem/pyem-20210614/pyem/csparc2star.py", line 120, in <module>
    sys.exit(main(parser.parse_args()))
  File "/Software/linux/64bit/pyem/pyem-20210614/pyem/csparc2star.py", line 71, in main
    df = star.smart_merge(df, coord_star, fields=fields, key=key)
  File "/Software/linux/64bit/pyem/pyem-20210614/pyem/pyem/star.py", line 146, in smart_merge
    s2 = s2.set_index(key, drop=False)
  File "/Software/linux/64bit/pyem/pyem-20210614/Miniconda/envs/pyem-20210614/lib/python3.9/site-packages/pandas/core/frame.py", line 4727, in set_index
    raise KeyError(f"None of {missing} are in the columns")
KeyError: 'None of [None] are in the columns'
Omitting the --copy-micrograph-coordinates switch produces a star file, but the particle .mrcs file names are modified with prefixes. I will try to remove them with a script, but there must be a better way…
I tried several ways, with and without the passthrough file, using .cs files from Refinement or Select 2D jobs. The only method that worked was to convert .cs to star without any options:
csparc2star.py cryosparc_P116_J670_001_particles.cs cryosparc_P116_J670_001_particles.star
then fixing the .mrcs file names in the resulting star file with sed: sed 's/J664\/imported\/[0-9]*_FoilHole/Extract\/job025\/Micrographs\/FoilHole/g'
then re-calling the lines using grep and copying the header from the original relion extract star file:
I don’t know what changed. I think we are using the latest version of cryoSPARC. The dashboard says “Current version: v3.2.0+210713”. Pyem is fairly recent (updated on 20210614).
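For anyone who prefers Python to sed, the path fix described above can be sketched as follows. This is only an illustration: the pattern and replacement are copied from the sed command in this post and are specific to that project (J664, Extract/job025), and the function names fix_line and fix_star are made up for this example.

```python
import re

# Pattern and replacement copied from the sed command above.
# Both are project-specific and must be adapted to your own job folders.
UID_PREFIX = re.compile(r"J664/imported/[0-9]*_FoilHole")
REPLACEMENT = "Extract/job025/Micrographs/FoilHole"


def fix_line(line):
    """Replace every cryoSPARC import prefix found on one line of the star file."""
    return UID_PREFIX.sub(REPLACEMENT, line)


def fix_star(src, dst):
    """Copy src to dst, rewriting the particle paths line by line (hypothetical helper)."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(fix_line(line))
```

Calling fix_star("cryosparc_P116_J670_001_particles.star", "fixed.star") would write a copy with the original RELION paths restored, assuming the file names follow the same pattern as in this post.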
Since v3.2.0+210629, particle .mrcs that are imported into cryoSPARC are prepended with the UID of their dataset file in order to keep them unique. I’d be happy to help get the --copy-micrograph-coordinates option working in pyem again.
@stephan I think we just need a function that checks for and removes the leading UID from the base name. If the merge_key() call at csparc2star.py:65 returns None, then it can try to remove the UIDs and run merge_key() again. The UID function can be added to pyem.metadata.
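A rough sketch of that idea, not actual pyem code: it assumes the UID is a run of digits followed by an underscore at the start of the base name (which is what the sed pattern earlier in this thread matches), and the names strip_uid and match_basenames are hypothetical. Matching is tried on the unmodified names first, and the UIDs are stripped only if nothing matches.

```python
import os
import re

# Assumption: the UID is a run of digits plus "_" at the start of the base name.
LEADING_UID = re.compile(r"^[0-9]+_")


def strip_uid(path):
    """Remove a leading numeric UID from the base name of path, if present."""
    head, base = os.path.split(path)
    return os.path.join(head, LEADING_UID.sub("", base, count=1))


def match_basenames(cs_paths, star_paths):
    """Pair each cryoSPARC path with the star path that has the same base name.

    The names are compared as-is first; UIDs are stripped only when that
    first attempt finds no match at all.
    """
    star_by_base = {os.path.basename(p): p for p in star_paths}
    pairs = {p: star_by_base[os.path.basename(p)]
             for p in cs_paths if os.path.basename(p) in star_by_base}
    if not pairs:
        # Fallback: retry with the leading UID removed from each cryoSPARC name.
        pairs = {p: star_by_base[os.path.basename(strip_uid(p))]
                 for p in cs_paths
                 if os.path.basename(strip_uid(p)) in star_by_base}
    return pairs
```

Stripping only as a fallback avoids damaging file names that legitimately begin with digits and an underscore, such as a date prefix.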
If you can write a PR for some or all of that, it would be very helpful, I’m tied up with my dissertation at the moment.
@spunjani I thought that was probably the case. I will downgrade to fix for now.
In the future, could you please consider carefully before changing the format of your files without warning? It makes moving in and out of cryoSPARC that little bit more painful. The conversion back to RELION is used very heavily in most image processing pipelines, to make optimal use of the best features of both RELION and cryoSPARC.
The ideal would be if you could make a homegrown .star export job which you can tweak whenever necessary.
Will there be a quick fix for this problem? I got the same error:
Traceback (most recent call last):
  File "/data/donghua/pyem/csparc2star.py", line 120, in <module>
    sys.exit(main(parser.parse_args()))
  File "/data/donghua/pyem/csparc2star.py", line 71, in main
    df = star.smart_merge(df, coord_star, fields=fields, key=key)
  File "/data/donghua/pyem/pyem/star.py", line 148, in smart_merge
    s2 = s2.set_index(key, drop=False)
  File "/data/donghua/miniconda3/envs/pyem/lib/python3.7/site-packages/pandas/core/frame.py", line 4411, in set_index
    raise KeyError("None of {} are in the columns".format(missing))
KeyError: 'None of [None] are in the columns'
Hi @DanielAsarnow, I am converting from cs (cryoSPARC v3.2.0+210831) to star using your most recent pyem (csparc2star.py). Usually this would work. As some users discussed, the cryoSPARC team made some changes to their software (changed the micrograph ID, etc.), so the following old command no longer works.
Traceback (most recent call last):
  File "/data/donghua/pyem/csparc2star.py", line 120, in <module>
    sys.exit(main(parser.parse_args()))
  File "/data/donghua/pyem/csparc2star.py", line 71, in main
    df = star.smart_merge(df, coord_star, fields=fields, key=key)
  File "/data/donghua/pyem/pyem/star.py", line 148, in smart_merge
    s2 = s2.set_index(key, drop=False)
  File "/data/donghua/miniconda3/envs/pyem/lib/python3.7/site-packages/pandas/core/frame.py", line 4411, in set_index
    raise KeyError("None of {} are in the columns".format(missing))
KeyError: 'None of [None] are in the columns'
@DanielAsarnow,
It seemed to work in several steps when I followed the approach you pointed out in another topic (copied below). Thanks!
You would need to convert the cryoSPARC CSV to a .star file first, and then use star.py --copy-micrograph-coordinates original.star fromcsparc.star withcoords.star (with correct file names of course).
@nyhui,
No swapxy option is needed for the first step I think.
As discussed previously, you need to remove the UIDs from the first star file you get before you use star.py. You also need to find the corresponding names (J1, SampelName, Extract/job025) from the first star file and particles.star for the following Step 2.
@nyhui,
See the above: “Since v3.2.0+210629, particle .mrcs that are imported into cryoSPARC are prepended with the UID of their dataset file in order to keep them unique.”
The second step is to remove the UIDs. You may have different names for “J1, SampelName, Extract/job025”. You should be able to find them in the first star file and particles.star.
If you are using an older version of cryoSPARC and there is NO UID in your first star file, the following command (one step) should work:
~/pyem/csparc2star.py cryosparc_P74_J165_004_particles.cs P74_J165_passthrough_particles.cs cryosparc_P74_J165_004_particles.star --copy-micrograph-coordinates particles.star
Hi,
I am facing an issue while running the CTF refinement job. I imported particles from cryoSPARC using copy_micrograph, but I have since done 3D classification on the particles and now want to run CTF refinement. Can someone suggest how I can add a micrograph column to my run_data.star file? It does not contain the whole set of particles that I imported from cryoSPARC, only a subset of those particles.
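One possible approach, in the same spirit as the star.py --copy-micrograph-coordinates step quoted earlier in this thread, is to look up each particle of the subset in the full star file that still has the micrograph column. The sketch below is not pyem code: it assumes each star file has already been read into a list of dictionaries (one per particle), that rlnImageName is identical in both files, and the function name add_micrograph_column is made up for illustration.

```python
def add_micrograph_column(subset_rows, full_rows,
                          key="rlnImageName", field="rlnMicrographName"):
    """Return copies of subset_rows with `field` filled in from full_rows.

    Rows are plain dicts (one per particle). `key` must identify a particle
    in both tables; a KeyError is raised if any subset particle is missing
    from the full table.
    """
    lookup = {row[key]: row[field] for row in full_rows}
    missing = [row[key] for row in subset_rows if row[key] not in lookup]
    if missing:
        raise KeyError("%d particles not found, e.g. %s" % (len(missing), missing[0]))
    # Copy each row so the input list is left untouched.
    return [dict(row, **{field: lookup[row[key]]}) for row in subset_rows]
```

If the image names differ between the two files, for example because of the UID prefix discussed above, they would have to be made consistent before the lookup.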