Experimental support for Relion's Bayesian polishing in csparc2star.py

With csparc2star.py you can now run Bayesian Polishing in Relion without having to repeat motion correction. Here are the instructions.

I believe it is working, but it has not been tested extensively. Enjoy!

Hi Daniel,

This is great, thank you for doing this! I’ll update with some feedback on how it is working for us by next week.

Jonathan.

Hi Daniel,

We get the error below when we run this.

It does write all of the trajectories out to separate star files, but it seems to fail when writing corrected_micrographs.star, which never gets written.

Traceback (most recent call last):
  File "/opt/cryoem/pyem/csparc2star.py", line 170, in <module>
    sys.exit(main(parser.parse_args()))
  File "/opt/cryoem/pyem/csparc2star.py", line 71, in main
    star.write_star(mic_star, data_general[[f for f in fields if f in data_general]])
  File "/opt/cryoem/pyem/pyem/star.py", line 546, in write_star
    df_optics = gb[df.columns.intersection(Relion.OPTICSGROUPTABLE)].first().reset_index(drop=False)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1579, in first
    return self._agg_general(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 999, in _agg_general
    return self._cython_agg_general(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1019, in _cython_agg_general
    agg_blocks, agg_items = self._cython_agg_blocks(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1030, in _cython_agg_blocks
    data: BlockManager = self._get_data_to_aggregate()
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1695, in _get_data_to_aggregate
    obj = self._obj_with_exclusions
  File "pandas/_libs/properties.pyx", line 33, in pandas._libs.properties.CachedProperty.__get__
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/base.py", line 204, in _obj_with_exclusions
    return self.obj.reindex(columns=self._selection_list)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/util/_decorators.py", line 309, in wrapper
    return func(*args, **kwargs)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 4031, in reindex
    return super().reindex(**kwargs)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/generic.py", line 4458, in reindex
    return self._reindex_axes(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3871, in _reindex_axes
    frame = frame._reindex_columns(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3916, in _reindex_columns
    return self._reindex_with_indexers(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/generic.py", line 4521, in _reindex_with_indexers
    new_data = new_data.reindex_indexer(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 1276, in reindex_indexer
    self.axes[axis]._can_reindex(indexer)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3285, in _can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis

Ok, we got it to work with mcstar.py

Are you using the latest one? You shouldn’t have to use mcstar.py anymore (and it’s better not to because more complete information is available when the program has the original CS file open). Can you also use --loglevel=debug and post the command? There are additional debug logging statements that will help pinpoint the problem.

Thanks for testing!!

PS I updated the instructions for the latest version when I pushed it last week, but I thought I had made those changes before you would have had time to clone or read them.

Hi Daniel,

I am trying to test this with a Falcon3 dataset collected as .mrc files, which are already gain-corrected. However, I get the error below. Do you know what may be going on? I tried to create a gain reference using relion_estimate_gain but still no luck.

Also, will this still work if micrographs are zipped to .bz2?

Thank you,

Traceback (most recent call last):
  File "/opt/apps/PyEM/20221121/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3803, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 165, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5745, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5753, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnMicrographGainName'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/apps/PyEM/20221121/pyem/csparc2star.py", line 170, in <module>
    sys.exit(main(parser.parse_args()))
  File "/opt/apps/PyEM/20221121/pyem/csparc2star.py", line 59, in main
    data_general = metadata.cryosparc_2_cs_movie_parameters(cs, passthrough=pt, trajdir=trajdir, path=args.micrograph_path)
  File "/opt/apps/PyEM/20221121/pyem/pyem/metadata/cryosparc2.py", line 237, in cryosparc_2_cs_movie_parameters
    data_general[star.Relion.MICROGRAPHGAIN_NAME] = data_general[star.Relion.MICROGRAPHGAIN_NAME].apply(
  File "/opt/apps/PyEM/20221121/lib/python3.9/site-packages/pandas/core/frame.py", line 3804, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/opt/apps/PyEM/20221121/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    raise KeyError(key) from err
KeyError: 'rlnMicrographGainName'

Hi Daniel,

We cloned it on Nov 23 so it should have all the latest changes I think.

With debugging on I get many lines printed for each metadata star file. The last few lines are the ones I think you would want:

S12/motioncorrected/stack_07738_X-1Y+1-3_traj.npy: 0-45, (45 x 2)

Writing MovieExport/Movies/stack_07738_X-1Y+1-3_patch_aligned_doseweighted.star
Traceback (most recent call last):
  File "/opt/cryoem/pyem/csparc2star.py", line 170, in <module>
    sys.exit(main(parser.parse_args()))
  File "/opt/cryoem/pyem/csparc2star.py", line 71, in main
    star.write_star(mic_star, data_general[[f for f in fields if f in data_general]])
  File "/opt/cryoem/pyem/pyem/star.py", line 546, in write_star
    df_optics = gb[df.columns.intersection(Relion.OPTICSGROUPTABLE)].first().reset_index(drop=False)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1579, in first
    return self._agg_general(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 999, in _agg_general
    return self._cython_agg_general(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1019, in _cython_agg_general
    agg_blocks, agg_items = self._cython_agg_blocks(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1030, in _cython_agg_blocks
    data: BlockManager = self._get_data_to_aggregate()
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1695, in _get_data_to_aggregate
    obj = self._obj_with_exclusions
  File "pandas/_libs/properties.pyx", line 33, in pandas._libs.properties.CachedProperty.__get__
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/base.py", line 204, in _obj_with_exclusions
    return self.obj.reindex(columns=self._selection_list)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/util/_decorators.py", line 309, in wrapper
    return func(*args, **kwargs)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 4031, in reindex
    return super().reindex(**kwargs)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/generic.py", line 4458, in reindex
    return self._reindex_axes(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3871, in _reindex_axes
    frame = frame._reindex_columns(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3916, in _reindex_columns
    return self._reindex_with_indexers(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/generic.py", line 4521, in _reindex_with_indexers
    new_data = new_data.reindex_indexer(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 1276, in reindex_indexer
    self.axes[axis]._can_reindex(indexer)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3285, in _can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis

Sounds extremely useful so thank you!

Running into this error trying to run csparc2star on Pxx_Jxx_passthrough_micrographs.cs and micrographs_rigid_aligned.cs:

Traceback (most recent call last):
  File "/xxx/software/pyem/csparc2star.py", line 170, in <module>
    sys.exit(main(parser.parse_args()))
  File "/xxx/software/pyem/csparc2star.py", line 62, in main
    for mic in metadata.cryosparc_2_cs_motion_parameters(cs, data_general, trajdir=trajdir):
  File "/xxx/software/pyem/pyem/metadata/cryosparc2.py", line 248, in cryosparc_2_cs_motion_parameters
    trajfile = cs['rigid_motion/path'][i].decode('UTF-8')
ValueError: no field of name rigid_motion/path

Perhaps I am not using the correct inputs but these seem to be the only .cs files in the cryoSPARC motion corr directory

Thank you all again for testing!

@joonpark - my program is only going to look at metadata, not the actual data, so it doesn’t matter what format the micrographs or movies are in. When you actually run polishing the paths will have to be correct and the data in a Relion-compatible format though. I’m adding some code now so that there won’t be an error (just a warning) if there is no gain reference available.
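To illustrate the "warning instead of an error" behaviour described above, here is a minimal sketch. It is not the actual pyem code: `apply_to_gain_column` and the logger name are hypothetical, and only the column label `rlnMicrographGainName` comes from the KeyError in the traceback.

```python
import logging

import pandas as pd

log = logging.getLogger("gain_sketch")  # hypothetical logger name

GAIN_COLUMN = "rlnMicrographGainName"  # label taken from the KeyError above


def apply_to_gain_column(df, func):
    """Apply func to the gain reference column, or warn if it is absent.

    Hypothetical helper: it only shows how a missing column can be
    downgraded from a KeyError to a warning.
    """
    if GAIN_COLUMN not in df.columns:
        log.warning("No gain reference found; skipping %s", GAIN_COLUMN)
        return df
    out = df.copy()
    out[GAIN_COLUMN] = out[GAIN_COLUMN].apply(func)
    return out
```

With a guard like this, already gain-corrected movies (which have no gain reference) would convert with a logged warning rather than stopping.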

@jcoleman - sorry for the log spam, you can also use --loglevel=info which is less verbose but prints more than the default (warning). The exception “cannot reindex from a duplicate axis” is surprising to me, would you mind sending me your .cs file?
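For anyone who wants to investigate the "cannot reindex from a duplicate axis" error themselves: the traceback fails inside `reindex(columns=...)`, which suggests (but does not prove) that the table being written has a repeated column label. The sketch below uses made-up data and a hypothetical helper, `duplicated_columns`, to show how such repeats can be found and dropped in pandas.

```python
import pandas as pd


def duplicated_columns(df):
    """Return the column labels that occur more than once in df."""
    return sorted(set(df.columns[df.columns.duplicated()]))


# Made-up example: two tables that both carry rlnVoltage, joined side by side.
left = pd.DataFrame({"rlnMicrographName": ["a.mrc"], "rlnVoltage": [300.0]})
right = pd.DataFrame({"rlnVoltage": [300.0], "rlnOpticsGroup": [1]})
merged = pd.concat([left, right], axis=1)

dupes = duplicated_columns(merged)

# Keep only the first occurrence of each column label.
deduped = merged.loc[:, ~merged.columns.duplicated()]
```

Whether this is the real cause here can only be confirmed from the actual .cs file.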

@maxm - The rigid motion should be in the micrographs_rigid_aligned.cs file, can you try swapping the order of the .cs files in the command? This shouldn’t be necessary but there may be a bug with merging multiple .cs files down the --movies code path.

Hi Daniel,

Thanks for that! Good news is that switching the input order works for finding the rigid motion, bad news is there’s now a new error regarding I think the dose information:

Traceback (most recent call last):
  File "/xxx/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnImageSizeZ'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/xxx/pyem/csparc2star.py", line 170, in <module>
    sys.exit(main(parser.parse_args()))
  File "/xxx/pyem/csparc2star.py", line 59, in main
    data_general = metadata.cryosparc_2_cs_movie_parameters(cs, passthrough=pt, trajdir=trajdir, path=args.micrograph_path)
  File "/xxx/pyem/pyem/metadata/cryosparc2.py", line 222, in cryosparc_2_cs_movie_parameters
    data_general[star.Relion.MICROGRAPHDOSERATE] /= data_general[star.Relion.IMAGESIZEZ]
  File "/xxx/software/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/xxx/software/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'rlnImageSizeZ'

Thanks!

Note: Other members of the lab have encountered the same issue

Just a note for everyone helping with testing: can you please use --loglevel=info and post the command line, the cryoSPARC job type, and whether Patch Motion was run live or offline? Thanks!

@maxm the missing value is the Z size of the movie, which is the number of frames. That should be there if the movie size was in the .cs files passed - you might need another .cs file (“passthrough” but maybe not literally a passthrough_*.cs file) that would have the raw movie information. The code should be more robust around this and give an earlier error.
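A quick way to see which .cs file carries the raw movie information is to list the field names. This is a rough sketch, assuming the .cs files load with `numpy.load` as structured arrays; `missing_fields` and the `REQUIRED` tuple are hypothetical and only cover the two fields named in this thread.

```python
import numpy as np

# Fields mentioned in this thread; the real converter may need others.
REQUIRED = ("movie_blob/shape", "rigid_motion/path")


def missing_fields(arrays, required=REQUIRED):
    """Return the required field names that none of the given arrays contain."""
    available = set()
    for arr in arrays:
        available.update(arr.dtype.names or ())
    return [name for name in required if name not in available]


# Stand-ins for what np.load("<file>.cs") would return, one row each.
motion = np.zeros(1, dtype=[("uid", "<u8"), ("rigid_motion/path", "O")])
movies = np.zeros(1, dtype=[("uid", "<u8"), ("movie_blob/shape", "<u4", (3,))])

print(missing_fields([motion]))          # the motion table alone
print(missing_fields([motion, movies]))  # with the movie table added
```

If the first call reports `movie_blob/shape` as missing, another .cs file with the movie information is needed on the command line.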

Another note: in the instructions on my wiki I put --inverty for the particle conversion step because I was sure that was right for patch motion → outside motion micrographs but in my test data that was wrong and it worked correctly w/out the argument. So I’m not sure when we’ll need it, you all should double check with disparticle.py or by polishing a subset and doing 2D classification.

(Also worth noting Takanori says you should always do a fresh 2DCA run after polishing due to reincorporation of particles with masked hot pixels).

Hi Daniel,

Sure I can send you those files via PM. Our patch motion job came from a live session. If I run with loglevel=info this is the output:

/opt/cryoem/pyem/csparc2star.py --loglevel=info --movies P96_S12_accepted_live_exposures.cs P96_J2951_passthrough_micrographs.cs s12_movies/corrected_micrographs.star s12_movies/motioncorrected/
Writing per-movie star files into s12_movies/motioncorrected/
Creating movie data_general tables
Copying movie size
Reading movie trajectory files
Traceback (most recent call last):
  File "/opt/cryoem/pyem/csparc2star.py", line 170, in <module>
    sys.exit(main(parser.parse_args()))
  File "/opt/cryoem/pyem/csparc2star.py", line 71, in main
    star.write_star(mic_star, data_general[[f for f in fields if f in data_general]])
  File "/opt/cryoem/pyem/pyem/star.py", line 546, in write_star
    df_optics = gb[df.columns.intersection(Relion.OPTICSGROUPTABLE)].first().reset_index(drop=False)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1579, in first
    return self._agg_general(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 999, in _agg_general
    return self._cython_agg_general(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1019, in _cython_agg_general
    agg_blocks, agg_items = self._cython_agg_blocks(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1030, in _cython_agg_blocks
    data: BlockManager = self._get_data_to_aggregate()
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 1695, in _get_data_to_aggregate
    obj = self._obj_with_exclusions
  File "pandas/_libs/properties.pyx", line 33, in pandas._libs.properties.CachedProperty.__get__
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/base.py", line 204, in _obj_with_exclusions
    return self.obj.reindex(columns=self._selection_list)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/util/_decorators.py", line 309, in wrapper
    return func(*args, **kwargs)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 4031, in reindex
    return super().reindex(**kwargs)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/generic.py", line 4458, in reindex
    return self._reindex_axes(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3871, in _reindex_axes
    frame = frame._reindex_columns(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3916, in _reindex_columns
    return self._reindex_with_indexers(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/generic.py", line 4521, in _reindex_with_indexers
    new_data = new_data.reindex_indexer(
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 1276, in reindex_indexer
    self.axes[axis]._can_reindex(indexer)
  File "/opt/local/miniconda2/envs/pyem/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3285, in _can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis

Hi Daniel,

I’ve been looking at all the available .cs files in the MotionCorr and movie import jobs with the cryosparc icli to see the available fields. I’m not sure any of these contain the Z size unless this is stored in ‘shape’.

Either way I attach these in case you spot something I don’t…

From the motion corr job:

micrographs_rigid_aligned.cs

                                     uid   <u8
                    micrograph_blob/path    |O
                     micrograph_blob/idx   <u4
                   micrograph_blob/shape   <u4       (2,)
                 micrograph_blob/psize_A   <f4
                  micrograph_blob/format    |O
micrograph_blob/is_background_subtracted   <u4
                    micrograph_blob/vmin   <f4
                    micrograph_blob/vmax   <f4
              micrograph_blob/import_sig   <u8
             micrograph_blob_non_dw/path    |O
              micrograph_blob_non_dw/idx   <u4
            micrograph_blob_non_dw/shape   <u4       (2,)
          micrograph_blob_non_dw/psize_A   <f4
           micrograph_blob_non_dw/format    |O
micrograph_blob_non_dw/is_background_subtracted   <u4
             micrograph_blob_non_dw/vmin   <f4
             micrograph_blob_non_dw/vmax   <f4
       micrograph_blob_non_dw/import_sig   <u8
       micrograph_thumbnail_blob_1x/path    |O
        micrograph_thumbnail_blob_1x/idx   <u4
      micrograph_thumbnail_blob_1x/shape   <u4       (2,)
     micrograph_thumbnail_blob_1x/format    |O
  micrograph_thumbnail_blob_1x/binfactor   <u4
micrograph_thumbnail_blob_1x/micrograph_path    |O
       micrograph_thumbnail_blob_1x/vmin   <f4
       micrograph_thumbnail_blob_1x/vmax   <f4
       micrograph_thumbnail_blob_2x/path    |O
        micrograph_thumbnail_blob_2x/idx   <u4
      micrograph_thumbnail_blob_2x/shape   <u4       (2,)
     micrograph_thumbnail_blob_2x/format    |O
  micrograph_thumbnail_blob_2x/binfactor   <u4
micrograph_thumbnail_blob_2x/micrograph_path    |O
       micrograph_thumbnail_blob_2x/vmin   <f4
       micrograph_thumbnail_blob_2x/vmax   <f4
                    background_blob/path    |O
                     background_blob/idx   <u4
               background_blob/binfactor   <u4
                   background_blob/shape   <u4       (2,)
                 background_blob/psize_A   <f4
                       rigid_motion/type    |O
                       rigid_motion/path    |O
                        rigid_motion/idx   <u4
                rigid_motion/frame_start   <u4
                  rigid_motion/frame_end   <u4
           rigid_motion/zero_shift_frame   <u4
                    rigid_motion/psize_A   <f4
                      spline_motion/type    |O
                      spline_motion/path    |O
                       spline_motion/idx   <u4
               spline_motion/frame_start   <u4
                 spline_motion/frame_end   <u4
          spline_motion/zero_shift_frame   <u4
                   spline_motion/psize_A   <f4

passthrough_micrographs.cs

                                     uid   <u8
                         movie_blob/path    |O
                        movie_blob/shape   <u4       (3,)
                      movie_blob/psize_A   <f4
            movie_blob/is_gain_corrected   <u4
                       movie_blob/format    |O
              movie_blob/has_defect_file   <u4
                   movie_blob/import_sig   <u8
                      gain_ref_blob/path    |O
                       gain_ref_blob/idx   <u4
                     gain_ref_blob/shape   <u4       (2,)
                    gain_ref_blob/flip_x   <u4
                    gain_ref_blob/flip_y   <u4
                gain_ref_blob/rotate_num   <u4
                  mscope_params/accel_kv   <f4
                     mscope_params/cs_mm   <f4
       mscope_params/total_dose_e_per_A2   <f4
               mscope_params/phase_plate   <u4
                 mscope_params/neg_stain   <u4
              mscope_params/exp_group_id   <u4
               mscope_params/defect_path    |O

From the movies import job, imported_movies.cs

                                     uid   <u8
                         movie_blob/path    |O
                        movie_blob/shape   <u4       (3,)
                      movie_blob/psize_A   <f4
            movie_blob/is_gain_corrected   <u4
                       movie_blob/format    |O
              movie_blob/has_defect_file   <u4
                   movie_blob/import_sig   <u8
                      gain_ref_blob/path    |O
                       gain_ref_blob/idx   <u4
                     gain_ref_blob/shape   <u4       (2,)
                    gain_ref_blob/flip_x   <u4
                    gain_ref_blob/flip_y   <u4
                gain_ref_blob/rotate_num   <u4
                  mscope_params/accel_kv   <f4
                     mscope_params/cs_mm   <f4
       mscope_params/total_dose_e_per_A2   <f4
               mscope_params/phase_plate   <u4
                 mscope_params/neg_stain   <u4
              mscope_params/exp_group_id   <u4
               mscope_params/defect_path    |O

Our Motion Corr job was run offline and the output with --loglevel=info is:

Writing per-movie star files into MovieExport/Movies/
Creating movie data_general tables
Copying micrograph size
Traceback (most recent call last):
  File "/xxx/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnImageSizeZ'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/xxx/software/pyem/csparc2star.py", line 170, in <module>
    sys.exit(main(parser.parse_args()))
  File "/xxx/software/pyem/csparc2star.py", line 59, in main
    data_general = metadata.cryosparc_2_cs_movie_parameters(cs, passthrough=pt, trajdir=trajdir, path=args.micrograph_path)
  File "/xxx/software/pyem/pyem/metadata/cryosparc2.py", line 222, in cryosparc_2_cs_movie_parameters
    data_general[star.Relion.MICROGRAPHDOSERATE] /= data_general[star.Relion.IMAGESIZEZ]
  File "/xxx/software/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/xxx/software/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'rlnImageSizeZ'

I’m guessing the Z size is something cryoSPARC should automatically read from the input movies? Perhaps there is a way for me to modify the .cs file to include an extra field if this is missing? Or is it stored in ‘shape’?

Thanks so much for your help on all of this!

Max

@maxm that’s right, it comes from ‘movie_blob/shape’. You get the error even if the passthrough file is first on the csparc2star.py command?
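As a rough illustration of how the frame count and a per-frame dose could be derived from those fields: the sketch below uses a synthetic table, and it assumes the first entry of `movie_blob/shape` is the number of frames (frames, Y, X). The 45-frame count matches the trajectory log earlier in the thread; the total dose of 45 is made up. The division mirrors the `MICROGRAPHDOSERATE /= IMAGESIZEZ` line in the traceback.

```python
import numpy as np

# Synthetic stand-in for a loaded .cs table: two 45-frame movies.
cs = np.zeros(2, dtype=[("uid", "<u8"),
                        ("movie_blob/shape", "<u4", (3,)),
                        ("mscope_params/total_dose_e_per_A2", "<f4")])
cs["movie_blob/shape"] = (45, 4096, 4096)       # assumed order: frames, Y, X
cs["mscope_params/total_dose_e_per_A2"] = 45.0  # made-up total dose

# Z size of the movie, i.e. the number of frames (rlnImageSizeZ).
n_frames = cs["movie_blob/shape"][:, 0]

# Total dose divided by frame count gives the dose per frame.
dose_per_frame = cs["mscope_params/total_dose_e_per_A2"] / n_frames
```

If the axis order in a real file differs from this assumption, the index into `movie_blob/shape` has to change accordingly.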

@maxm I just made a commit that should let it get rlnImageSizeZ from the passthrough OR the main .cs file regardless of the order.

Hi Daniel,

Some progress! I just reinstalled and attempted the conversion with both potential orders.

The error persists when Passthrough is second:

Writing per-movie star files into MovieExport/Movies/
Creating movie data_general tables
Copying micrograph size
Copying micrograph size
Traceback (most recent call last):
  File "xxx/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rlnImageSizeZ'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/xxx/pyem/csparc2star.py", line 177, in <module>
    sys.exit(main(parser.parse_args()))
  File "/xxx/pyem/csparc2star.py", line 59, in main
    data_general = metadata.cryosparc_2_cs_movie_parameters(cs, passthrough=pt, trajdir=trajdir, path=args.micrograph_path)
  File "/xxx/pyem/pyem/metadata/cryosparc2.py", line 223, in cryosparc_2_cs_movie_parameters
    data_general[star.Relion.MICROGRAPHDOSERATE] /= data_general[star.Relion.IMAGESIZEZ]
  File "/xxx/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "xxx/miniconda3/envs/pyem/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'rlnImageSizeZ'

However, with passthrough as the first argument there is a new error message…

Writing per-movie star files into MovieExport/Movies/
Creating movie data_general tables
Copying movie size
Copying movie size
Reading movie trajectory files
Traceback (most recent call last):
  File "/xxx/csparc2star.py", line 177, in <module>
    sys.exit(main(parser.parse_args()))
  File "/xxx/pyem/csparc2star.py", line 62, in main
    for mic in metadata.cryosparc_2_cs_motion_parameters(cs, data_general, trajdir=trajdir):
  File "/xxx/pyem/pyem/metadata/cryosparc2.py", line 249, in cryosparc_2_cs_motion_parameters
    trajfile = cs['rigid_motion/path'][i].decode('UTF-8')
ValueError: no field of name rigid_motion/path

Thanks again,
Max

Hi @DanielAsarnow,

pyem version: git cloned on 3 Dec.

Use case: csparc2star.py conversion of particles extracted from micrographs imported without connected source movies (i.e. motion-correction performed outside of cryoSPARC).

Error message:

Traceback (most recent call last):
  File "/lmb/home/ylee/software/pyem/csparc2star.py", line 177, in <module>
    sys.exit(main(parser.parse_args()))
  File "/lmb/home/ylee/software/pyem/csparc2star.py", line 93, in main
    if args.flipy and not args.inverty:
AttributeError: 'Namespace' object has no attribute 'flipy'

Temporary workaround: Commenting out lines 93-97.

Cheers,
Yang

I fixed the gain reference issue, added unlimited passthroughs / extra .cs files, and fixed the --flipy argument. Hopefully that fixes exporting the files for everyone - just give the main .cs file first and then all the other .cs files you want until all the required fields are found.
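The "main file first, then extras until every field is found" idea can be sketched roughly as follows. This is not the pyem implementation: `collect_fields` is a hypothetical helper, the tables are synthetic, and it assumes every table has a `uid` field that identifies the same exposures.

```python
import numpy as np


def collect_fields(main, extras, required):
    """Find each required field in main first, then in extras, aligned by uid."""
    columns = {}
    main_uids = main["uid"].tolist()
    for name in required:
        for table in (main, *extras):
            if name in (table.dtype.names or ()):
                # Map uid -> row so rows follow the order of the main table.
                order = {uid: i for i, uid in enumerate(table["uid"].tolist())}
                rows = [order[uid] for uid in main_uids]
                columns[name] = table[name][rows]
                break
        else:
            raise KeyError(f"{name} not found in any .cs table")
    return columns


main = np.zeros(2, dtype=[("uid", "<u8"), ("rigid_motion/path", "O")])
main["uid"] = [1, 2]
extra = np.zeros(2, dtype=[("uid", "<u8"), ("movie_blob/shape", "<u4", (3,))])
extra["uid"] = [2, 1]  # deliberately in a different order
extra["movie_blob/shape"] = [(45, 10, 10), (30, 10, 10)]

cols = collect_fields(main, [extra], ["rigid_motion/path", "movie_blob/shape"])
```

Aligning by `uid` rather than by row position is what makes the order of the extra files irrelevant in this sketch.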

I also got my converted files to actually yield good Polishing output. The key was in this thread. We know that Relion is going to see our particles Y-flipped from how cryoSPARC saw them when we process the movies again. That means that the new extracted particle image will have a different shift and orientation than the one in the converted particles file. The transformation that corrects the particles to match the movies as Relion sees them is the diag(1,-1,-1) rotation described in the thread, plus multiplying Y shift by -1.

The map for polishing can then be made by relion_reconstruct, or the cryoSPARC map can be simply flipped itself (first on Z, then on Y).
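Flipping the map "first on Z, then on Y" can be sketched with NumPy as below. The array axes are assumed to be ordered (z, y, x), as is usual for MRC data read into NumPy; `flip_z_then_y` is a hypothetical helper and the volume is a tiny synthetic one. Reading and writing the actual MRC file is left out.

```python
import numpy as np


def flip_z_then_y(volume):
    """Flip a 3D map along Z, then along Y (axes assumed to be z, y, x)."""
    return np.flip(np.flip(volume, axis=0), axis=1)


# Tiny synthetic volume so the effect is easy to inspect.
vol = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
flipped = flip_z_then_y(vol)
```

Flipping both Z and Y while leaving X alone is a 180 degree rotation about X, so the handedness of the map is preserved.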

--flipy applies rlnOriginYAngst = -rlnOriginYAngst and transforms by diag(1,-1,-1)
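In matrix form, the transformation described above might look like the sketch below. `flipy_pose` is a hypothetical helper, not the pyem code. Left-multiplying the particle rotation by diag(1,-1,-1) is an assumption: which side is correct depends on the rotation convention, so check it against real data.

```python
import numpy as np

FLIP = np.diag([1.0, -1.0, -1.0])  # 180 degree rotation about X, det = +1


def flipy_pose(rotation, origin_y_angst):
    """Apply the diag(1,-1,-1) rotation and negate the Y shift.

    Left-multiplying by FLIP is an assumption of this sketch; the side
    depends on the rotation convention in use.
    """
    return FLIP @ rotation, -origin_y_angst


new_rotation, new_origin_y = flipy_pose(np.eye(3), 2.5)
```

Because diag(1,-1,-1) has determinant +1 it is a proper rotation, not a mirror, and applying the transformation twice returns the original pose.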

Under normal circumstances, where you run Patch Motion in cryoSPARC, exporting particles with --flipy and NOT --inverty gives correct Polishing for now. However, the current convention for --inverty is an unfortunate choice, and I would like to swap its meaning. The reason is that right now, to get the same numerical coordinate in and out of cryoSPARC across repeated export/import cycles, you need --inverty, but that should be the default. Instead, --inverty should be for when you really need to flip the particles because you will extract using Relion.

But I don’t want to break old scripts that people may have…I will update the instructions on the github to reflect these points soon.

There is also an unexpected result caused by different patterns of local motion correction - this is important to understand for testing if things are working.

If you repeat motion correction via Relion/Motioncor2 and then run Relion Extract with the exported particles with --flipy, these particles will be bad, especially with recentering. It seems that different patterns of local motion correction scramble the aligned shifts. Thus running Extract to test the particles before Polishing won’t work. You have to just --flipy and then go directly to the movies using the original flipped map or a new reconstruction using the original particle images (but these transformed poses).
