PyEM-csparc2star.py

Hello,

I keep getting this error when trying to use csparc2star.py:

VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
df[model[k]] = pd.DataFrame(np.array(
Columns must be same length as key
Traceback (most recent call last):
File "/home/xx/data/pyem/csparc2star.py", line 42, in main
df = metadata.parse_cryosparc_2_cs(cs, passthroughs=args.input[1:], minphic=args.minphic,
File "/data/xx/pyem/pyem/metadata.py", line 403, in parse_cryosparc_2_cs
df = cryosparc_2_cs_model_parameters(cs, df, minphic=minphic)
File "/data/xx/pyem/pyem/metadata.py", line 334, in cryosparc_2_cs_model_parameters
df[model[k]] = pd.DataFrame(np.array(
File "/xx/anaconda3/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3160, in __setitem__
self._setitem_array(key, value)
File "/xx/anaconda3/envs/pyem/lib/python3.8/site-packages/pandas/core/frame.py", line 3189, in _setitem_array
raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
Required fields could not be mapped. Are you using the right input file(s)?

When I initially tested the script it worked, but it suddenly stopped working with new jobs.

I’m doing:

  1. Export the job in cryoSPARC
  2. Use the full path of the exported .cs file as input for csparc2star.py

Thanks in advance.

Hi LTP,
Make sure there are no unexpected passthrough items. You can check this in the Outputs tab of the CryoSPARC job you are using as input for pyem. I ran into this problem after a multi-class ab-initio job, because the alignments for all classes were being passed through for each individual class. Running the script without the passthrough information should work, but depending on your next step the resulting star file may not contain all the information you need.
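If it helps, here is a small sketch for checking what is in a .cs file before feeding it to csparc2star.py. A .cs file is just a NumPy structured array saved to disk, so you can list its field names directly (the example path in the comment is hypothetical):

```python
import numpy as np

def list_cs_fields(path):
    """Return the field names of a CryoSPARC .cs file (a saved structured array)."""
    cs = np.load(path)
    return list(cs.dtype.names)

# Usage (file name is just an example):
# for name in list_cs_fields("cryosparc_P1_J42_particles.cs"):
#     print(name)
```

If you see several blocks of fields like alignments for multiple classes, that is the kind of unexpected passthrough content that can trip up the script.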
I ended up re-running cryoSPARC jobs to get rid of the extra passthrough entries (not ideal), but I suspect that if you are comfortable with the Python package NumPy you could delete the offending columns manually.
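A rough sketch of that manual cleanup, assuming the extra columns share a recognizable prefix (the "alignments_class_" prefix and the file names in the comments are assumptions, not something pyem or CryoSPARC guarantees):

```python
# Remove unwanted passthrough columns from a CryoSPARC .cs file,
# which is a NumPy structured array saved with np.save.
import numpy as np
from numpy.lib import recfunctions as rfn

# Stand-in for: cs = np.load("particles_passthrough.cs")
cs = np.zeros(3, dtype=[("uid", "<u8"),
                        ("alignments_class_0/pose", "<f4"),
                        ("alignments_class_1/pose", "<f4")])

# Identify and drop the per-class fields that should not be there
offending = [n for n in cs.dtype.names if n.startswith("alignments_class_1")]
cleaned = rfn.drop_fields(cs, offending, usemask=False)
print(cleaned.dtype.names)  # ('uid', 'alignments_class_0/pose')

# Save back in the same format:
# with open("particles_passthrough_clean.cs", "wb") as f:
#     np.save(f, cleaned)
```

Back up the original .cs file first, since a bad edit will only fail later, inside csparc2star.py.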