Csparc2star: Re-extraction problem

SaifH · August 4, 2023, 7:30am

Hello Colleagues,

We used csparc2star.py to export particles and a homogeneously refined volume to RELION for 3D classification. After a successful 3D classification in RELION, we imported the particles from each class into cryoSPARC as separate sets. Each set behaves well in 2D classification, ab initio reconstruction, and homogeneous refinement, but only at the down-sampled box size. If we exchange the blob for the one from the original particle set (from before the export to RELION), the reconstruction either fails or is of very poor quality. Likewise, if we re-extract at a larger box size using the re-imported particles, 2D classification no longer yields any meaningful classes. We would sincerely appreciate any insights and advice on how to address this issue. Thank you.

kookjookeem · August 4, 2023, 5:22pm

Hello @SaifH,

Imported particles are treated as a brand-new set of particles (i.e. their UIDs differ from those of the original particles), so exchanging the lower-level blob input for the one from the original particles won’t work. I assume the particle coordinate info isn’t lost, since you mentioned you could re-extract using the imported particle locations. A particle coordinate mismatch would be the usual suspect in this situation. Do the re-extracted particles actually look like your particles in the preview? Have you tried the --inverty flag in csparc2star.py? You can also try flipping the micrographs in the Y axis when you re-extract.
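
As a rough illustration of what a Y-axis inversion does to a particle location, here is a minimal sketch. It assumes coordinates are stored as fractions of the micrograph size (as in cryoSPARC's location/center_x_frac and location/center_y_frac fields) and that the micrograph shape is given as (rows, columns); the function names are hypothetical and this is not the code csparc2star.py itself runs.

```python
def flip_y_frac(center_y_frac):
    """Invert a fractional Y coordinate (0 = one edge, 1 = the opposite edge)."""
    return 1.0 - center_y_frac


def frac_to_pixels(center_x_frac, center_y_frac, mic_shape):
    """Convert fractional coordinates to pixels.

    mic_shape is assumed to be (rows, columns), i.e. (ny, nx).
    """
    ny, nx = mic_shape
    return center_x_frac * nx, center_y_frac * ny


# Hypothetical particle on a 4092 x 5760 (rows x columns) micrograph.
x_px, y_px = frac_to_pixels(0.5, flip_y_frac(0.75), (4092, 5760))
print(x_px, y_px)
```

Because the flip is applied to fractional coordinates, it does not depend on the box size or on any downsampling applied during extraction.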

Best,
Kookjoo

SaifH · August 4, 2023, 5:58pm

Hello @kookjookeem,

Thanks for your reply. Unfortunately, we are not able to re-extract any meaningful particles in cryoSPARC using the particles imported from RELION as input. Although the extract job runs to completion with these particles as input, we run into two problems:

Problem 1: We get the following error when we try to inspect particle picks:

[CPU: 244.6 MB Avail: 45.33 GB]
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "/opt/cryosparc/cryosparc_master/cryosparc_compute/jobs/interactive/run_inspect_picks_v2.py", line 93, in run
    mic_med_nccs = [ n.median(mic_uid_to_particle_dset[mic['uid']]['pick_stats/ncc_score']) for mic in mics_with_particles ]
  File "/opt/cryosparc/cryosparc_master/cryosparc_compute/jobs/interactive/run_inspect_picks_v2.py", line 93, in <listcomp>
    mic_med_nccs = [ n.median(mic_uid_to_particle_dset[mic['uid']]['pick_stats/ncc_score']) for mic in mics_with_particles ]
  File "/opt/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 683, in __getitem__
    return Column(get_data_field(self._data, key), self._data)
  File "/opt/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dtype.py", line 113, in get_data_field
    return makefield(field, get_data_field_dtype(data, field))
  File "/opt/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dtype.py", line 119, in get_data_field_dtype
    raise KeyError(f"Unknown dataset field {field} or field type {t}")
KeyError: 'Unknown dataset field pick_stats/ncc_score or field type 0'

Problem 2: We see the following 2D classes if we perform 2D classification.

Hence, I am not sure whether we retain meaningful information when re-extracting the particles. To reiterate, we run into these problems only when we re-extract particles in cryoSPARC, which we have to do because the RELION 3D classification was performed on a heavily downsampled dataset exported from cryoSPARC.

Thanks again and looking forward to your advice.

kookjookeem · August 4, 2023, 10:27pm

Regarding problem 1: the job is complaining that the imported particles carry no pick power/correlation scores (the pick_stats/ncc_score field is missing). You can use the Manual Picker to bypass this and view the particle locations.
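
To see ahead of time whether a particle set would trigger this KeyError, one could compare its field names against the fields the job reads. The sketch below is a hypothetical helper working on a plain list of field names; the field list itself might be obtained with cryosparc-tools (for example via Dataset.load(...).fields(), which is an assumption here and is not called in the code). Only pick_stats/ncc_score is confirmed by the traceback; pick_stats/power is assumed to be its companion score.

```python
# Only "pick_stats/ncc_score" appears in the error message above;
# "pick_stats/power" is an assumed companion field.
PICK_STAT_FIELDS = ("pick_stats/ncc_score", "pick_stats/power")


def missing_fields(available, required=PICK_STAT_FIELDS):
    """Return the required field names that are absent from `available`, in order."""
    present = set(available)
    return [name for name in required if name not in present]


# Hypothetical field list for a particle set imported from a .star file.
imported_fields = [
    "uid",
    "blob/path",
    "location/center_x_frac",
    "location/center_y_frac",
]
print(missing_fields(imported_fields))
```

A non-empty result would suggest that a job relying on pick statistics cannot run on that particle set, while the locations themselves may still be usable.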

I would first check your particle locations in the Manual Picker to confirm that you are extracting particles at the correct x/y coordinates.
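
Besides inspecting the picks by eye, a numerical cross-check is possible if the original coordinates for a micrograph are still available. The following is a minimal sketch, assuming both sets are given as lists of fractional (x, y) pairs for the same micrograph; the transform names, the tolerance, and the example values are all hypothetical.

```python
import math

# Candidate relationships between imported and original coordinates.
TRANSFORMS = {
    "identity": lambda x, y: (x, y),
    "flip_y": lambda x, y: (x, 1.0 - y),
    "flip_x": lambda x, y: (1.0 - x, y),
}


def match_fraction(original, imported, transform, tol=0.005):
    """Fraction of imported points that land within `tol` of an original point."""
    if not imported:
        return 0.0
    hits = 0
    for x, y in imported:
        tx, ty = transform(x, y)
        if any(math.hypot(tx - ox, ty - oy) <= tol for ox, oy in original):
            hits += 1
    return hits / len(imported)


def best_transform(original, imported, tol=0.005):
    """Score every candidate transform and return (best_name, all_scores)."""
    scores = {
        name: match_fraction(original, imported, fn, tol)
        for name, fn in TRANSFORMS.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical coordinates: the imported set is the original mirrored in Y.
original = [(0.1, 0.2), (0.6, 0.7), (0.3, 0.9)]
imported = [(0.1, 0.8), (0.6, 0.3), (0.3, 0.1)]
name, scores = best_transform(original, imported)
print(name, scores)
```

If "identity" scores close to 1, the coordinates survived the round trip; a high score for one of the flips would point to an axis convention mismatch, and low scores everywhere would suggest a different problem.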