Import particle stack from Relion to CryoSPARC

wtempel · January 12, 2022, 9:55pm

@zhangrui_wustl Pre-existing particle UIDs can be transferred to re-imported particles by manipulating the pre-export and post-(re-)import .cs files in Python, provided some identifying information has been preserved throughout export from cryoSPARC, processing outside cryoSPARC, and re-import into cryoSPARC.
Let A_particles.cs be the cryoSPARC metadata file that was earlier used for particle export from cryoSPARC and that includes the UIDs you’d like to use going forward. Let B_particles.cs be the result of a recent re-import of particles into cryoSPARC, which includes interesting additional attributes, but also a new, unwanted set of cryoSPARC UIDs. Let A_data and B_data be cryosparc_compute.dataset.Dataset objects derived from A_particles.cs and B_particles.cs, respectively.
If both A_data and B_data include blob/path and blob/idx items, and those items’ values haven’t changed, one can:

convert A_data and B_data to A_df and B_df dataframes, respectively, using the cryosparc_compute.dataset.Dataset.to_dataframe() method
drop the unwanted uid column from B_df
keep just the ['uid', 'blob/path', 'blob/idx'] columns from A_df
“inner” merge A_df and B_df on ['blob/path', 'blob/idx']
create C_data = cryosparc_compute.dataset.Dataset().from_dataframe(merged_df)
write the new Dataset to disk: C_data.to_file("C_particles.cs")
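
The dataframe part of the outline (dropping the uid column, selecting columns, and merging) can be sketched with pandas alone. This is a minimal, untested-against-cryoSPARC illustration: the toy dataframes stand in for A_df and B_df, and the transfer_uids helper, the file paths, and the extra/attribute column are hypothetical names made up for the example. Loading the .cs files and writing the result back still rely on the cryosparc_compute calls named in the steps above.

```python
import pandas as pd


def transfer_uids(a_df: pd.DataFrame, b_df: pd.DataFrame) -> pd.DataFrame:
    """Attach the UIDs from a_df to the rows of b_df (hypothetical helper)."""
    keys = ["blob/path", "blob/idx"]
    # Keep only the original UIDs and the identifying columns from A.
    a_keep = a_df[["uid"] + keys]
    # Drop the new, unwanted UIDs from B; keep all of its other attributes.
    b_keep = b_df.drop(columns=["uid"])
    # validate="one_to_one" raises an error if a (path, idx) pair is
    # duplicated on either side, which would make the UID transfer ambiguous.
    return a_keep.merge(b_keep, on=keys, how="inner", validate="one_to_one")


# Toy stand-ins for A_df and B_df; all values are invented for illustration.
a_df = pd.DataFrame({
    "uid": [101, 102, 103],
    "blob/path": ["J1/stack_a.mrc", "J1/stack_a.mrc", "J1/stack_b.mrc"],
    "blob/idx": [0, 1, 0],
})
b_df = pd.DataFrame({
    "uid": [901, 902],  # new UIDs assigned on re-import, to be discarded
    "blob/path": ["J1/stack_a.mrc", "J1/stack_b.mrc"],
    "blob/idx": [1, 0],
    "extra/attribute": [2, 5],  # hypothetical attribute gained outside cryoSPARC
})

merged_df = transfer_uids(a_df, b_df)
print(merged_df)
```

Because the merge is an inner join, particles present in only one of the two dataframes are silently dropped, so comparing len(merged_df) with len(b_df) is a cheap sanity check.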

Caveats:

The steps above are a motivational outline, not a tested sequence of commands that can be pasted verbatim into a script.
Preservation of blob/path and blob/idx is an optimistic assumption and does not apply to all export/processing/import workflows.
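
One way to test the second caveat before trusting a merge is to count how many re-imported particles find no partner in the original metadata. The sketch below uses only pandas and invented toy data; the changed path in b_df is a made-up example of what an external workflow might do to blob/path.

```python
import pandas as pd

keys = ["blob/path", "blob/idx"]

# Toy stand-ins for A_df and B_df; all values are invented for illustration.
a_df = pd.DataFrame({
    "uid": [101, 102, 103],
    "blob/path": ["J1/stack_a.mrc", "J1/stack_a.mrc", "J1/stack_b.mrc"],
    "blob/idx": [0, 1, 0],
})
b_df = pd.DataFrame({
    "uid": [901, 902],
    # The second path was rewritten during processing (hypothetical case).
    "blob/path": ["J1/stack_a.mrc", "J2/imported/stack_b.mrc"],
    "blob/idx": [1, 0],
})

# A left merge with indicator=True marks each row of B as "both"
# (matched in A) or "left_only" (no matching path/idx pair in A).
check = b_df.merge(a_df[keys].drop_duplicates(), on=keys, how="left", indicator=True)
unmatched = check[check["_merge"] == "left_only"]

print(f"{len(unmatched)} of {len(b_df)} re-imported particles have no match in A")
```

A non-zero count means blob/path or blob/idx was altered somewhere along the way, and a different identifying attribute would be needed for the merge.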