Hi, I did 3D classification of a CryoSPARC-created particle set in Relion3 and want to import the selected subset of particles back into CS for further processing. Although the particle stacks referenced in the .star file point to the original ones created by CS (in the JXXX folder), CS treats the imported particle stack as an independent one. There are two disadvantages: (1) in subsequent processing, CS will make another copy of the original particle stack in /scratch; (2) the imported stack loses some useful information, such as the “scale” that each particle had before entering Relion3.
I wonder if I can use the Particle Sets Tool (intersect) in CS to re-connect the imported particle stack to the original one. This should be possible because they point to the same .mrcs particle files.
@zhangrui_wustl Pre-existing particle UIDs can be transferred to re-imported particles by manipulating the pre-export and post-(re-)import .cs files in Python, provided some identifying information has been preserved throughout export from cryoSPARC, processing outside cryoSPARC, and re-importation into cryoSPARC.
Let A_particles.cs be the cryoSPARC metadata file that was earlier used for particle export from cryoSPARC and that includes the UIDs you’d like to use going forward. Let B_particles.cs be the result of a recent re-importation of particles into cryoSPARC, which includes interesting additional attributes, but also a new, unwanted set of cryoSPARC UIDs. Let A_data and B_data be cryosparc_compute.dataset.Dataset objects derived from A_particles.cs and B_particles.cs, respectively.
In case A_data and B_data each include blob/path and blob/idx items and those items’ values haven’t changed, one can
- convert A_data and B_data to A_df and B_df dataframes, respectively
- drop the unwanted uid column from B_df
- keep just the ['uid', 'blob/path', 'blob/idx'] columns from A_df
- “inner” merge A_df and B_df on ['blob/path', 'blob/idx'] to obtain merged_df
- C_data = cryosparc_compute.dataset.Dataset().from_dataframe(merged_df)
- write the new Dataset C_data to disk
Caveats:
- The steps above are a motivational outline, not a tested sequence of commands that can be pasted verbatim into a script.
- Preservation of blob/path and blob/idx is an optimistic assumption and does not apply to all export/processing/import workflows.
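The merge step in the outline can be illustrated with pandas alone. This is a minimal sketch under stated assumptions, not a tested cryoSPARC procedure: the two dataframes below are small hand-made stand-ins for A_df and B_df, the file paths and uid values are invented, and the extra/class column is a hypothetical placeholder for whatever additional attributes the re-imported particles carry. No cryosparc_compute calls are used.

```python
import pandas as pd

# Hypothetical stand-in for A_df: original particles with the UIDs to keep.
A_df = pd.DataFrame({
    "uid": [101, 102, 103, 104],
    "blob/path": ["J1/extract/a.mrcs", "J1/extract/a.mrcs",
                  "J1/extract/b.mrcs", "J1/extract/b.mrcs"],
    "blob/idx": [0, 1, 0, 1],
})

# Hypothetical stand-in for B_df: re-imported subset with new, unwanted UIDs
# and an illustrative extra attribute ("extra/class" is a made-up column name).
B_df = pd.DataFrame({
    "uid": [9001, 9002],
    "blob/path": ["J1/extract/a.mrcs", "J1/extract/b.mrcs"],
    "blob/idx": [0, 1],
    "extra/class": [2, 2],
})

keys = ["blob/path", "blob/idx"]

# Drop the unwanted UIDs from the re-imported particles.
B_no_uid = B_df.drop(columns=["uid"])

# Keep only the UID and the identifying columns from the original particles.
A_keys = A_df[["uid"] + keys]

# Inner merge: only particles present in both sets survive.
# validate="one_to_one" raises if a (path, idx) pair is duplicated on either side.
merged_df = A_keys.merge(B_no_uid, on=keys, how="inner", validate="one_to_one")

print(merged_df)
```

If the number of rows in merged_df is smaller than the number of rows in B_df, some re-imported particles found no partner, which usually means the paths or indices were altered somewhere along the way.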
Hi, I really appreciate your reply, but this seems too complicated for me.
I don’t think I have blob info for the re-imported particle set, assuming blob means the coordinates on the micrographs.
Is it possible for you guys to add an option in Particle Sets Tool (intersect), which only checks the particle files (.mrc or .mrcs), while keeping all the metadata from A_data?
@zhangrui_wustl To clarify, 'blob/path' is a file path, 'blob/idx' is an integer. Taken together, these values are an alternative (to the uid) identifier of a particle. Even if 'blob/path' and 'blob/idx' are no longer present explicitly in the data to be reimported, inspection of those data may reveal that a straightforward transform of 'blob/path' and 'blob/idx', such as a concatenation, has been preserved.
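As an illustration of undoing such a concatenation: RELION .star files commonly store a particle's image reference as a single string of the form index@path. The sketch below splits a string of that form back into a (path, index) pair. It rests on two assumptions that should be checked against the actual data: that the index in the string is 1-based while blob/idx is 0-based, and that the path part matches blob/path without further rewriting. The function name split_image_name and the example string are hypothetical.

```python
def split_image_name(image_name, index_offset=1):
    """Split an 'index@path' string into (path, zero_based_index).

    index_offset=1 assumes the index in the string is 1-based and the
    target index is 0-based; set it to 0 if both use the same convention.
    """
    idx_str, path = image_name.split("@", 1)
    return path, int(idx_str) - index_offset


# Invented example value, not taken from a real .star file.
path, idx = split_image_name("000002@J1/extract/a.mrcs")
print(path, idx)
```

The resulting pair could then serve as the merge key in place of explicit 'blob/path' and 'blob/idx' columns, after confirming on a few particles that the recovered values really match those in the original .cs file.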