Re-extract particles after losing correspondence to original micrographs

stephan · April 16, 2021, 6:24pm

Thanks for sending over your .cs files so I could debug the code. I forgot that we had changed the naming convention for the paths cryoSPARC writes out, which means that if you’re comparing paths, you have to “normalize” both of them before comparing (see clean_path_to_match below).
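To make the normalization concrete, here is a minimal standalone sketch of what it does to two paths. The job directories, UID prefixes, and file names are made up for illustration; your actual paths will look different.

```python
import os

def clean_path_to_match(path):
    # keep only the file name, dropping the job directory
    output_path = os.path.basename(path)
    # drop a leading '>' marker if one survives on the file name
    output_path = output_path.lstrip('>')
    # drop the first underscore-delimited token (assumed to be the UID prefix)
    return '_'.join(output_path.split('_')[1:])

# hypothetical paths: same micrograph, different job directory and UID prefix
old_path = '>J1/imported/0123456789_mic_0001.mrc'
new_path = 'J7/imported/9876543210_mic_0001.mrc'

old_key = clean_path_to_match(old_path)
new_key = clean_path_to_match(new_path)
print(old_key, new_key, old_key == new_key)
```

Note that the first token before an underscore is always dropped, so this assumes every file name really does start with a UID prefix. A file name without one would lose the first part of its real name.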

Here’s the updated (and much more efficient) code to re-associate extracted particles with newly imported exposures:

import os
from cryosparc_compute import dataset
exp_dset = dataset.Dataset().from_file('/path/to/exposures.cs')
particle_dset = dataset.Dataset().from_file('/path/to/particles.cs')

def clean_path_to_match(path):
    # get the basename of the micrograph
    output_path = os.path.basename(path)
    # remove any leading '>' marker from the file name
    output_path = output_path.lstrip('>')
    # remove the leading UID prefix (everything up to the first underscore)
    output_path = '_'.join(output_path.split('_')[1:])
    return output_path

# map each normalized exposure path to that exposure's uid
path_to_uid_map = {
    clean_path_to_match(path): uid
    for path, uid in zip(exp_dset.data['micrograph_blob/path'], exp_dset.data['uid'])
}

for index, particle in enumerate(particle_dset.get_items()):
    # get the micrograph path associated with the particle
    path_to_match = clean_path_to_match(particle['location/micrograph_path'])
    # assign the uid to the particle dataset after lookup
    particle_dset.data['location/micrograph_uid'][index] = path_to_uid_map[path_to_match]

# note: this overwrites the input particles file; use a different path to keep the original
particle_dset.to_file('/path/to/particles.cs')
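If some particles point at micrographs that aren’t in the new exposures, the lookup in the loop above raises a KeyError. Here is a rough sketch of one way to collect the unmatched particles instead of stopping at the first one. It uses plain Python lists with made-up paths and uids in place of the real datasets, so treat it as an illustration of the lookup logic rather than a drop-in replacement.

```python
import os

def clean_path_to_match(path):
    name = os.path.basename(path).lstrip('>')
    return '_'.join(name.split('_')[1:])

# hypothetical stand-ins for exp_dset.data['micrograph_blob/path'] and exp_dset.data['uid']
exposure_paths = ['J10/imported/111_mic_a.mrc', 'J10/imported/222_mic_b.mrc']
exposure_uids = [9001, 9002]
# hypothetical stand-in for the particles' 'location/micrograph_path' column
particle_paths = ['>J3/imported/333_mic_a.mrc', 'J3/imported/444_mic_c.mrc']

path_to_uid_map = {
    clean_path_to_match(path): uid
    for path, uid in zip(exposure_paths, exposure_uids)
}

matched_uids = []  # one entry per particle, None where no exposure matched
unmatched = []     # (particle index, original path) for every miss
for index, path in enumerate(particle_paths):
    uid = path_to_uid_map.get(clean_path_to_match(path))
    if uid is None:
        unmatched.append((index, path))
    matched_uids.append(uid)

print(len(unmatched), 'particle(s) had no matching exposure')
```

With the real datasets you would only write the uid back when it is not None, and check the unmatched list before saving. Also keep in mind that if two exposures normalize to the same name, the later one silently wins in the dictionary.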