Delete excluded micrographs


As far as I know, there is no easy way to track the names of the micrographs that ended up in each group after manual curation of exposures. It would be quite handy to have a list of file names of the excluded micrographs, so that one can delete the useless raw micrograph/exposure files. Wouldn't that be nice?
At the moment, what workarounds do you use for this purpose?


Hi André,

You should be able to convert the cs file for the excluded mics to a star file, then make a list from that which you can use to delete the junk. I haven't actually tried that, but I don't see why it wouldn't work.
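
To make the "list from a star file" step concrete, here is a minimal sketch in plain Python. It assumes the converted file is a RELION-style STAR file whose micrograph paths sit in a `_rlnMicrographName` column; the function name `micrograph_names_from_star` is made up for this illustration, and paths containing spaces would not survive the simple whitespace split.

```python
def micrograph_names_from_star(star_text, column="_rlnMicrographName"):
    """Collect the values of `column` from every loop_ block that contains it."""
    names = []
    labels = []        # column labels of the loop currently being read
    in_header = False  # True while reading the "_label #n" lines after loop_
    for raw in star_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "loop_":
            labels, in_header = [], True
            continue
        if line.startswith("data_"):
            labels, in_header = [], False  # a new data block resets the loop
            continue
        if line.startswith("_"):
            if in_header:
                labels.append(line.split()[0])  # keep the label, drop "#n"
            else:
                labels = []  # a bare key/value pair, not part of a loop
            continue
        in_header = False  # first data row ends the header
        if column in labels:
            fields = line.split()
            idx = labels.index(column)
            if idx < len(fields):
                names.append(fields[idx])
    return names
```

Reading the file with `open(path).read()` and writing the returned names one per line gives a list you can review before deleting anything.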


Hi @AndreGraca,

You can do this by first downloading the movie_blob .cs file output from the exposures_rejected output result group of the Manually Curate Exposures job.
You can also just get the path to the .cs file on the master node by navigating to the Outputs tab and pressing the “copy path” button on the movie_blob output.

Once you have the .cs file, you can then open it up using the instructions from our guide on manipulating .cs files.

You can then find the filenames, and either delete the files manually, or use python to do this for you.

For example, open a shell on the master node and run cryosparcm icli to start an interactive python session, then run the following:

from cryosparc2_compute import dataset
import os

exposure_dset = dataset.Dataset() #initialize the dataset object
dataset_path = "<path_to_cs_file_here>"
exposure_dset.from_file(dataset_path) #load the .cs file
# assumption: the loaded fields are reached through the .data attribute
paths = exposure_dset.data['movie_blob/path']
paths #will print out a sample of all the values in this field

# the following will write all the filenames to a text file that 
# can be piped to a unix delete command
with open("exposures_to_delete.txt", 'w') as openfile:
    for file_to_delete in paths:
        openfile.write(file_to_delete + '\n')

#the following will delete the files sequentially
for file_to_delete in paths:
    try:
        os.remove(file_to_delete)
    except OSError:
        print("Unable to delete {}".format(file_to_delete))
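
Since deletion can't be undone, it may be worth doing a dry run over exposures_to_delete.txt first. The following is only a sketch using the Python standard library; the function name `delete_listed_files` and its `dry_run` flag are invented for this example and are not part of CryoSPARC.

```python
import os

def delete_listed_files(list_path, dry_run=True):
    """Process the paths listed one per line in list_path.

    With dry_run=True nothing is removed; the function only reports which
    paths exist. Returns (found, skipped): paths that were (or would be)
    deleted, and paths that were missing or could not be removed.
    """
    found, skipped = [], []
    with open(list_path) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    for path in paths:
        # lexists is True for broken symlinks too, which can still be removed
        if not os.path.lexists(path):
            skipped.append(path)
            continue
        if not dry_run:
            try:
                os.remove(path)
            except OSError:
                skipped.append(path)
                continue
        found.append(path)
    return found, skipped
```

Call it once with the default `dry_run=True`, check the two returned lists, then call it again with `dry_run=False` once the list looks right.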

I had to do this recently and found some changes in the cryosparc_compute package (e.g. some differences in the properties of the Dataset object). Here's an updated example for future searchers.

from cryosparc_compute import dataset
import os

dpath = "<path_to_cs_file_here>"
dset = dataset.Dataset.load(dpath)
for file_to_delete in dset['micrograph_blob/path']:
    try:
        os.remove(file_to_delete)
    except OSError:
        print("unable to delete {}".format(file_to_delete))

(Note: run this from the CryoSPARC project directory, where the relevant job directories are found, so that the paths in the .cs file resolve correctly.)
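
If you would rather not depend on the current working directory, the stored paths can be joined onto the project directory explicitly. The sketch below also follows symlinks, on the assumption that imported exposures may be links inside the project directory rather than copies, in which case removing the link alone would leave the raw file on disk. The name `resolve_exposure_path` is hypothetical, and whether your paths are symlinks depends on how the data was imported.

```python
import os

def resolve_exposure_path(project_dir, stored_path):
    """Return (link_path, target_path) for one path stored in a .cs file.

    link_path is stored_path joined onto project_dir (an absolute
    stored_path is left unchanged by os.path.join). target_path is the
    same location with any symlinks followed.
    """
    link_path = os.path.abspath(os.path.join(project_dir, stored_path))
    target_path = os.path.realpath(link_path)
    return link_path, target_path
```

Comparing the two returned paths shows whether an entry is a link, so you can decide whether to delete the link, the raw file it points to, or both.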