Difference of exposure sets by movie path rather than exposure object

I am trying to get ahead on processing some data while it slowly transfers from the microscopy core. What I’d like to do is import all .tif files from a directory (say, micrographs/*.tif) as J1 and begin motion correction (J2). Then, an hour or two later, launch J3, which is a clone of J1, take the difference J3 minus J1 (J4), and feed J4 into a new motion correction job (J5) to avoid duplicating work.

I thought the Exposure Sets job would let me do this. But after running it in intersection mode and looking at A_minus_B and B_minus_A, it seems to me that the job operates on exposure objects, not on the paths of those exposures: A_minus_B and B_minus_A have the same sizes as A and B respectively, even though both inputs contain several of the same micrographs. (Precisely, A is a superset of B, so I would expect B_minus_A to have zero exposures.)
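For illustration, here is a minimal sketch of the behaviour described above, using plain Python rather than anything from cryoSPARC. The records, uids, and the `minus` helper are hypothetical, and it assumes that each import job assigns its own identifiers even when the underlying file is the same, which is what the observed output sizes suggest.

```python
# Hypothetical exposure records; the uids and paths are made up for illustration.
# Assumption: each import job assigns its own uids, even for the same file.
set_a = [
    {"uid": 201, "path": "micrographs/m1.tif"},
    {"uid": 202, "path": "micrographs/m2.tif"},
    {"uid": 203, "path": "micrographs/m3.tif"},
]
set_b = [
    {"uid": 101, "path": "micrographs/m1.tif"},
    {"uid": 102, "path": "micrographs/m2.tif"},
]


def minus(a, b, key):
    """Return the records of `a` whose `key` value does not occur in `b`."""
    seen = {rec[key] for rec in b}
    return [rec for rec in a if rec[key] not in seen]


# Keyed on uid: nothing matches, so each difference is as large as its input.
a_minus_b_uid = minus(set_a, set_b, "uid")
b_minus_a_uid = minus(set_b, set_a, "uid")

# Keyed on path: only the newly arrived file remains, and B minus A is empty.
a_minus_b_path = minus(set_a, set_b, "path")
b_minus_a_path = minus(set_b, set_a, "path")

print(len(a_minus_b_uid), len(b_minus_a_uid))
print(len(a_minus_b_path), len(b_minus_a_path))
```

Keying the comparison on the path is what makes B_minus_A come out empty when A is a superset of B.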

I searched for an answer to this question but didn’t find anything; I apologize if I missed something.

Hey @posertinlab,

This is actually one of the main use cases we built cryoSPARC Live for. In a cryoSPARC Live session, you’re able to specify a directory that cryoSPARC should watch to find new movies.
As new exposures are found, a preprocessing worker will pick them up and complete Patch Motion Correction, CTF Estimation, blob/template picking and particle extraction all in one go. You can then continue processing in the main cryoSPARC application, or continue onto streaming 2D Classification and 3D Refinement.

In the meantime, if you’d still like to use the Exposure Sets tool to only process the remainder of movies, I’ve modified the job to use the movie’s path rather than the unique identifier to complete the operations.
You can get the new job (compatible with cryoSPARC v3.0+) by downloading it here:
wget https://structura-assets.s3.amazonaws.com/exposure_sets_path_intersection-v3.1/run_sets.py

On the master node, make a backup of the existing run_sets.py, then copy the new file into the folder: cryosparc_master/cryosparc_compute/jobs/utilities
You can then re-run the job and it will use the new code.

We will release the option to choose the paths field rather than the unique identifier in the next version of cryoSPARC. Until then, the modified job uses paths only in “intersection” mode, in both the Particle Sets job and the Exposure Sets job.


Wow, thank you! I was halfway through trying to do this with bash scripting; this is a much better solution given my abysmal skill in bash.

Great! Let me know how the modified job works out for you.
Note that you can manipulate .cs files pretty easily using Python. We have a tutorial with an example of how to do this here:

https://guide.cryosparc.com/processing-data/tutorials-and-case-studies/manipulating-.cs-files-created-by-cryosparc
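As a rough sketch of what a path-based difference could look like in Python: this assumes a .cs file can be read as a NumPy structured array (for example with `numpy.load`), and that the exposures carry a path field named `movie_blob/path`. Both the field name and the sample rows below are assumptions for illustration, so check the dtype of your own files first. The helper `exposures_not_in` is hypothetical and not part of cryoSPARC.

```python
import numpy as np


def exposures_not_in(a, b, field="movie_blob/path"):
    """Return the rows of structured array `a` whose `field` value is absent from `b`.

    The default field name is an assumption; inspect `a.dtype.names` to find
    the path field in your own data.
    """
    mask = ~np.isin(a[field], b[field])
    return a[mask]


# Stand-in arrays built in memory so the sketch runs without real .cs files.
# With real data these would come from something like np.load("<file>.cs").
dtype = [("uid", "<u8"), ("movie_blob/path", "S64")]
later = np.array(
    [
        (11, b"micrographs/a.tif"),
        (12, b"micrographs/b.tif"),
        (13, b"micrographs/c.tif"),
    ],
    dtype=dtype,
)
earlier = np.array(
    [
        (1, b"micrographs/a.tif"),
        (2, b"micrographs/b.tif"),
    ],
    dtype=dtype,
)

# Exposures present in the later import but not in the earlier one.
new_only = exposures_not_in(later, earlier)
print(new_only["movie_blob/path"])
```

The comparison ignores the `uid` column entirely, so rows are matched only on the path value.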


Hi @posertinlab,

Your request is now a feature in the Exposure Sets Tool and Particle Sets Tool jobs in v3.3.0 (released December 1, 2021) available via the “Field to Intersect” parameter:
https://guide.cryosparc.com/processing-data/all-job-types-in-cryosparc/utilities/job-exposure-sets-tool
