Exposure groupings from multiple acquisitions

rbs_sci · December 4, 2023, 6:18am

Hi CryoSPARC team,

Once more, thank you for all the additions to CryoSPARC in 4.4.

Another feature request… …or I’m missing something simple…

When feeding CryoSPARC .xml files during micrograph import, the result from Patch Motion Correction always has a single Exposure Group (easily fixed in Exposure Group Utilities). However, feeding multiple imports into a single Patch Motion Correction job outputs multiple groups, one per import job. The beam tilt numbers are still there, but Exposure Group Utilities treats the multiple groups as one when clustering. This results in the optics groups being a bit messy (different grids, different rotation on the stage, not to mention variation in microscope alignment).

Would it be possible to add an option to keep beam tilt value groups unique between imports? I understand that this will explode the exposure group numbers for data acquired across multiple runs… particularly with EPU when it does 9x9 shift grids! But it can help with difficult data (at least a little).

Something like:

Patch CTF outputs a big dataset with 4 Exposure Groups from 4 Imports (via 1 Patch Motion run).
Exposure Group Utilities sees the 4 imported Exposure Groups and groups by tilt within each of them.
Rather than blobbing everything together and splitting into, e.g., 69 groups (7x7 plus 5 on each edge seems most common from EPU for us), it splits group 1 via tilts into e.g. 69 groups, group 2 into e.g. 9 groups, group 3 into e.g. 37 groups, and leaves group 4 as 1 group (because it was not acquired with AFIS).
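
To make the request concrete, here is a minimal Python sketch of the grouping described above. It is not how CryoSPARC clusters beam tilts: it simply bins tilt values onto a grid of width `tol` and keys each bin by its import. The function name, the `(import_id, tilt_x, tilt_y)` tuple layout, and the `tol` parameter are all hypothetical.

```python
def group_by_import_and_tilt(exposures, tol=0.05):
    """Assign one exposure-group ID per (import, binned beam tilt) pair.

    `exposures` is a list of (import_id, tilt_x, tilt_y) tuples. These names
    are illustrative only and are not CryoSPARC fields. `tol` is the bin
    width, in the same units as the tilt values.
    """
    ids = {}          # (import_id, bin_x, bin_y) -> group ID
    assigned = []
    for import_id, tilt_x, tilt_y in exposures:
        # Binning by rounding stands in for real clustering here.
        key = (import_id, round(tilt_x / tol), round(tilt_y / tol))
        if key not in ids:
            ids[key] = len(ids)   # next unused ID, in order of first appearance
        assigned.append(ids[key])
    return assigned


example = [
    (1, 0.00, 0.0),   # import 1, tilt A
    (1, 0.50, 0.0),   # import 1, tilt B
    (2, 0.00, 0.0),   # import 2, tilt A: same tilt, but kept separate
    (1, 0.01, 0.0),   # import 1, tilt A again (within tolerance)
    (2, 0.00, 0.0),   # import 2, tilt A again
]
print(group_by_import_and_tilt(example))  # [0, 1, 2, 0, 2]
```

The point of keying on the import as well as the tilt is that identical tilt values from different sessions never end up in the same group.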

Although this could be considered somewhat moot now that the latest update to EPU (EPU 3.6, TEM 3.17.1) actually does put a useful AFIS ID in the filename (although I wish they would zeropad the single-digit IDs)…

It would also be nice to have an option “abandon exposures without beam tilt definition” as there always seems to be one solitary micrograph somehow missing its corresponding .xml file…

Thanks, this ended up being longer than I thought, trying to explain…

kstachowski · December 5, 2023, 6:45pm

Hi @rbs_sci,

We have noted your feature request, but in the meantime, there is an alternative workflow to get this to work as you would suggest.

1. Import movies with associated .xml files. Alternatively, import the beam shift data separately for movies imported before v4.4.
2. This will set the exposure groups for each dataset from 0 to N. You will now have multiple datasets, each with exposure groups numbered 0 to N.
3. A cryosparc-tools job can then be used to offset the exposure groups so they are unique across all of the datasets.
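
The offsetting step can be sketched in plain Python as below. This only illustrates the arithmetic and makes no cryosparc-tools calls; in a real script the ID lists would come from the exposure group ID field of each loaded exposures dataset, and the shifted values would be written back and saved as an external result. The function name and the list-of-lists input are assumptions made for illustration.

```python
def offset_exposure_groups(datasets):
    """Shift each dataset's group IDs so no ID is shared between datasets.

    `datasets` is a list of lists of per-exposure group IDs, each numbered
    from 0 within its own import (hypothetical input layout).
    """
    shifted = []
    offset = 0
    for group_ids in datasets:
        shifted.append([g + offset for g in group_ids])
        if group_ids:
            # The next dataset starts just past the highest ID used so far.
            offset += max(group_ids) + 1
    return shifted


# Three imports, each numbered from 0 on its own.
print(offset_exposure_groups([[0, 1, 2, 1], [0, 1], [0]]))
# [[0, 1, 2, 1], [3, 4], [5]]
```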

Would this generate your intended result?

Additionally, if you import using the file name and a regex token, you can offset the exposure groups for each import by running Exposure Group Utilities and changing the parameter start_exp_group_id.

Best,
Kye

rbs_sci · December 6, 2023, 12:29pm

Hi @kstachowski,

Yes, I think the cryosparc-tools flow would work. I’ll have a play.

Thanks.

mmclean · May 7, 2024, 7:31pm

Dear @rbs_sci,

CryoSPARC v4.5 was released today, and Exposure Group Utilities has been updated to now respect the Starting Exposure Group ID parameter.

This means that the following workflow should enable separate clustering of different initial exposure groups (imported via N different Import Movies jobs), without the need for cryosparc-tools. After running a Patch Motion and Patch CTF job on the combined set of exposures from an initial set of N groups:

1. Launch an Exposure Group Utilities job with input selection exposure, in info_only mode, and activate Split Outputs by Exposure Group. This will produce separate output groups for exposures belonging to each of the N groups.
2. For each of the N output exposure_group_X groups:
   - a. Build a new Exposure Group Utilities job with input selection exposure, in cluster&split mode.
   - b. Set the number of clusters to the desired amount, based on the beam tilt distribution in this group.
   - c. Set the Starting Exposure Group ID parameter such that the output exposure group IDs are distinct from those obtained via clustering on the other groups. For example:
     - The first job can receive a starting ID of 0.
     - If the first initial exposure group is clustered into 69 groups, the second job can receive a starting ID of 69.
     - If the second initial exposure group is clustered into 9 groups, the third job can receive a starting ID of 69+9 = 78, and so on.
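
The starting IDs in step c are just a running total of the cluster counts chosen for the preceding jobs. A small Python sketch, using the cluster counts from the example earlier in this thread (69, 9, 37, 1); the function name is hypothetical and nothing here talks to CryoSPARC:

```python
def starting_ids(cluster_counts):
    """Return the Starting Exposure Group ID to enter for each clustering job.

    `cluster_counts[i]` is the number of clusters requested in job i.
    """
    ids = []
    next_id = 0
    for count in cluster_counts:
        ids.append(next_id)   # this job's IDs begin where the last job's ended
        next_id += count
    return ids


print(starting_ids([69, 9, 37, 1]))  # [0, 69, 78, 115]
```

Each value is what you would type into the Starting Exposure Group ID parameter of the corresponding job.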

This allows N distinct clusterings to occur and for the output exposure groups to remain distinct across each of these N clustering jobs. Let me know if this makes sense!

Best,
Michael