Using multiple workers (instead of multiple GPUs) for CryoSPARC (non-Live) jobs

My workflow uses CryoSPARC Live for the preprocessing steps (motion correction, CTF estimation, and particle picking) and then exports the particle sets for 2D classification and refinement in regular CryoSPARC. I would like to migrate the preprocessing steps into regular CryoSPARC, but I am hung up on the fact that I can only use one worker with multiple GPUs, rather than multiple workers with one GPU each. As far as I can tell, CryoSPARC Live already contains the infrastructure to manage multiple workers. My request is to implement something similar in regular CryoSPARC, so that the user can specify either one worker with multiple GPUs, or multiple workers with a single GPU each.

I will add that this workflow has worked well for quite some time now, but with the new Workflows feature (Workflows - CryoSPARC Guide), I might be tempted to migrate away from CryoSPARC Live.

I think the issue that would preclude running a single job across multiple workers is synchronizing the SSD cache between nodes; the alternative would be to disable SSD caching for multi-worker jobs. (This assumes there is little or no shared Python memory between the GPUs in a multi-GPU job.)

With CryoSPARC Live you can have different workers for preprocessing and for reconstruction, but you cannot have multiple workers for one of those tasks.

In my workflow it would only be for preprocessing, where each GPU works on a single file and writes the motion correction, CTF estimation, and particles for that file, preferably to shared storage for downstream processing. But I see your point that for 2D classification or refinement a shared SSD cache is desirable.

Hi, just to chime in regarding Workflows: you can spread preprocessing across multiple workers by using the Exposure Sets Tool and adjusting the number of splits to match how many workers you’d like to run on. Once the Workflow has been applied, you can multi-select each chain of jobs and queue it onto a separate worker. Here’s an example:

You can import the following Workflow JSON into your instance by navigating to the Workflows tab in the sidebar and clicking on ‘Import Workflow’: multi-worker-pre-processing.json - Google Drive
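
To make the splitting idea concrete, here is a minimal Python sketch of dividing a list of exposures into one batch per worker. It only illustrates the concept; it does not call CryoSPARC or reproduce how the Exposure Sets Tool splits exposures internally. The function name `split_round_robin`, the movie file names, and the worker hostnames are all hypothetical.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_round_robin(items: Sequence[T], n_splits: int) -> List[List[T]]:
    """Divide items into n_splits batches, dealing them out one at a time.

    Illustrative only: this is an assumption about one reasonable way to
    split exposures, not the Exposure Sets Tool's actual behaviour.
    """
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    # Slice i takes items i, i + n_splits, i + 2 * n_splits, ...
    return [list(items[i::n_splits]) for i in range(n_splits)]


# Hypothetical inputs: ten movies and three single-GPU workers.
movies = [f"movie_{i:04d}.tif" for i in range(10)]
workers = ["worker-a", "worker-b", "worker-c"]

# One batch per worker; each batch would feed its own chain of
# preprocessing jobs queued onto that worker.
plan: Dict[str, List[str]] = dict(
    zip(workers, split_round_robin(movies, len(workers)))
)

for worker, batch in plan.items():
    print(worker, len(batch), batch[0])
```

Because each movie is preprocessed independently, the batches need no coordination between workers beyond writing their outputs to shared storage.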

Regards,
Suhail