Only cache a subset of particles

It seems like CryoSPARC needs to cache all of the particles from an extraction job in order to run a job on just a subset of them. For example, I extract 200 GB worth of particles, use the Particle Sets Tool to split them into two equal sets of 100 GB each, then try to process just one of the halves. It caches all 200 GB, which limits what can be run based on available SSD space.

In a similar case, I take particles from CryoSPARC to RELION, run 3D classification, then bring the particles from one class back to CryoSPARC. It should only need about 20% of the dataset cached, but it caches all of it. Is there any way around this, to speed things up a bit and free up space on the SSD?

Thanks,
Aaron

Hi @user123,

Unfortunately, we won’t be able to rebuild the caching system at this time to support these features. For now, you can run the “Downsample Particles” job without modifying any inputs, which will create a new, smaller particle stack containing only the particles you care about. Alternatively, you can re-extract the particles using the “Extract from Micrographs” job and the particles.location result. Either approach will save time during the caching step of subsequent jobs.
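
If it helps to judge whether the new stack will fit on your SSD, here is a rough back-of-the-envelope sketch in plain Python. It is not CryoSPARC code, just arithmetic. The function name, the particle counts, the 224 px box size, and the 4 bytes per pixel (32-bit floats, headers ignored) are all assumptions chosen for illustration, so substitute the values from your own extraction job.

```python
def stack_size_gb(n_particles: int, box_px: int, bytes_per_px: int = 4) -> float:
    """Approximate size of an uncompressed particle stack in GB (1e9 bytes).

    Hypothetical helper: assumes square boxes and ignores file headers.
    bytes_per_px=4 assumes 32-bit float pixels; use 2 for 16-bit output.
    """
    return n_particles * box_px * box_px * bytes_per_px / 1e9


# Illustrative numbers only: a full dataset versus a 20% subset.
full = stack_size_gb(n_particles=1_000_000, box_px=224)
subset = stack_size_gb(n_particles=200_000, box_px=224)

print(f"full stack:   {full:.1f} GB")
print(f"subset stack: {subset:.1f} GB")
```

With these example numbers the full stack comes out at roughly 200 GB and the subset at roughly 40 GB, which is about what you would expect to be cached once the subset lives in its own stack.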

Thanks a lot, the downsample option worked exactly as needed.