Fewer particles doesn't reduce SSD space required?

Why is it that for multiple job types (2D classification, 3D refine, etc.), selecting fewer particles or splitting a particle stack does not change the amount of SSD space required in the downstream job?

I can provide specific examples if required. I’m running v3.1.

Thanks for any suggestions!
Nick

When you say “Splitting a particle stack” do you mean using the Particle Sets Tool? I’m fairly sure that 2D and 3D classification and the particle sets tool all just select a subset of particles to work from, but leave your particle stack unchanged. To use less space on your SSD, you’ll have to re-extract your selected/split particles, creating a new stack that’s only got what you want.

The Extract job writes an mrc file (mrcs in RELION) per micrograph containing all the particles picked from that micrograph: basically, a stack of 2D particle images in a single mrc file. As @posertinlab said, a simple selection job will not change those particle stacks. Instead, it creates a cs file (star file in RELION) with metadata referring to the locations of the selected (or unselected) particles, something like 1@micrograph01.mrc, 15@micrograph01.mrc, 3@micrograph02.mrc, etc., where the numbers 1, 15, 3 designate the positions of the particles within the stack. It does not create a new set of stack files. Therefore a subsequent refinement job will need to load the entire stack even though only some of it was selected for refinement. I am not sure what happens to stack files from which no particles were selected.
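To illustrate Alpay's point, here is a minimal sketch (not cryoSPARC's actual file format or code, just an illustration of the idea): a selection is only a list of `index@stackfile` references, so however few particles you select, every stack file that any selected particle lives in must still be cached in full.

```python
# Hypothetical sketch: a selection job records "index@stackfile" references
# but never rewrites the stack files themselves, so the SSD cache must still
# hold each referenced per-micrograph stack in its entirety.

from collections import defaultdict

# Metadata-style references for a small selected subset of particles
selected = ["1@micrograph01.mrc", "15@micrograph01.mrc", "3@micrograph02.mrc"]

# Group the selected particle indices by the stack file containing them
by_stack = defaultdict(list)
for ref in selected:
    idx, stack = ref.split("@")
    by_stack[stack].append(int(idx))

# Even though only 3 particles are selected, both whole stack files must be
# cached, because the stacks were never re-extracted into a smaller set.
for stack, indices in sorted(by_stack.items()):
    print(f"{stack}: particles {indices} -> cache the entire file")
```

This is why re-extracting (or otherwise rewriting) the selected particles into a new, smaller stack is what actually reduces the SSD footprint.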

Alpay

Thanks @posertinlab and @alburse, that makes more sense to me now.

In my case, I had run an extract job with ~3.5m particles, and I was indeed hoping to split them into smaller batches using the Particle Sets Tool. Simply splitting the particle stack into smaller batches did not reduce the particle stack size enough to fit on the SSD. However, I am able to run an Ab-Initio job telling it to use only 300k particles (from the 3.5m stack), and it runs fine on the SSD. This makes me think it should be possible to have an option in the Particle Sets Tool to 'truly' split the particle stack, rather than having to re-extract and use up more disk space?

Maybe someone on the CryoSPARC @team can comment on this possibility?

Nick

@nschnick You should be able to re-consolidate each particle sub-stack through Downsample Particles. Just make sure not to apply cropping or resolution filters.


Thanks @leetleyang! I had missed that this is also mentioned here: https://guide.cryosparc.com/processing-data/tutorials-and-case-studies/tutorial-ssd-particle-caching-in-cryosparc#tips-and-tricks

@team It might be nice if 3D refine and other SSD-using job types had an option to split the particle stack into batches based on available SSD space directly in the job, but ultimately I'm not sure how practical/useful this would be.
