Downsampling problem in combined particle stacks

Hi,

I combined 3 particle stacks from 3 datasets. Before combining, I made sure to extract the particles from each dataset’s micrographs at the same box size of 540 px. Earlier, for the refinement of each individual stack, I had cropped two of the sets from 540 px to 256 px, and the remaining dataset (my first) from 512 px to 256 px. Once I had picked the desired particles from each dataset, I re-extracted each stack from the micrographs at 540 px and combined them using Exposure Group Utilities (combine strategy “median”) to average the CTF values.
I ended up with 67k particles that I ran through Homogeneous Refinement, giving a reconstruction at 2.58 Å. I want to classify these particles using 3DVA, but the job fails because the stack is too large for my current disk space. I want to downsample the particles to 128 px, but when I run the Downsample Particles job I get the error “particles must all have same alignment pixel size”.
I do not understand why the alignment pixel size differs when I extracted each stack at the same box size of 540 px. How can I solve this problem?
The particle stacks were exported from different disk locations via soft links on the current processing disk, so I worry it would be too complicated to re-extract the particles at full size after running 3DVA in cluster mode, since I won’t know which dataset each particle came from. Does this mean I would have to create soft links for the micrographs as well and use all of them (around 30k micrographs!) as input to the Extract job? I am not sure how to navigate this. What would be the simplest way to run 3DVA on these combined particles?

Hi @Noha_Elshamy, I’d love to help figure this out for ya.

First, could you verify that I understand correctly what you’ve done:

  1. You collected 3 datasets with the same pixel size
  2. You extracted particles from two of the datasets with the same box size (540 px) and from one with a different box size (512 px)
  3. You cropped all three datasets to 256 pixels. Now you have 3 datasets with the same box size and the same pixel size
  4. You refined all three datasets together
  5. You re-extracted the particles from all three datasets, this time all with a box size of 540 pixels. There are 67k particles in this new particle stack.
  6. You tried to use Downsample Particles to take the stack from step 5 from 540 px → 128 px. This job failed with the error “particles must all have same alignment pixel size”

All of this is correct, except that I refined the datasets independently, as they were not collected at the same time. I usually process at a 256 px box size until I reach the resolution limit in my refinement, and at that point I re-extract the particles from the micrographs at 540 px. In other words, I ended up with three reconstructions from my refinements, each at 540 px. I then exported the particles from each of the three reconstructions, imported them into a new workspace for processing the three sets together, combined them (by running them through a new Ab-initio job with 1 class), and followed that with a refinement.

I also want to note that because the defocus range of the 1st dataset was a little different from that of the two subsequent sets, I had to run the particles through an “Exposure Group Utilities” job with a combine strategy of “median”.

I see! So does the workflow below look right?

  1. You collected 3 datasets with the same pixel size
  2. You extracted particles from two of the datasets with the same box size (540 px) and from one with a different box size (512 px)
  3. You cropped all three datasets to 256 pixels. Now you have 3 datasets with the same box size and the same pixel size
  4. You refined all three datasets independently
  5. You re-extracted particles with a box size of 540 px from all three datasets. You did not downsample or crop the images
  6. You exported the particles from step 5 and imported them to a new workspace
  7. You performed an Ab initio Reconstruction and Homogeneous or Non-Uniform Refinement with the particles from step 6. This refinement did not encounter any errors, and the map looked good.
  8. You tried to perform 3D Variability Analysis, but this failed.
  9. You tried to use Downsample Particles on the particles from step 7 (after the refinement), but this failed with the error “particles must all have same alignment pixel size”

If anything there is incorrect, could you let me know? Also, could you paste (as text) the error message you saw when 3DVA failed in step 8?

Yes, this is correct. But as I noted, Homogeneous Refinement with the “CTF refinement” option turned on failed before I combined the particle stacks with the Exposure Group Utilities job using the “median” combine strategy.

I am sorry, I forgot to reply with the text of the error message from 3DVA. I deleted that job so I cannot paste it, but it ended with “out of memory”.

Hi @Noha_Elshamy! In the future, the traceback and error message are very useful for diagnosing what might be going wrong, so please keep failed jobs around!

In any case, for jobs like 3DVA, an Out Of Memory error typically indicates that the box size is too large. If you try to create a new 3DVA job (like step 8) but set the Filter Resolution parameter (in older versions of CryoSPARC, this parameter is called Target Resolution) to something lower (maybe 10 or 14 Å), does the job run successfully?
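For a bit of intuition on why this should help (a rough back-of-the-envelope sketch only, not a description of 3DVA’s exact internals): the GPU memory a job like this needs grows steeply with the box size it actually works at, and an image filtered to a coarser resolution can be represented in a much smaller box. The pixel size below is a hypothetical placeholder, so substitute your own value:

import math

def effective_box(box_px: int, pixel_size_A: float, filter_res_A: float) -> int:
    """Smallest (even) box that still carries signal out to filter_res_A.

    Back-of-the-envelope only: N_eff ~= N * 2 * pixel_size / resolution.
    """
    n = math.ceil(box_px * 2 * pixel_size_A / filter_res_A)
    return min(box_px, n + n % 2)  # round up to an even box, never exceed the original

# Hypothetical pixel size of 1.0 A/px -- substitute your real value.
for res_A in (3.0, 8.0, 14.0):
    print(f"{res_A:>5.1f} A  ->  ~{effective_box(540, 1.0, res_A)} px effective box")

With a 540 px extraction box and a filter resolution around 14 Å, the box the job effectively needs to work with is far smaller than 540 px, which is why I would expect memory pressure to drop considerably.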

Hi again,
Sorry for taking so long to reply; we were upgrading our processors and then dealt with a lot of issues with the CryoSPARC update as well.
I ran 3DVA with a filter resolution of 14 Å, but it would not run to completion. It failed after 23 minutes and gave the following error message:

Traceback (most recent call last):
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 851, in _attempt_allocation
return allocator()
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 1054, in allocator
return driver.cuMemAlloc(size)
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 348, in safe_cuda_api_call
return self._check_cuda_python_error(fname, libfn(*args))
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 408, in _check_cuda_python_error
raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “cryosparc_master/cryosparc_compute/run.py”, line 129, in cryosparc_master.cryosparc_compute.run.main
File “cryosparc_master/cryosparc_compute/jobs/var3D/run.py”, line 546, in cryosparc_master.cryosparc_compute.jobs.var3D.run.run
File “cryosparc_master/cryosparc_compute/jobs/var3D/run.py”, line 323, in cryosparc_master.cryosparc_compute.jobs.var3D.run.run.E_step
File “cryosparc_master/cryosparc_compute/engine/newengine.py”, line 400, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.load_models_rspace
File “cryosparc_master/cryosparc_compute/gpu/gpucore.py”, line 398, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
File “/home/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py”, line 376, in empty
return device_array(shape, dtype, stream=stream)
File “/home/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py”, line 332, in device_array
arr = GPUArray(shape=shape, strides=strides, dtype=dtype, stream=stream)
File “/home/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py”, line 127, in __init__
super().__init__(shape, strides, dtype, stream, gpu_data)
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devicearray.py”, line 103, in __init__
gpu_data = devices.get_context().memalloc(self.alloc_size)
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 1372, in memalloc
return self.memory_manager.memalloc(bytesize)
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 1056, in memalloc
ptr = self._attempt_allocation(allocator)
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 863, in _attempt_allocation
return allocator()
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 1054, in allocator
return driver.cuMemAlloc(size)
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 348, in safe_cuda_api_call
return self._check_cuda_python_error(fname, libfn(*args))
File “/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py”, line 408, in _check_cuda_python_error
raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY

I think there is no way this will work without downsampling… but even that is not going smoothly…

Hi @Noha_Elshamy! You could definitely try directly downsampling the images using the Downsample Particles job, but the Filter Resolution parameter should do this on-the-fly in 3DVA. If you are still seeing OOM errors with your desired filter resolution, it may unfortunately be the case that your GPUs do not have enough free VRAM to handle this dataset.
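If you do go the Downsample Particles route, it would also be worth confirming exactly which alignment pixel sizes are present in the combined stack, since that is what the earlier “particles must all have same alignment pixel size” error is complaining about. If you have cryosparc-tools available, a sketch along these lines should print the unique values (the licence, host, credentials, and the PXX/JYY IDs are placeholders, and I am assuming the standard blob/psize_A and alignments3D/psize_A particle fields):

import numpy as np
from cryosparc.tools import CryoSPARC

# Placeholders: fill in your own licence ID, host, credentials, and project/job IDs.
cs = CryoSPARC(
    license="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    host="localhost",
    base_port=39000,
    email="you@example.com",
    password="<your password>",
)

project = cs.find_project("PXX")          # hypothetical project ID
job = project.find_job("JYY")             # the job that produced the combined 67k stack
particles = job.load_output("particles")  # output group name may differ; check the job's Outputs tab

# Unique pixel sizes recorded for the raw particle images and for the 3D alignments.
# The Downsample error implies the alignment values are not all identical.
print("blob psize_A:        ", np.unique(particles["blob/psize_A"]))
print("alignments3D psize_A:", np.unique(particles["alignments3D/psize_A"]))

Knowing which value(s) the odd stack carries should make it much easier to work out where in the pipeline the mismatch crept in.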