Patch Motion Correction - RuntimeError: Could not allocate GPU array: CUDA_ERROR_OUT_OF_MEMORY

Hi @olibclarke, we’re aware of this issue and are looking into it. Sorry for the inconvenience in the meantime.

Thanks Harris - is there a patch in the meantime to revert Patch Motion to the previous version? Or any suggestions for reducing memory requirements beyond just using low memory mode?

Unfortunately not… The change that caused this had nothing to do with patch motion specifically; it came from the changes involved in shipping our own CUDA version. The workaround would be to downgrade to an earlier CryoSPARC version.

Just to add: while for others the issue was resolved by using Full Frame Motion Correction, we are still getting a similar error with Full Frame.

The error is

numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY

Thank you for looking into this! In the meantime we will downgrade as suggested.

Welcome to the forum @AlexHouser.
On which GPU model did you observe this error?
nvidia-smi --query-gpu=name --format=csv

NVIDIA GeForce RTX 2080

After talking to our neighboring lab, we learned they have been having the same issue but have been working around it by Fourier cropping to 1/2. This worked for us!

Overriding the number of knots (to X=6, Y=4) seemed to do the trick for us using super-res K3 data on a 2080Ti (using F-crop=1/2 also).

Just using F-crop=1/2 with patch motion didn't do the trick, but reducing the number of knots as well worked.

Our neighboring lab also recommended overriding the knots (Z=5, Y=5, X=7) as well as cropping to 1/2, but we found with our data only the cropping was necessary. Both our lab and their lab are using super-res K3 data.

Maybe it has to do with the number of movie frames? For us (50-frame super-res K3 movies on 2080Ti cards) it only works with X=6, Y=4, low memory mode, and F-crop=1/2. Any more knots, switching off low memory mode, or altered F-crop, and it crashes. Glad to have a workaround!

EDIT: I spoke too soon - it ran OK for 15 mics and then started failing again. Back to tweaking params.

Hi @olibclarke, I have a potential workaround that may address this. For background, v4.4 includes a new GPU memory management system (using the numba Python library) that does not immediately free memory when it’s no longer required. Instead, it frees in batches or when memory is low.

Your Patch Motion job appears to fail during a special allocation step that is unaware of this memory management system. So there may be some GPU memory that could be freed to make this work.
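
To see what that deferral looks like concretely, here is a minimal sketch in plain numba (run from the cryosparc_worker Python environment). This is only an illustration of the behaviour, not CryoSPARC code:

import numpy as np
from numba import cuda

ctx = cuda.current_context()
free_before, total = ctx.get_memory_info()  # free/total device memory in bytes

# Allocate roughly 1 GiB on the GPU, then drop the only reference to it.
arr = cuda.to_device(np.zeros((1024, 1024, 256), dtype=np.float32))
del arr

free_after, _ = ctx.get_memory_info()
# With default settings the freed memory is queued rather than returned immediately,
# so free_after is often no larger than free_before until the pending queue is flushed.
print(free_before, free_after)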

We’ll should have a fix for this in a future version of CryoSPARC, but in the mean time you could try disabling batched-memory deallocation by adding the following line to cryosparc_worker/config.sh:

export NUMBA_CUDA_MAX_PENDING_DEALLOCS_COUNT=0
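
If you want to confirm the setting is being picked up, a quick check from inside the worker's Python environment (after restarting the worker) could look like the following. The numba config attribute name is my best guess for the bundled numba version and may differ, which is why it is read with getattr:

import os
from numba import config

# Environment variable as exported in cryosparc_worker/config.sh
print(os.environ.get("NUMBA_CUDA_MAX_PENDING_DEALLOCS_COUNT"))  # expect "0"

# Corresponding numba config value (attribute name is an assumption; falls back gracefully)
print(getattr(config, "CUDA_DEALLOCS_COUNT", "attribute not found"))  # expect 0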

Let me know if you get a chance to try this and it works for you.

@AlexHouser I am not sure whether the same fix will apply to you; based on the error text, it appears to be coming from a different place, one that is correctly managed by the new memory system. We are still investigating other memory usage changes in v4.4.

Great, I will give this a go, thanks!

I am using CryoSPARC v4.4.1. I tried adding the above-mentioned line to cryosparc_worker/config.sh, but the problem was not solved; the same error shows up as described by @olibclarke.

Welcome to the forum @Suyog.

What is the output of the command
nvidia-smi
on the CryoSPARC worker?

Thank you for your reply. After reading this thread, I realized that I am using GPUs with 8 GB of VRAM. I ran the same patch motion correction job with F-crop = 1/4, and it worked for me. I also noticed that both of my GPUs were fully in use. Some suggestions above indicate that patch motion correction works fine in CryoSPARC v4.3.1 or below. Is it better for me to roll back to v4.3.1 for my hardware configuration, or should I keep using F-crop = 1/4? I have attached the nvidia-smi output below:

These are tough choices. For a potential alternative, have you already tried the NUMBA_CUDA_MAX_PENDING_DEALLOCS_COUNT=0 workaround suggested above?

Yes, I have tried that. Here is the screenshot of the file:

Adding that line to cryosparc_worker/config.sh did not fix it for us either. Interestingly, we are no longer able to fix it with Fourier cropping, low memory mode, and overriding knots during Patch Motion correction for our latest dataset. We ended up having to roll back to v4.3.1.

Thank you for the suggestion. We also downgraded to v4.3.1, and Patch motion correction is working fine. Thank you @AlexHouser @nfrasser @wtempel.

This problem still persists in v4.5.

Movies: superres K3, 8184x11520 px, 80 frames.
System:
Linux GPU-4X-2080Ti 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
NVIDIA-SMI 530.30.02
Driver Version: 530.30.02
CUDA Version: 12.1
GPUs: 4xRTX 2080Ti 11GB, RAM: 256GB

Using F-crop=1/2, 1/8, or 1/16 and/or a different number of knots doesn't help.
Adding NUMBA_CUDA_MAX_PENDING_DEALLOCS_COUNT=0 also doesn’t change anything.
The same task used to run fine in v4.3.
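
For scale, a rough back-of-envelope of just the raw movie size (assuming the frames end up as float32 at some point, which is an assumption on my part, not something taken from the CryoSPARC code) suggests why 11 GB cards are tight here:

# Back-of-envelope only; not based on CryoSPARC internals
nx, ny, frames = 8184, 11520, 80
bytes_per_pixel = 4  # float32 assumption

full_gb = nx * ny * frames * bytes_per_pixel / 1e9
print(f"uncropped stack: ~{full_gb:.0f} GB")      # ~30 GB
print(f"F-crop 1/2:      ~{full_gb / 4:.1f} GB")  # each dimension halved -> 1/4 the pixels
print(f"F-crop 1/4:      ~{full_gb / 16:.1f} GB")

So even though F-crop should shrink the raw footprint substantially, it doesn't help in our case.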

Here is the full output:

Traceback (most recent call last):
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 851, in _attempt_allocation
    return allocator()
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1054, in allocator
    return driver.cuMemAlloc(size)
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/eugene/cryosparc/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 59, in exec
    return self.process(item)
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 210, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 213, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 242, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 219, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/patchmotion.py", line 292, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/patchmotion.py", line 628, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 390, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/eugene/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py", line 270, in empty
    return device_array(shape, dtype, stream=stream)
  File "/home/eugene/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py", line 226, in device_array
    arr = GPUArray(shape=shape, strides=strides, dtype=dtype, stream=stream)
  File "/home/eugene/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py", line 21, in __init__
    super().__init__(shape, strides, dtype, stream, gpu_data)
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devicearray.py", line 103, in __init__
    gpu_data = devices.get_context().memalloc(self.alloc_size)
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1372, in memalloc
    return self.memory_manager.memalloc(bytesize)
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1056, in memalloc
    ptr = self._attempt_allocation(allocator)
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 863, in _attempt_allocation
    return allocator()
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1054, in allocator
    return driver.cuMemAlloc(size)
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/eugene/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY