Hi Team,
Firstly, I want to thank for everyone’s contributions to the development of this wonderful software.
We are currently using CryoSparc v4.5.3 and have encountered an issue during 2D Classification runs. The problem appears to be related to memory management and turned up recently. Previous classifications in the same project, with comparable particle amounts and box sizes, completed without any issues. However, we run into the following error:
Traceback (most recent call last):
File "/home/cryosparcuser/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2294, in run_with_except_hook
run_old(*args, **kw)
File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 134, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 135, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
File "cryosparc_master/cryosparc_compute/jobs/class2D/newrun.py", line 639, in cryosparc_master.cryosparc_compute.jobs.class2D.newrun.class2D_engine_run.work
File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 1383, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.compute_resid_pow
File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 374, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
return fn(*args, **kws)
File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/api.py", line 189, in pinned_array
buffer = current_context().memhostalloc(bytesize)
File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1378, in memhostalloc
return self.memory_manager.memhostalloc(bytesize, mapped, portable, wc)
File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 889, in memhostalloc
pointer = allocator()
File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 884, in allocator
return driver.cuMemHostAlloc(size, flags)
File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
return self._check_cuda_python_error(fname, libfn(*args))
File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE
We aren’t sure if this issue stems from the CUDA driver or if it is simply a case of our GPUs running out of memory. Is there any way of having a look at the estimated memory for certain jobs? We have tested this on two setups: one with 4x2080Ti (4 x 11GB) and another with 4x3090 (4 x 24GB).
Thank you very much for your help!
Cheers,
Philipp