2D Classification Error/failure

Hi all,

We’ve been running up against an error that, notably, only occurs during 2D classification, across projects. It started about a week ago for no apparent reason. We’re using v4.6.2. It most recently happened on a 2D Classification job with 446,000 particles extracted at 384 px → 256 px, 100 classes with default parameters (so not particularly demanding). We tried restarting the workstation and cryoSPARC, cleared the cache, and made sure there are no pending Linux or CUDA updates (rough commands below). Any advice would be much appreciated. Thanks!
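For reference, these are roughly the checks we ran; the exact commands and the cache path here are only examples and will differ between installs:

cryosparcm restart               # restart the cryoSPARC processes
nvidia-smi                       # confirm the driver and CUDA versions haven't changed
rm -rf /scratch/cryosparc_cache/*   # example SSD cache path only; clears the worker's particle cache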

Traceback (most recent call last):
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2304, in run_with_except_hook
    run_old(*args, **kw)
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 136, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 137, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
  File "cryosparc_master/cryosparc_compute/jobs/class2D/newrun.py", line 640, in cryosparc_master.cryosparc_compute.jobs.class2D.newrun.class2D_engine_run.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 1399, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.compute_resid_pow
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 382, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
    return fn(*args, **kws)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/api.py", line 189, in pinned_array
    buffer = current_context().memhostalloc(bytesize)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1378, in memhostalloc
    return self.memory_manager.memhostalloc(bytesize, mapped, portable, wc)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 889, in memhostalloc
    pointer = allocator()
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 884, in allocator
    return driver.cuMemHostAlloc(size, flags)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE

Just returning to say that adding export CRYOSPARC_NO_PAGELOCK=true to the worker's config.sh file, as I saw in a couple of other posts, seems to have solved the issue for now.
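For anyone who finds this thread later, the change is a single line added to the worker's config file; the install path below is from our traceback and will differ on other systems:

# appended to /home/cryosparcuser/cryosparc/cryosparc_worker/config.sh
export CRYOSPARC_NO_PAGELOCK=true    # tell cryoSPARC not to use page-locked (pinned) host memory

As far as I understand, the worker sources config.sh when a job launches, so newly queued jobs should pick the setting up without a full restart.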


Thanks @EMagn for this update. Out of curiosity, do you remember on which worker node you observed the error, and can you post the outputs of these commands on that worker node:

uname -a 
free -h