Ab initio job fails after some time in v4.5.1

Hi,
we have just updated to v4.5.1. My colleague tried some ab initio jobs; they run for some time but then fail with the error message pasted below. She ran them with 3 classes on a server with RTX 3090 GPUs.
Thanks for your help!

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 115, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/abinit/run.py", line 316, in cryosparc_master.cryosparc_compute.jobs.abinit.run.run_homo_abinit
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1194, in cryosparc_master.cryosparc_compute.engine.engine.process
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1195, in cryosparc_master.cryosparc_compute.engine.engine.process
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1134, in cryosparc_master.cryosparc_compute.engine.engine.process.work
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 348, in cryosparc_master.cryosparc_compute.engine.engine.EngineThread.compute_resid_pow
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 374, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
    return fn(*args, **kws)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/api.py", line 189, in pinned_array
    buffer = current_context().memhostalloc(bytesize)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1378, in memhostalloc
    return self.memory_manager.memhostalloc(bytesize, mapped, portable, wc)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 889, in memhostalloc
    pointer = allocator()
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 884, in allocator
    return driver.cuMemHostAlloc(size, flags)
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/cryosparcuser/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE

Please can you

  • email us the failed job’s job report. I will send you a private message with the email address.
  • try and let us know if this helps?