cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE

Hello, community.

I have been using my workstation for the last 4 years. It is equipped with RTX 3090 GPUs and currently runs CryoSPARC v4.6.2, with the NVIDIA 535 driver (CUDA 12.2) and CUDA Toolkit 12.0 installed. Both nvidia-smi and nvcc --version produce normal output. I cloned an NU refinement job and modified the input parameters, setting "Symmetry Relaxation" to "Maximization", "Number of Extra Final Passes" to 20, and "Initial Lowpass Resolution" to 15 Å, in an effort to break the pseudosymmetry. The previous NU refinement (box size 360, 0.5 million particles) on the same workstation produced a 2.7 Å consensus map. However, the cloned NU refinement job has started crashing unexpectedly: the process fails when it reaches about 90k particles and a 4 Å GSFSC, with the following error:

Traceback (most recent call last):
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2304, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2730, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2911, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 714, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.extract_data
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 382, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
    return fn(*args, **kws)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/api.py", line 189, in pinned_array
    buffer = current_context().memhostalloc(bytesize)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1378, in memhostalloc
    return self.memory_manager.memhostalloc(bytesize, mapped, portable, wc)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 889, in memhostalloc
    pointer = allocator()
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 884, in allocator
    return driver.cuMemHostAlloc(size, flags)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE

I would appreciate suggestions from the developers, as I am hesitant to reinstall CryoSPARC for fear of losing database connections to existing jobs.

Thank you.

The cuMemHostAlloc failure indicates a problem related to system (as opposed to GPU) RAM.
How much RAM does the workstation have?
What are the outputs of the following command, run once for the earlier, successful NU refinement and once for the failed one (please replace P99 and J199 with the applicable IDs)?

cryosparcm cli "get_job('P99', 'J199', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run', 'input_slot_groups')"

Here it is. J295 is the successful job and J446 is the one that crashes every time. The workstation has 96 GB of RAM.

uddipan@smbl1:~/Desktop$ cryosparcm cli "get_job('P1', 'J295', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run', 'input_slot_groups')"
{'_id': '674d4e33ee8365b6cc0fb428', 'errors_run': [], 'input_slot_groups': [{'connections': [{'group_name': 'particles_0', 'job_uid': 'J293', 'slots': [{'group_name': 'particles_0', 'job_uid': 'J293', 'result_name': 'blob', 'result_type': 'particle.blob', 'slot_name': 'blob', 'version': 'F'}, {'group_name': 'particles_0', 'job_uid': 'J293', 'result_name': 'ctf', 'result_type': 'particle.ctf', 'slot_name': 'ctf', 'version': 'F'}, {'group_name': 'particles_0', 'job_uid': 'J293', 'result_name': 'alignments3D', 'result_type': 'particle.alignments3D', 'slot_name': 'alignments3D', 'version': 'F'}, {'group_name': 'particles_0', 'job_uid': 'J293', 'result_name': 'motion', 'result_type': 'particle.motion', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles_0', 'job_uid': 'J293', 'result_name': 'location', 'result_type': 'particle.location', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles_0', 'job_uid': 'J293', 'result_name': 'pick_stats', 'result_type': 'particle.pick_stats', 'slot_name': None, 'version': 'F'}]}], 'count_max': inf, 'count_min': 1, 'description': 'Particle stacks to use. Multiple stacks will be concatenated.', 'name': 'particles', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'blob', 'optional': False, 'title': 'Particle data blobs', 'type': 'particle.blob'}, {'description': '', 'name': 'ctf', 'optional': False, 'title': 'Particle ctf parameters', 'type': 'particle.ctf'}, {'description': '', 'name': 'alignments3D', 'optional': True, 'title': 'Particle 3D alignments (optional)', 'type': 'particle.alignments3D'}], 'title': 'Particle stacks', 'type': 'particle'}, {'connections': [{'group_name': 'volume', 'job_uid': 'J292', 'slots': [{'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'map', 'result_type': 'volume.blob', 'slot_name': 'map', 'version': 'F'}, {'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'map_sharp', 'result_type': 'volume.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'map_half_A', 'result_type': 'volume.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'map_half_B', 'result_type': 'volume.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'mask_refine', 'result_type': 'volume.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'mask_fsc', 'result_type': 'volume.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'mask_fsc_auto', 'result_type': 'volume.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'volume', 'job_uid': 'J292', 'result_name': 'precision', 'result_type': 'volume.blob', 'slot_name': None, 'version': 'F'}]}], 'count_max': 1, 'count_min': 1, 'description': '', 'name': 'volume', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'map', 'optional': False, 'title': 'Initial volume raw data', 'type': 'volume.blob'}], 'title': 'Initial volume', 'type': 'volume'}, {'connections': [], 'count_max': 1, 'count_min': 0, 'description': '', 'name': 'mask', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'mask', 'optional': False, 'title': 'Static mask', 'type': 'volume.blob'}], 'title': 'Static mask', 'type': 'mask'}], 'instance_information': {'CUDA_version': '11.8', 'available_memory': '90.15GB', 'cpu_model': 'AMD Ryzen 9 5950X 16-Core Processor', 'driver_version': '12.4', 'gpu_info': [{'id': 0, 'mem': 25430786048, 
'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:04:00'}, {'id': 1, 'mem': 25429671936, 'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:0b:00'}], 'ofd_hard_limit': 1048576, 'ofd_soft_limit': 1024, 'physical_cores': 16, 'platform_architecture': 'x86_64', 'platform_node': 'smbl1', 'platform_release': '6.8.0-49-generic', 'platform_version': '#49-Ubuntu SMP PREEMPT_DYNAMIC Mon Nov  4 02:06:24 UTC 2024', 'total_memory': '94.20GB', 'used_memory': '3.06GB'}, 'job_type': 'nonuniform_refine_new', 'params_spec': {'refine_ctf_global_refine': {'value': True}, 'refine_defocus_refine': {'value': True}, 'refine_symmetry': {'value': 'D2'}}, 'project_uid': 'P1', 'status': 'completed', 'uid': 'J295', 'version': 'v4.6.0'}

uddipan@smbl1:~/Desktop$ cryosparcm cli "get_job('P1', 'J446', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run', 'input_slot_groups')"
{'_id': '67b800c1cad9b55803e16b41', 'errors_run': [], 'input_slot_groups': [{'connections': [{'group_name': 'particles_class_0', 'job_uid': 'J444', 'slots': [{'group_name': 'particles_class_0', 'job_uid': 'J444', 'result_name': 'blob', 'result_type': 'particle.blob', 'slot_name': 'blob', 'version': 'F'}, {'group_name': 'particles_class_0', 'job_uid': 'J444', 'result_name': 'ctf', 'result_type': 'particle.ctf', 'slot_name': 'ctf', 'version': 'F'}, {'group_name': 'particles_class_0', 'job_uid': 'J444', 'result_name': 'alignments3D', 'result_type': 'particle.alignments3D', 'slot_name': 'alignments3D', 'version': 'F'}, {'group_name': 'particles_class_0', 'job_uid': 'J444', 'result_name': 'location', 'result_type': 'particle.location', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles_class_0', 'job_uid': 'J444', 'result_name': 'pick_stats', 'result_type': 'particle.pick_stats', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles_class_0', 'job_uid': 'J444', 'result_name': 'motion', 'result_type': 'particle.motion', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles_class_0', 'job_uid': 'J444', 'result_name': 'sym_expand', 'result_type': 'particle.sym_expand', 'slot_name': None, 'version': 'F'}]}], 'count_max': inf, 'count_min': 1, 'description': 'Particle stacks to use. Multiple stacks will be concatenated.', 'name': 'particles', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'blob', 'optional': False, 'title': 'Particle data blobs', 'type': 'particle.blob'}, {'description': '', 'name': 'ctf', 'optional': False, 'title': 'Particle ctf parameters', 'type': 'particle.ctf'}, {'description': '', 'name': 'alignments3D', 'optional': False, 'title': 'Particle 3D alignments', 'type': 'particle.alignments3D'}], 'title': 'Particle stacks', 'type': 'particle'}, {'connections': [{'group_name': 'volume_class_0', 'job_uid': 'J444', 'slots': [{'group_name': 'volume_class_0', 'job_uid': 'J444', 'result_name': 'map', 'result_type': 'volume.blob', 'slot_name': 'map', 'version': 'F'}]}], 'count_max': 1, 'count_min': 1, 'description': '', 'name': 'volume', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'map', 'optional': False, 'title': 'Initial volume raw data', 'type': 'volume.blob'}], 'title': 'Initial volume', 'type': 'volume'}, {'connections': [{'group_name': 'mask_focus', 'job_uid': 'J444', 'slots': [{'group_name': 'mask_focus', 'job_uid': 'J444', 'result_name': 'mask', 'result_type': 'volume.blob', 'slot_name': 'mask', 'version': 'F'}]}], 'count_max': 1, 'count_min': 1, 'description': '', 'name': 'mask', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'mask', 'optional': False, 'title': 'Static mask', 'type': 'volume.blob'}], 'title': 'Static mask', 'type': 'mask'}], 'instance_information': {'CUDA_version': '11.8', 'available_memory': '89.69GB', 'cpu_model': 'AMD Ryzen 9 5950X 16-Core Processor', 'driver_version': '12.4', 'gpu_info': [{'id': 0, 'mem': 25430786048, 'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:04:00'}, {'id': 1, 'mem': 25429671936, 'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:0b:00'}], 'ofd_hard_limit': 1048576, 'ofd_soft_limit': 1024, 'physical_cores': 16, 'platform_architecture': 'x86_64', 'platform_node': 'smbl1', 'platform_release': '6.8.0-52-generic', 'platform_version': '#53-Ubuntu SMP PREEMPT_DYNAMIC Sat Jan 11 00:06:25 UTC 2025', 'total_memory': '94.20GB', 'used_memory': '3.54GB'}, 'job_type': 'new_local_refine', 'params_spec': {'reinitialize_rs': {'value': True}, 'reinitialize_ss': {'value': 
True}, 'use_alignment_prior': {'value': True}}, 'project_uid': 'P1', 'status': 'completed', 'uid': 'J446', 'version': 'v4.6.2'}

@wtempel, kindly suggest whether I need to reinstall CryoSPARC. If so, how do I get all my finished projects and workspaces back in place after the reinstall?

There may have been a mix-up: J446 is not an NU refinement job. Could you double-check the job ID and post the output for the failed NU refinement job?

The evidence so far does not suggest to me that the problem could be corrected by a reinstallation of CryoSPARC.

@wtempel, whether it is NU refinement or local refinement, every job fails after a few minutes with this error. Everything was fine before; the only hardware change was a PSU swap on the machine.

Please can you post for one such failed job:

csprojectid=P1 # ensure correct project ID
csjobid=J199 # replace with id of the failed job in the first post
cryosparcm joblog $csprojectid $csjobid | tail -n 40
cryosparcm eventlog $csprojectid $csjobid | tail -n 60
cryosparcm cli "get_job('$csprojectid', '$csjobid', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run')"

Is the particle box size for the failing job also 360, as mentioned in your first post?

@wtempel here it is…

uddipan@smbl1:~$ cryosparcm joblog P1 J473 | tail -n 40
type R2C
wkspc automatic
Python traceback:

:1: UserWarning: Cannot manually free CUDA array; will be freed when garbage collected
========= sending heartbeat at 2025-03-12 13:42:03.931654
========= sending heartbeat at 2025-03-12 13:42:13.944716
========= sending heartbeat at 2025-03-12 13:42:23.958582
========= sending heartbeat at 2025-03-12 13:42:33.975716
========= sending heartbeat at 2025-03-12 13:42:43.988757
========= sending heartbeat at 2025-03-12 13:42:54.009172
**custom thread exception hook caught something
**** handle exception rc
Traceback (most recent call last):
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2304, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2730, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2913, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 763, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.compute_ctf
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 382, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
    return fn(*args, **kws)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/api.py", line 189, in pinned_array
    buffer = current_context().memhostalloc(bytesize)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1378, in memhostalloc
    return self.memory_manager.memhostalloc(bytesize, mapped, portable, wc)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 889, in memhostalloc
    pointer = allocator()
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 884, in allocator
    return driver.cuMemHostAlloc(size, flags)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE
set status to failed
========= main process now complete at 2025-03-12 13:43:04.022623.
========= monitor process now complete at 2025-03-12 13:43:04.024466.

uddipan@smbl1:~$ cryosparcm eventlog P1 J473 | tail -n 60
[Wed, 12 Mar 2025 08:08:35 GMT] [CPU RAM used: 16198 MB] Using full box size 360, downsampled box size 180, with low memory mode disabled.
[Wed, 12 Mar 2025 08:08:35 GMT] [CPU RAM used: 16198 MB] Computing FFTs on GPU.
[Wed, 12 Mar 2025 08:08:37 GMT] [CPU RAM used: 16198 MB] Done in 1.925s
[Wed, 12 Mar 2025 08:08:37 GMT] [CPU RAM used: 16198 MB] Computing cFSCs…
[Wed, 12 Mar 2025 08:08:39 GMT] [CPU RAM used: 16197 MB] Done in 1.948s
[Wed, 12 Mar 2025 08:08:39 GMT] [CPU RAM used: 16197 MB] Using Filter Radius 71.129 (4.353A) | Previous: 69.854 (4.432A)
[Wed, 12 Mar 2025 08:08:44 GMT] [CPU RAM used: 15123 MB] Non-uniform regularization with compute option: GPU
[Wed, 12 Mar 2025 08:08:44 GMT] [CPU RAM used: 15123 MB] Running local cross validation for A …
[Wed, 12 Mar 2025 08:09:21 GMT] [CPU RAM used: 15488 MB] Local cross validation A done in 36.678s
[Wed, 12 Mar 2025 08:09:22 GMT] FSC Filtered Side A
[Wed, 12 Mar 2025 08:09:22 GMT] CV Filtered Side A
[Wed, 12 Mar 2025 08:09:22 GMT] [CPU RAM used: 15488 MB] Running local cross validation for B …
[Wed, 12 Mar 2025 08:09:59 GMT] [CPU RAM used: 15844 MB] Local cross validation B done in 36.606s
[Wed, 12 Mar 2025 08:09:59 GMT] FSC Filtered Side B
[Wed, 12 Mar 2025 08:10:00 GMT] CV Filtered Side B
[Wed, 12 Mar 2025 08:10:08 GMT] [CPU RAM used: 15828 MB] Estimated Bfactor: -35.8
[Wed, 12 Mar 2025 08:10:08 GMT] [CPU RAM used: 15828 MB] Plotting…
[Wed, 12 Mar 2025 08:10:14 GMT] Real Space Slices Iteration 002
[Wed, 12 Mar 2025 08:10:15 GMT] Fourier Space Slices Iteration 002
[Wed, 12 Mar 2025 08:10:16 GMT] Real Space Mask Slices Iteration 002
[Wed, 12 Mar 2025 08:10:17 GMT] FSC Iteration 002
[Wed, 12 Mar 2025 08:10:17 GMT] cFSCs (Half-angle: 20°) Iteration 002, with tight mask
[Wed, 12 Mar 2025 08:10:22 GMT] Guinier Plot Iteration 002
[Wed, 12 Mar 2025 08:10:22 GMT] Noise Model Iteration 002
[Wed, 12 Mar 2025 08:10:24 GMT] Viewing Direction Distribution Iteration 002
[Wed, 12 Mar 2025 08:10:25 GMT] Posterior Precision Directional Distribution Iteration 002
[Wed, 12 Mar 2025 08:10:25 GMT] Pose change after symmetry relaxation, iteration 002
[Wed, 12 Mar 2025 08:10:25 GMT] [CPU RAM used: 15876 MB] Done in 16.320s.
[Wed, 12 Mar 2025 08:10:25 GMT] [CPU RAM used: 15876 MB] Outputting files…
[Wed, 12 Mar 2025 08:10:33 GMT] [CPU RAM used: 15757 MB] Done in 7.857s.
[Wed, 12 Mar 2025 08:10:33 GMT] [CPU RAM used: 15757 MB] Done iteration 2 in 174.540s. Total time so far 545.456s
[Wed, 12 Mar 2025 08:10:33 GMT] [CPU RAM used: 15757 MB] ----------------------------- Start Iteration 3
[Wed, 12 Mar 2025 08:10:33 GMT] [CPU RAM used: 15757 MB] Using Max Alignment Radius 30.960 (10.000A)
[Wed, 12 Mar 2025 08:10:33 GMT] [CPU RAM used: 15757 MB] Using Full Dataset (split 267914 in A, 268489 in B)
[Wed, 12 Mar 2025 08:10:33 GMT] [CPU RAM used: 15757 MB] Using dynamic mask.
[Wed, 12 Mar 2025 08:10:57 GMT] [CPU RAM used: 15757 MB] – THR 0 BATCH 500 NUM 500 TOTAL 1.7740104 ELAPSED 120.10687 –
[Wed, 12 Mar 2025 08:12:57 GMT] [CPU RAM used: 17908 MB] Traceback (most recent call last):
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2304, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2730, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2913, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 763, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.compute_ctf
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 382, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
    return fn(*args, **kws)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/api.py", line 189, in pinned_array
    buffer = current_context().memhostalloc(bytesize)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1378, in memhostalloc
    return self.memory_manager.memhostalloc(bytesize, mapped, portable, wc)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 889, in memhostalloc
    pointer = allocator()
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 884, in allocator
    return driver.cuMemHostAlloc(size, flags)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE

uddipan@smbl1:~$ cryosparcm cli "get_job('P1', 'J473', 'job_type', 'version', 'instance_information', 'status', 'params_spec', 'errors_run')"
{'_id': '67cc024192ecfc9e2ecf1243', 'errors_run': [{'message': '[CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE', 'warning': False}], 'instance_information': {'CUDA_version': '11.8', 'available_memory': '121.30GB', 'cpu_model': 'AMD Ryzen 9 5950X 16-Core Processor', 'driver_version': '12.2', 'gpu_info': [{'id': 0, 'mem': 25438126080, 'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:04:00'}, {'id': 1, 'mem': 25437011968, 'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:0b:00'}], 'ofd_hard_limit': 1048576, 'ofd_soft_limit': 1024, 'physical_cores': 16, 'platform_architecture': 'x86_64', 'platform_node': 'smbl1', 'platform_release': '6.11.0-19-generic', 'platform_version': '#19~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Feb 17 11:51:52 UTC 2', 'total_memory': '125.70GB', 'used_memory': '3.23GB'}, 'job_type': 'nonuniform_refine_new', 'params_spec': {'refine_num_final_iterations': {'value': 10}, 'refine_relax_symmetry': {'value': 'maximization'}, 'refine_res_align_max': {'value': 10}, 'refine_res_init': {'value': 15}, 'refine_symmetry': {'value': 'D2'}}, 'project_uid': 'P1', 'status': 'failed', 'uid': 'J473', 'version': 'v4.6.2'}

Even DeepEMhancer, which used to run from within the CryoSPARC interface, is no longer running. I suspect there is some fatal problem in the installed CryoSPARC. If I delete cryosparc_worker and cryosparc_master, keep just the database folder, and install a fresh cryosparc_master and cryosparc_worker in the same location, will I get all my projects and workspaces back in place as before?

Yes the particle box size is 360.

I am not convinced. Have there been any manual changes to the cryosparc_worker/ installation? Are you in a position to test whether a reboot of smbl1 resolves the issue?

@wtempel, I installed the latest NVIDIA 570 driver and CUDA Toolkit 12 today and rebooted multiple times (not an issue, as the machine is used only by our lab). But nothing works as of now.

Please can you post the outputs of these commands:

uname -a 
free -h
nvidia-smi
/home/uddipan/apps/cryosparc/cryosparc_worker/bin/cryosparcw call env | grep PATH

@wtempel Here it is…

uddipan@smbl1:~/Desktop$ uname -a
free -h
nvidia-smi
/home/uddipan/apps/cryosparc/cryosparc_worker/bin/cryosparcw call env | grep PATH
Linux smbl1 6.11.0-19-generic #19~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Feb 17 11:51:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
               total        used        free      shared  buff/cache   available
Mem:           125Gi       3.2Gi       120Gi        17Mi       3.3Gi       122Gi
Swap:          8.0Gi          0B       8.0Gi
Thu Mar 13 09:30:25 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.06             Driver Version: 570.124.06     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:04:00.0 Off |                  N/A |
|  0%   30C    P8             17W /  350W |      15MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off |   00000000:0B:00.0  On |                  N/A |
|  0%   30C    P8             14W /  350W |     424MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            4480      G   /usr/lib/xorg/Xorg                        4MiB |
|    1   N/A  N/A            4480      G   /usr/lib/xorg/Xorg                      113MiB |
|    1   N/A  N/A            5169      G   /usr/bin/gnome-shell                     87MiB |
|    1   N/A  N/A            6832      G   …/5889/usr/lib/firefox/firefox          168MiB |
+-----------------------------------------------------------------------------------------+
CRYOSPARC_PATH=/home/uddipan/apps/cryosparc/cryosparc_worker/bin
WINDOWPATH=2
PYTHONPATH=/home/uddipan/apps/cryosparc/cryosparc_worker
NUMBA_CUDA_INCLUDE_PATH=/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/include
LIBTBX_OPATH=
LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:
PATH=/home/uddipan/apps/cryosparc/cryosparc_worker/bin:/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/condabin:/home/uddipan/apps/ccpem/bin:/opt/xtal/arp_warp_8.0/bin/bin-x86_64-Linux:/opt/xtal/ccp4-9/etc:/opt/xtal/ccp4-9/bin:/usr/lib/nvidia-cuda-toolkit/bin:/home/uddipan/Downloads/phenix/=/home/uddipan/apps/phenix/phenix-1.21.2-5419/build/bin:/home/uddipan/apps/miniconda3/condabin:/home/uddipan/apps/cryosparc/cryosparc_master/bin:/home/uddipan/.local/bin:/usr/local/IMOD/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/IMOD/pythonLink:/home/uddipan/apps/pymol:/home/uddipan/apps/ccpem

Thank you, @Das.
Does adding the line

unset LD_LIBRARY_PATH

to the file

/home/uddipan/apps/cryosparc/cryosparc_worker/config.sh

make a difference?
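
Separately, it may be worth confirming that numba initializes CUDA cleanly inside the worker environment. A minimal check (again assuming the worker's python can be launched via cryosparcw call, as env was above):

# Sanity check of numba's CUDA driver binding inside the worker env.
from numba import cuda

print(cuda.is_available())  # True if the driver library loaded and a usable GPU was found
cuda.detect()               # prints each detected device and its compute capability

Since your jobs run for several minutes before failing, this only rules out initialization problems; a failure here would point at the driver installation rather than at CryoSPARC.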

@wtempel, no, it did not help.
I have done a fresh install and was able to rescue the database correctly; I could see all my projects in place. I did not install any CUDA toolkit this time, as CryoSPARC bundles its own CUDA, and the machine is currently running only the NVIDIA 550.120 driver.
The process is failing again with the following error:

Traceback (most recent call last):
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2304, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2730, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2911, in cryosparc_master.cryosparc_compute.engine.newengine.process.work
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 714, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.extract_data
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 382, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
    return fn(*args, **kws)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/api.py", line 189, in pinned_array
    buffer = current_context().memhostalloc(bytesize)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1378, in memhostalloc
    return self.memory_manager.memhostalloc(bytesize, mapped, portable, wc)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 889, in memhostalloc
    pointer = allocator()
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 884, in allocator
    return driver.cuMemHostAlloc(size, flags)
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/uddipan/apps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_INVALID_VALUE] Call to cuMemHostAlloc results in CUDA_ERROR_INVALID_VALUE

Finally solved this by adding the line

export CRYOSPARC_NO_PAGELOCK=true

to the config.sh file in the worker folder.
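
For anyone finding this later: my understanding (an inference from the behaviour, not an official description of the setting) is that CRYOSPARC_NO_PAGELOCK makes the worker allocate ordinary pageable host buffers instead of pinned (page-locked) memory, so the failing cuMemHostAlloc call is never issued; host-to-device copies from pageable memory can be slower, but the jobs now complete. The difference, sketched with numba (a hypothetical illustration, not CryoSPARC's actual code):

import numpy as np
from numba import cuda

n = 360**3
pageable = np.ones(n, dtype=np.float32)  # plain pageable allocation; no CUDA driver call involved
d_arr = cuda.to_device(pageable)         # still works; the copy just cannot use the faster pinned path
# pinned = cuda.pinned_array(n, dtype=np.float32)  # the page-locked allocation that kept failing here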