Child process with PID terminated unexpectedly on v4.5

Hi, I got the following error while running Patch Motion Correction on a freshly updated CryoSPARC v4.5:

Child process with PID 8048 terminated unexpectedly with exit code 1.

I then tried to update CUDA to 11.8 with
$ /home/impmc/cryosparc_worker/bin/cryosparcw newcuda /home/impmc/cuda-11.8.0
by following this thread:

However, the system returned an error:

Unknown cryosparcw command newcuda

It seems that CryoSPARC v4.4 and later already bundle CUDA 11.8.
How can I fix this problem? Thank you.

My Event Log and Metadata Log are as follows.

The Event Log:

License is valid.
Launching job on lane default target syrah.qb3.berkeley.edu ...
Running job on master node hostname syrah.qb3.berkeley.edu
[CPU:   92.5 MB  Avail: 253.61 GB] Job J3 Started
[CPU:   92.5 MB  Avail: 253.61 GB] Master running v4.5.0, worker running v4.5.0
[CPU:   92.8 MB  Avail: 253.61 GB] Working in directory: /mount/local3/minghao/C1-NRBF2-dimer_processing/CS-c1-nrbf2-dimer/J3
[CPU:   92.8 MB  Avail: 253.61 GB] Running on lane default
[CPU:   92.8 MB  Avail: 253.61 GB] Resources allocated: 
[CPU:   92.8 MB  Avail: 253.61 GB]   Worker:  syrah.qb3.berkeley.edu
[CPU:   92.8 MB  Avail: 253.61 GB]   CPU   :  [2, 3, 4, 5, 6, 7]
[CPU:   92.8 MB  Avail: 253.61 GB]   GPU   :  [1]
[CPU:   92.8 MB  Avail: 253.61 GB]   RAM   :  [1, 2]
[CPU:   92.8 MB  Avail: 253.61 GB]   SSD   :  False
[CPU:   92.8 MB  Avail: 253.61 GB] --------------------------------------------------------------
[CPU:   92.8 MB  Avail: 253.61 GB] Importing job module for job type patch_motion_correction_multi...
[CPU:  248.7 MB  Avail: 253.41 GB] Job ready to run
[CPU:  248.7 MB  Avail: 253.41 GB] ***************************************************************
[CPU:  248.8 MB  Avail: 253.41 GB] Job will process this many movies:  10
[CPU:  248.8 MB  Avail: 253.41 GB] Job will output denoiser training data for this many movies:  10
[CPU:  248.8 MB  Avail: 253.41 GB] Random seed: 1389188080
[CPU:  248.8 MB  Avail: 253.41 GB] parent process is 8006
[CPU:  163.8 MB  Avail: 253.40 GB] Calling CUDA init from 8048
[CPU:  249.3 MB  Avail: 253.42 GB] Child process with PID 8048 terminated unexpectedly with exit code 1.
[CPU:  249.3 MB  Avail: 253.42 GB] ['uid', 'movie_blob/path', 'movie_blob/shape', 'movie_blob/psize_A', 'movie_blob/is_gain_corrected', 'movie_blob/format', 'movie_blob/has_defect_file', 'movie_blob/import_sig', 'micrograph_blob/path', 'micrograph_blob/idx', 'micrograph_blob/shape', 'micrograph_blob/psize_A', 'micrograph_blob/format', 'micrograph_blob/is_background_subtracted', 'micrograph_blob/vmin', 'micrograph_blob/vmax', 'micrograph_blob/import_sig', 'micrograph_blob_non_dw/path', 'micrograph_blob_non_dw/idx', 'micrograph_blob_non_dw/shape', 'micrograph_blob_non_dw/psize_A', 'micrograph_blob_non_dw/format', 'micrograph_blob_non_dw/is_background_subtracted', 'micrograph_blob_non_dw/vmin', 'micrograph_blob_non_dw/vmax', 'micrograph_blob_non_dw/import_sig', 'micrograph_blob_non_dw_AB/path', 'micrograph_blob_non_dw_AB/idx', 'micrograph_blob_non_dw_AB/shape', 'micrograph_blob_non_dw_AB/psize_A', 'micrograph_blob_non_dw_AB/format', 'micrograph_blob_non_dw_AB/is_background_subtracted', 'micrograph_blob_non_dw_AB/vmin', 'micrograph_blob_non_dw_AB/vmax', 'micrograph_blob_non_dw_AB/import_sig', 'micrograph_thumbnail_blob_1x/path', 'micrograph_thumbnail_blob_1x/idx', 'micrograph_thumbnail_blob_1x/shape', 'micrograph_thumbnail_blob_1x/format', 'micrograph_thumbnail_blob_1x/binfactor', 'micrograph_thumbnail_blob_1x/micrograph_path', 'micrograph_thumbnail_blob_1x/vmin', 'micrograph_thumbnail_blob_1x/vmax', 'micrograph_thumbnail_blob_2x/path', 'micrograph_thumbnail_blob_2x/idx', 'micrograph_thumbnail_blob_2x/shape', 'micrograph_thumbnail_blob_2x/format', 'micrograph_thumbnail_blob_2x/binfactor', 'micrograph_thumbnail_blob_2x/micrograph_path', 'micrograph_thumbnail_blob_2x/vmin', 'micrograph_thumbnail_blob_2x/vmax', 'background_blob/path', 'background_blob/idx', 'background_blob/binfactor', 'background_blob/shape', 'background_blob/psize_A', 'rigid_motion/type', 'rigid_motion/path', 'rigid_motion/idx', 'rigid_motion/frame_start', 'rigid_motion/frame_end', 
'rigid_motion/zero_shift_frame', 'rigid_motion/psize_A', 'spline_motion/type', 'spline_motion/path', 'spline_motion/idx', 'spline_motion/frame_start', 'spline_motion/frame_end', 'spline_motion/zero_shift_frame', 'spline_motion/psize_A']
[CPU:  249.4 MB  Avail: 253.41 GB] --------------------------------------------------------------
[CPU:  249.4 MB  Avail: 253.41 GB] Compiling job outputs...
[CPU:  249.4 MB  Avail: 253.41 GB] Passing through outputs for output group micrographs from input group movies
[CPU:  249.4 MB  Avail: 253.41 GB] This job outputted results ['micrograph_blob_non_dw', 'micrograph_blob_non_dw_AB', 'micrograph_thumbnail_blob_1x', 'micrograph_thumbnail_blob_2x', 'movie_blob', 'micrograph_blob', 'background_blob', 'rigid_motion', 'spline_motion']
[CPU:  249.4 MB  Avail: 253.41 GB]   Loaded output dset with 0 items
[CPU:  249.4 MB  Avail: 253.41 GB] Passthrough results ['gain_ref_blob', 'mscope_params']
[CPU:  249.4 MB  Avail: 253.41 GB]   Loaded passthrough dset with 10 items
[CPU:  249.4 MB  Avail: 253.41 GB]   Intersection of output and passthrough has 0 items
[CPU:  249.4 MB  Avail: 253.41 GB]   Output dataset contains:  ['gain_ref_blob', 'mscope_params']
[CPU:  249.4 MB  Avail: 253.41 GB]   Outputting passthrough result gain_ref_blob
[CPU:  249.4 MB  Avail: 253.41 GB]   Outputting passthrough result mscope_params
[CPU:  249.4 MB  Avail: 253.41 GB] Passing through outputs for output group micrographs_incomplete from input group movies
[CPU:  249.4 MB  Avail: 253.41 GB] This job outputted results ['micrograph_blob']
[CPU:  249.4 MB  Avail: 253.41 GB]   Loaded output dset with 10 items
[CPU:  249.4 MB  Avail: 253.41 GB] Passthrough results ['movie_blob', 'gain_ref_blob', 'mscope_params']
[CPU:  249.4 MB  Avail: 253.41 GB]   Loaded passthrough dset with 10 items
[CPU:  249.4 MB  Avail: 253.41 GB]   Intersection of output and passthrough has 10 items
[CPU:  249.4 MB  Avail: 253.41 GB]   Output dataset contains:  ['gain_ref_blob', 'movie_blob', 'mscope_params']
[CPU:  249.4 MB  Avail: 253.41 GB]   Outputting passthrough result movie_blob
[CPU:  249.4 MB  Avail: 253.41 GB]   Outputting passthrough result gain_ref_blob
[CPU:  249.4 MB  Avail: 253.41 GB]   Outputting passthrough result mscope_params
[CPU:  249.4 MB  Avail: 253.41 GB] Checking outputs for output group micrographs
[CPU:  249.4 MB  Avail: 253.41 GB] Checking outputs for output group micrographs_incomplete
[CPU:  249.7 MB  Avail: 253.41 GB] Updating job size...
[CPU:  249.7 MB  Avail: 253.41 GB] Exporting job and creating csg files...
[CPU:  249.7 MB  Avail: 253.41 GB] ***************************************************************
[CPU:  249.7 MB  Avail: 253.41 GB] Job complete. Total time 30.78s

The Metadata Log:

================= CRYOSPARCW =======  2024-05-07 16:23:43.531641  =========
Project P83 Job J3
Master syrah.qb3.berkeley.edu Port 39002
===========================================================================
MAIN PROCESS PID 8006
========= now starting main process at 2024-05-07 16:23:43.532445
motioncorrection.run_patch cryosparc_compute.jobs.jobregister
MONITOR PROCESS PID 8008
========= monitor process now waiting for main process
========= sending heartbeat at 2024-05-07 16:23:45.432702
/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numpy/core/getlimits.py:499: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numpy/core/getlimits.py:499: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
***************************************************************
Running job on hostname %s syrah.qb3.berkeley.edu
Allocated Resources :  {'fixed': {'SSD': False}, 'hostname': 'syrah.qb3.berkeley.edu', 'lane': 'default', 'lane_type': 'node', 'license': True, 'licenses_acquired': 1, 'slots': {'CPU': [2, 3, 4, 5, 6, 7], 'GPU': [1], 'RAM': [1, 2]}, 'target': {'cache_path': '/mount/ssd/cryosparc2_cache', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11554717696, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11554717696, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11554717696, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11554717696, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'syrah.qb3.berkeley.edu', 'lane': 'default', 'name': 'syrah.qb3.berkeley.edu', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'cryosparc2', 'title': 'Worker node syrah.qb3.berkeley.edu', 'type': 'node', 'worker_bin_path': '/home/cryosparc2/cryosparc2/cryosparc2_worker/bin/cryosparcw'}}
Process Process-1:
Traceback (most recent call last):
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/cryosparc_compute/jobs/pipeline.py", line 199, in process_work_simple
    process_setup(proc_idx) # do any setup you want on a per-process basis
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 115, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.process_setup
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 47, in cryosparc_master.cryosparc_compute.gpu.gpucore.initialize
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 3216, in get_version
    return driver.get_version()
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 461, in get_version
    version = driver.cuDriverGetVersion()
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 292, in __getattr__
    self.ensure_initialized()
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 254, in ensure_initialized
    self.cuInit(0)
  File "/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "cuda/cuda.pyx", line 11325, in cuda.cuda.cuInit
  File "cuda/ccuda.pyx", line 17, in cuda.ccuda.cuInit
  File "cuda/_cuda/ccuda.pyx", line 2353, in cuda._cuda.ccuda._cuInit
RuntimeError: Function "cuInit" not found
========= sending heartbeat at 2024-05-07 16:23:55.449315
========= sending heartbeat at 2024-05-07 16:24:05.469318
========= sending heartbeat at 2024-05-07 16:24:15.493143
/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numpy/core/fromnumeric.py:3474: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
***************************************************************
========= main process now complete at 2024-05-07 16:24:24.817011
========= sending heartbeat at 2024-05-07 16:24:25.516081
  ========= heartbeat failed at 2024-05-07 16:24:25.529758: 
========= main process now complete at 2024-05-07 16:24:35.538837.
========= monitor process now complete at 2024-05-07 16:24:35.547084.

Thanks @Minghao for this report.
Could you please run the following commands on the computer syrah.qb3.berkeley.edu and post their outputs here:

nvidia-smi
/home/cryosparc2/cryosparc2/cryosparc2_worker/bin/cryosparcw call numba -s
/home/cryosparc2/cryosparc2/cryosparc2_worker/bin/cryosparcw call python -c "from numba import cuda; cuda.cudadrv.libs.test()"
LD_DEBUG=libs /home/cryosparc2/cryosparc2/cryosparc2_worker/bin/cryosparcw call python -c "from cuda import cuda; cuda.cuInit(0)" 2> /tmp/libsdebug-20240508.txt

and email us the file libsdebug-20240508.txt, which you should find in the /tmp/ directory of syrah.qb3.berkeley.edu after running the last command.
I will send you a private message with the email address.
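
For context on what the last command is probing: as far as I understand, a `Function "cuInit" not found` error from the cuda-python bindings usually means the NVIDIA driver library (`libcuda.so.1`) could not be loaded, or was loaded without exporting that symbol. The sketch below is not part of CryoSPARC. It is a hypothetical stand-alone check that asks the dynamic loader for the library by name through `ctypes`; the function name `probe_driver_library` and the candidate library names are my own assumptions.

```python
import ctypes


def probe_driver_library(names=("libcuda.so.1", "libcuda.so")):
    """Try to load each candidate library and report whether it exports cuInit.

    The candidate names are assumptions about how the NVIDIA driver library
    is usually named on Linux; adjust them for your system.
    """
    results = []
    for name in names:
        try:
            # Uses the normal loader search path (LD_LIBRARY_PATH, ldconfig cache).
            lib = ctypes.CDLL(name)
        except OSError as exc:
            results.append((name, False, f"load failed: {exc}"))
            continue
        # Attribute lookup on a CDLL resolves the symbol; hasattr is False if absent.
        found = hasattr(lib, "cuInit")
        detail = "cuInit exported" if found else "loaded, but cuInit missing"
        results.append((name, found, detail))
    return results


if __name__ == "__main__":
    for name, ok, detail in probe_driver_library():
        print(f"{name}: {'OK' if ok else 'FAIL'} ({detail})")
```

If you save this as a file (say `probe.py`, a name chosen only for illustration), running it with `cryosparcw call python probe.py` should exercise the same environment the worker uses. A load failure here would point at the driver installation or library search path rather than at the bundled CUDA toolkit.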

Thank you @wtempel. Here are the outputs:

$nvidia-smi

Wed May  8 12:33:52 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:18:00.0 Off |                  N/A |
| 28%   32C    P8    16W / 250W |     51MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:3B:00.0 Off |                  N/A |
| 28%   33C    P8    17W / 250W |      8MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  On   | 00000000:86:00.0 Off |                  N/A |
| 28%   31C    P8    20W / 250W |      8MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  On   | 00000000:AF:00.0 Off |                  N/A |
| 28%   32C    P8     1W / 250W |      8MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     14498      G   /usr/lib/xorg/Xorg                 49MiB |
|    1   N/A  N/A     14498      G   /usr/lib/xorg/Xorg                  6MiB |
|    2   N/A  N/A     14498      G   /usr/lib/xorg/Xorg                  6MiB |
|    3   N/A  N/A     14498      G   /usr/lib/xorg/Xorg                  6MiB |
+-----------------------------------------------------------------------------+
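
A note on reading the header line above: to my knowledge, CUDA 11.8 on Linux needs NVIDIA driver 520.61.05 or newer, so the driver reported here sits exactly at that minimum. The sketch below shows one way to check this from the `nvidia-smi` header text. The threshold constant and the helper names are my own assumptions, not anything provided by CryoSPARC or NVIDIA tooling.

```python
import re

# Assumed minimum Linux driver version for CUDA 11.8; verify against NVIDIA's
# release notes before relying on it.
MIN_DRIVER = (520, 61, 5)


def parse_driver_version(smi_text):
    """Extract the driver version from nvidia-smi header text as a tuple of ints."""
    match = re.search(r"Driver Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def driver_is_new_enough(smi_text, minimum=MIN_DRIVER):
    """Return True if the parsed driver version is at least the given minimum."""
    version = parse_driver_version(smi_text)
    return version is not None and version >= minimum


header = "| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |"
print(parse_driver_version(header), driver_is_new_enough(header))
```

A passing version check only says the driver is recent enough. It does not show that the worker environment can actually load the driver library, which is what the remaining commands look at.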

$/home/cryosparc2/cryosparc2/cryosparc2_worker/bin/cryosparcw call numba -s

System info:
--------------------------------------------------------------------------------
__Time Stamp__
Report started (local time)                   : 2024-05-08 12:35:35.183457
UTC start time                                : 2024-05-08 19:35:35.183461
Running time (s)                              : 9.372657

__Hardware Information__
Machine                                       : x86_64
CPU Name                                      : skylake-avx512
CPU Count                                     : 48
Number of accessible CPUs                     : 48
List of accessible CPUs cores                 : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
CFS Restrictions (CPUs worth of runtime)      : None

CPU Features                                  : 64bit adx aes avx avx2 avx512bw
                                                avx512cd avx512dq avx512f avx512vl
                                                bmi bmi2 clflushopt clwb cmov
                                                crc32 cx16 cx8 f16c fma fsgsbase
                                                fxsr invpcid lzcnt mmx movbe
                                                pclmul pku popcnt prfchw rdrnd
                                                rdseed rtm sahf sse sse2 sse3
                                                sse4.1 sse4.2 ssse3 xsave xsavec
                                                xsaveopt xsaves

Memory Total (MB)                             : 257604
Memory Available (MB)                         : 253475

__OS Information__
Platform Name                                 : Linux-4.15.0-213-generic-x86_64-with-glibc2.27
Platform Release                              : 4.15.0-213-generic
OS Name                                       : Linux
OS Version                                    : #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023
OS Specific Version                           : ?
Libc Version                                  : glibc 2.27

__Python Information__
Python Compiler                               : GCC 12.3.0
Python Implementation                         : CPython
Python Version                                : 3.10.13
Python Locale                                 : en_US.UTF-8

__Numba Toolchain Versions__
Numba Version                                 : 0.59.0
llvmlite Version                              : 0.42.0

__LLVM Information__
LLVM Version                                  : 14.0.6

__CUDA Information__
CUDA Device Initialized                       : False
CUDA Driver Version                           : ?
CUDA Runtime Version                          : ?
CUDA NVIDIA Bindings Available                : ?
CUDA NVIDIA Bindings In Use                   : ?
CUDA Minor Version Compatibility Available    : ?
CUDA Minor Version Compatibility Needed       : ?
CUDA Minor Version Compatibility In Use       : ?
CUDA Detect Output:
None
CUDA Libraries Test Output:
None

__NumPy Information__
NumPy Version                                 : 1.22.4
NumPy Supported SIMD features                 : ('MMX', 'SSE', 'SSE2', 'SSE3', 'SSSE3', 'SSE41', 'POPCNT', 'SSE42', 'AVX', 'F16C', 'FMA3', 'AVX2', 'AVX512F', 'AVX512CD', 'AVX512VL', 'AVX512BW', 'AVX512DQ', 'AVX512_SKX')
NumPy Supported SIMD dispatch                 : ('SSSE3', 'SSE41', 'POPCNT', 'SSE42', 'AVX', 'F16C', 'FMA3', 'AVX2', 'AVX512F', 'AVX512CD', 'AVX512_KNL', 'AVX512_KNM', 'AVX512_SKX', 'AVX512_CLX', 'AVX512_CNL', 'AVX512_ICL')
NumPy Supported SIMD baseline                 : ('SSE', 'SSE2', 'SSE3')
NumPy AVX512_SKX support detected             : True

__SVML Information__
SVML State, config.USING_SVML                 : False
SVML Library Loaded                           : False
llvmlite Using SVML Patched LLVM              : True
SVML Operational                              : False

__Threading Layer Information__
TBB Threading Layer Available                 : False
+--> Disabled due to Unknown import problem.
OpenMP Threading Layer Available              : True
+-->Vendor: GNU
Workqueue Threading Layer Available           : True
+-->Workqueue imported successfully.

__Numba Environment Variable Information__
NUMBA_CUDA_MAX_PENDING_DEALLOCS_COUNT         : 0
NUMBA_CUDA_USE_NVIDIA_BINDING                 : 1
NUMBA_CUDA_INCLUDE_PATH                       : /home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/include

__Conda Information__
Conda Build                                   : not installed
Conda Env                                     : 24.1.2
Conda Platform                                : linux-64
Conda Python Version                          : 3.10.14.final.0
Conda Root Writable                           : False

__Installed Packages__
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
absl-py                   2.1.0                    pypi_0    pypi
aom                       3.5.0                h27087fc_0    conda-forge
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
astunparse                1.6.3                    pypi_0    pypi
bcrypt                    3.2.2           py310h5764c6d_1    conda-forge
blinker                   1.7.0              pyhd8ed1ab_0    conda-forge
blosc                     1.21.5               h0f2a231_0    conda-forge
brotli                    1.0.9                h166bdaf_9    conda-forge
brotli-bin                1.0.9                h166bdaf_9    conda-forge
brotli-python             1.0.9           py310hd8f1fbe_9    conda-forge
brunsli                   0.1                  h9c3ff4c_0    conda-forge
bzip2                     1.0.8                hd590300_5    conda-forge
c-ares                    1.27.0               hd590300_0    conda-forge
c-blosc2                  2.13.2               hb4ffafa_0    conda-forge
ca-certificates           2024.2.2             hbcca054_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cachetools                5.3.3                    pypi_0    pypi
certifi                   2024.2.2           pyhd8ed1ab_0    conda-forge
cffi                      1.16.0          py310h2fee648_0    conda-forge
cfitsio                   4.2.0                hd9d235c_0    conda-forge
charls                    2.3.4                h9c3ff4c_0    conda-forge
charset-normalizer        3.3.2              pyhd8ed1ab_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
contourpy                 1.2.0           py310hd41b1e2_0    conda-forge
cryptography              42.0.5          py310h75e40e8_0    conda-forge
cuda-cudart               11.8.89                       0    nvidia/label/cuda-11.8.0
cuda-nvrtc                11.8.89                       0    nvidia/label/cuda-11.8.0
cuda-python               11.8.3                   pypi_0    pypi
cuda-version              11.8                 h70ddcb2_3    conda-forge
cycler                    0.12.1             pyhd8ed1ab_0    conda-forge
cython                    3.0.9           py310hc6cd4ac_0    conda-forge
dav1d                     1.2.1                hd590300_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
dnspython                 2.6.1              pyhd8ed1ab_1    conda-forge
exceptiongroup            1.2.0              pyhd8ed1ab_2    conda-forge
executing                 2.0.1              pyhd8ed1ab_0    conda-forge
fftw                      3.3.10          nompi_hc118613_108    conda-forge
filelock                  3.13.1                   pypi_0    pypi
flask                     2.2.3              pyhd8ed1ab_0    conda-forge
flask-jsonrpc             0.3.1                    pypi_0    pypi
flask-pymongo             2.3.0                    pypi_0    pypi
flatbuffers               24.3.7                   pypi_0    pypi
fonttools                 4.49.0          py310h2372a71_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
fsspec                    2024.2.0                 pypi_0    pypi
gast                      0.5.4                    pypi_0    pypi
giflib                    5.2.1                h0b41bf4_3    conda-forge
gmp                       6.3.0                h59595ed_1    conda-forge
google-auth               2.28.2                   pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
grpcio                    1.46.3          py310hba10ccf_0    conda-forge
h5py                      3.10.0          nompi_py310h65828d5_101    conda-forge
hdf5                      1.14.3          nompi_h4f84152_100    conda-forge
idna                      3.6                pyhd8ed1ab_0    conda-forge
imagecodecs               2022.9.26       py310h543e91f_4    conda-forge
imageio                   2.34.0             pyh4b66e23_0    conda-forge
importlib-metadata        7.0.2              pyha770c72_0    conda-forge
ipython                   8.22.2             pyh707e725_0    conda-forge
itsdangerous              2.1.2              pyhd8ed1ab_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.3              pyhd8ed1ab_0    conda-forge
joblib                    1.3.2              pyhd8ed1ab_0    conda-forge
jpeg                      9e                   h0b41bf4_3    conda-forge
jxrlib                    1.1                  hd590300_3    conda-forge
keras                     2.8.0                    pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5           py310hd41b1e2_1    conda-forge
krb5                      1.21.2               h659d440_0    conda-forge
lazy_loader               0.3                pyhd8ed1ab_0    conda-forge
lcms2                     2.14                 h6ed2654_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libaec                    1.1.2                h59595ed_1    conda-forge
libavif                   0.11.1               h8182462_2    conda-forge
libblas                   3.9.0           21_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h166bdaf_9    conda-forge
libbrotlidec              1.0.9                h166bdaf_9    conda-forge
libbrotlienc              1.0.9                h166bdaf_9    conda-forge
libcblas                  3.9.0           21_linux64_openblas    conda-forge
libclang                  16.0.6                   pypi_0    pypi
libcufft                  10.9.0.58                     0    nvidia/label/cuda-11.8.0
libcurand                 10.3.0.86                     0    nvidia/label/cuda-11.8.0
libcurand-dev             10.3.0.86                     0    nvidia/label/cuda-11.8.0
libcurl                   8.5.0                hca28451_0    conda-forge
libcusolver               11.4.1.48                     0    nvidia/label/cuda-11.8.0
libcusparse               11.7.5.86                     0    nvidia/label/cuda-11.8.0
libdeflate                1.14                 h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 13.2.0               h807b86a_5    conda-forge
libgfortran-ng            13.2.0               h69a702a_5    conda-forge
libgfortran5              13.2.0               ha4646dd_5    conda-forge
libgomp                   13.2.0               h807b86a_5    conda-forge
liblapack                 3.9.0           21_linux64_openblas    conda-forge
libllvm14                 14.0.6               hcd5def8_4    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libopenblas               0.3.26          pthreads_h413a1c8_0    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libprotobuf               3.19.4               h780b84a_0    conda-forge
libsqlite                 3.45.2               h2797004_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              13.2.0               h7e041cc_5    conda-forge
libtiff                   4.4.0                h82bc61c_5    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libwebp-base              1.3.2                hd590300_0    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libzlib                   1.2.13               hd590300_5    conda-forge
libzopfli                 1.0.3                h9c3ff4c_0    conda-forge
llvmlite                  0.42.0          py310h1b8f574_1    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
markdown                  3.5.2                    pypi_0    pypi
markupsafe                2.1.5           py310h2372a71_0    conda-forge
matplotlib-base           3.8.3           py310h62c0568_0    conda-forge
matplotlib-inline         0.1.6              pyhd8ed1ab_0    conda-forge
mpmath                    1.3.0                    pypi_0    pypi
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
ncurses                   6.4                  h59595ed_2    conda-forge
networkx                  3.2.1              pyhd8ed1ab_0    conda-forge
numba                     0.59.0          py310h7dc5dd1_1    conda-forge
numpy                     1.22.4          py310h4ef5377_0    conda-forge
oauthlib                  3.2.2              pyhd8ed1ab_0    conda-forge
openjpeg                  2.5.0                h7d73246_1    conda-forge
openssl                   3.2.1                hd590300_0    conda-forge
opt-einsum                3.3.0                    pypi_0    pypi
packaging                 24.0               pyhd8ed1ab_0    conda-forge
pandas                    1.5.3           py310h9b08913_1    conda-forge
parso                     0.8.3              pyhd8ed1ab_0    conda-forge
pbzip2                    1.1.13               h1fcc475_2    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    9.2.0           py310h454ad03_3    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.42             pyha770c72_0    conda-forge
protobuf                  3.19.4          py310h122e73d_0    conda-forge
psutil                    5.9.8           py310h2372a71_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
pyasn1                    0.5.1                    pypi_0    pypi
pyasn1-modules            0.3.0                    pypi_0    pypi
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pycryptodome              3.20.0          py310hb0f0acc_0    conda-forge
pyfftw                    0.12.0          py310h0885ff1_3    conda-forge
pygments                  2.17.2             pyhd8ed1ab_0    conda-forge
pyjwt                     2.8.0              pyhd8ed1ab_1    conda-forge
pylibtiff                 0.4.2           py310he9d7d2b_7    conda-forge
pymongo                   4.6.2           py310hc6cd4ac_0    conda-forge
pyparsing                 3.1.2              pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.10.13         hd12c33a_1_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-slugify            5.0.2              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    4_cp310    conda-forge
pytz                      2024.1             pyhd8ed1ab_0    conda-forge
pywavelets                1.4.1           py310h1f7b6fc_1    conda-forge
pyyaml                    6.0.1           py310h2372a71_1    conda-forge
readline                  8.2                  h8228510_1    conda-forge
requests                  2.29.0             pyhd8ed1ab_0    conda-forge
requests-oauthlib         1.3.1              pyhd8ed1ab_0    conda-forge
requests-toolbelt         0.10.1             pyhd8ed1ab_0    conda-forge
rsa                       4.9                      pypi_0    pypi
scikit-image              0.22.0          py310hcc13569_2    conda-forge
scikit-learn              1.4.1.post1     py310h1fdf081_0    conda-forge
scipy                     1.12.0          py310hb13e2d6_2    conda-forge
semver                    2.13.0             pyh9f0ad1d_0    conda-forge
setuptools                69.2.0             pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sleef                     3.5.1                h9b69904_2    conda-forge
snappy                    1.1.10               h9fff704_0    conda-forge
sniffio                   1.3.1              pyhd8ed1ab_0    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
sympy                     1.12                     pypi_0    pypi
tabulate                  0.9.0              pyhd8ed1ab_1    conda-forge
tensorboard               2.8.0                    pypi_0    pypi
tensorboard-data-server   0.6.1           py310h600f1e7_4    conda-forge
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
tensorflow                2.8.4                    pypi_0    pypi
tensorflow-estimator      2.8.0                    pypi_0    pypi
tensorflow-io-gcs-filesystem 0.36.0                   pypi_0    pypi
termcolor                 2.4.0                    pypi_0    pypi
text-unidecode            1.3                pyhd8ed1ab_1    conda-forge
threadpoolctl             3.3.0              pyhc1e730c_0    conda-forge
tifffile                  2022.10.10         pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
torch                     2.1.2+cu118              pypi_0    pypi
traitlets                 5.14.2             pyhd8ed1ab_0    conda-forge
triton                    2.1.0                    pypi_0    pypi
typing-extensions         4.10.0               hd8ed1ab_0    conda-forge
typing_extensions         4.10.0             pyha770c72_0    conda-forge
tzdata                    2024a                h0c530f3_0    conda-forge
unicodedata2              15.1.0          py310h2372a71_0    conda-forge
unidecode                 1.3.8              pyhd8ed1ab_0    conda-forge
urllib3                   1.26.18            pyhd8ed1ab_0    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
werkzeug                  2.3.6              pyhd8ed1ab_0    conda-forge
wheel                     0.42.0             pyhd8ed1ab_0    conda-forge
wrapt                     1.16.0          py310h2372a71_0    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zfp                       1.0.1                h59595ed_0    conda-forge
zipp                      3.17.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               hd590300_5    conda-forge
zlib-ng                   2.0.7                h0b41bf4_0    conda-forge
zstd                      1.5.5                hfc55251_0    conda-forge

No errors reported.


__Warning log__
Warning (cuda): CUDA device initialisation problem. Function "cuInit" not found
Exception class: <class 'RuntimeError'>
--------------------------------------------------------------------------------
If requested, please copy and paste the information between
the dashed (----) lines, or from a given specific section as
appropriate.

=============================================================
IMPORTANT: Please ensure that you are happy with sharing the
contents of the information present, any information that you
wish to keep private you should remove before sharing.
=============================================================

$ /home/cryosparc2/cryosparc2/cryosparc2_worker/bin/cryosparcw call python -c "from numba import cuda; cuda.cudadrv.libs.test()"

Finding driver from candidates:
	libcuda.so
	libcuda.so.1
	/usr/lib/libcuda.so
	/usr/lib/libcuda.so.1
	/usr/lib64/libcuda.so
	/usr/lib64/libcuda.so.1
Using loader <class 'ctypes.CDLL'>
	Trying to load driver...	ok
		Loaded from libcuda.so
	Mapped libcuda.so paths:
		/usr/local/cuda-11.8/targets/x86_64-linux/lib/stubs/libcuda.so
Finding nvvm from Conda environment (NVIDIA package)
	Located at /home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/nvvm/lib64/libnvvm.so.4.0.0
	Trying to open library...	ok
Finding nvrtc from Conda environment (NVIDIA package)
	Located at /home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/libnvrtc.so.11.8.89
	Trying to open library...	ok
Finding cudart from Conda environment (NVIDIA package)
	Located at /home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/libcudart.so.11.8.89
	Trying to open library...	ok
Finding cudadevrt from Conda environment (NVIDIA package)
	Located at libcudadevrt.a
	Checking library...	ERROR: failed to find cudadevrt:
libcudadevrt.a not found
Finding libdevice from Conda environment (NVIDIA package)
	Located at /home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/nvvm/libdevice/libdevice.10.bc
	Checking library...	ok
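
Note that the only mapped libcuda.so path in the report above sits in the toolkit's stubs directory. Stub libraries are intended for linking, not for running against a GPU, so this may be related to the cuInit warning. A minimal sketch for checking this automatically, assuming the report text has been saved to a string; the helper names mapped_libcuda_paths and is_stub_library are hypothetical and not part of numba or CryoSPARC:

```python
from pathlib import PurePosixPath


def mapped_libcuda_paths(report: str) -> list:
    """Collect the paths listed under 'Mapped libcuda.so paths:' in the report."""
    paths = []
    in_section = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped.startswith("Mapped libcuda.so paths:"):
            in_section = True
            continue
        if in_section:
            if stripped.startswith("/"):
                paths.append(stripped)
            else:
                # The first non-path line ends the section.
                break
    return paths


def is_stub_library(path: str) -> bool:
    """Return True if any directory component of the path is named 'stubs'."""
    return "stubs" in PurePosixPath(path).parts


if __name__ == "__main__":
    sample = (
        "\tMapped libcuda.so paths:\n"
        "\t\t/usr/local/cuda-11.8/targets/x86_64-linux/lib/stubs/libcuda.so\n"
        "Finding nvvm from Conda environment (NVIDIA package)\n"
    )
    for p in mapped_libcuda_paths(sample):
        print(p, "(stub)" if is_stub_library(p) else "(not a stub)")
```

The check is purely a path heuristic: it does not load the library or confirm which file the driver package installed.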

libsdebug-20240508.txt will be sent separately.
Best wishes,

Thanks @Minghao. What is the output of the command

/home/cryosparc2/cryosparc2/cryosparc2_worker/bin/cryosparcw call env | grep PATH

?

Hi, here it is:

LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-9.2/lib64:/usr/local/cuda-8.0/lib64:/usr/local/software/relion/build/lib:/usr/local/software/IMOD/lib:
CRYOSPARC_CUDA_PATH=/usr/local/cuda-10.1
CRYOSPARC_PATH=/home/cryosparc2/cryosparc2/cryosparc2_worker/bin
LIBTBX_OPATH=
PYTHONPATH=/home/cryosparc2/cryosparc2/cryosparc2_worker
NUMBA_CUDA_INCLUDE_PATH=/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/include
PATH=/home/cryosparc2/cryosparc2/cryosparc2_worker/bin:/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/home/cryosparc2/cryosparc2/cryosparc2_worker/deps/anaconda/condabin:/home/minghao/.local/bin:/usr/local/software/phenix-1.18.2-3874/build/bin:/usr/local/cuda/bin:/usr/local/cuda-10.1/bin:/usr/local/cuda-9.2/bin:/usr/local/cuda-8.0/bin:/usr/local/software/relion/build/bin:/usr/local/software/scipion:/usr/local/software/arp_warp_8.0/bin/bin-x86_64-Linux:/usr/local/software/ccp4-7.1/etc:/usr/local/software/ccp4-7.1/bin:/usr/local/software/ccpem-1.5.0/bin:/usr/local/software/bin:/usr/local/software/scripts/bashEM:/usr/local/software/scripts/miscEM:/usr/local/software/IMOD/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
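
In the output above, CRYOSPARC_CUDA_PATH still points at CUDA 10.1 and LD_LIBRARY_PATH lists several older toolkits, which may influence which libraries the dynamic loader picks up. A rough sketch for summarising the CUDA-related entries, assuming the env output is available as text; parse_env and cuda_entries are hypothetical helper names:

```python
def parse_env(text: str) -> dict:
    """Parse NAME=value lines (as printed by `env`) into a dictionary."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            env[name.strip()] = value.strip()
    return env


def cuda_entries(env: dict, var: str = "LD_LIBRARY_PATH") -> list:
    """Return the colon-separated entries of `var` that mention 'cuda'."""
    return [p for p in env.get(var, "").split(":") if p and "cuda" in p.lower()]


if __name__ == "__main__":
    sample = (
        "LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64:"
        "/usr/local/software/relion/build/lib:\n"
        "CRYOSPARC_CUDA_PATH=/usr/local/cuda-10.1\n"
    )
    env = parse_env(sample)
    print("CRYOSPARC_CUDA_PATH:", env.get("CRYOSPARC_CUDA_PATH"))
    for entry in cuda_entries(env):
        print("LD_LIBRARY_PATH entry:", entry)
```

The substring match on "cuda" is deliberately loose; it only lists candidates for manual review and does not decide which entry is wrong.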

Thanks @Minghao for posting this information and sending us libsdebug-20240508.txt.
What is the output of the command

/sbin/ldconfig -p | grep -i libcuda

?

Yes,

	libcudart.so.11.0 (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so.11.0
	libcudart.so.10.1 (libc6,x86-64) => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudart.so.10.1
	libcudart.so (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so
	libcudart.so (libc6,x86-64) => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudart.so
	libcudadebugger.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudadebugger.so.1
	libcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1
	libcuda.so.1 (libc6) => /usr/lib/i386-linux-gnu/libcuda.so.1
	libcuda.so (libc6,x86-64) => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcuda.so
	libcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so
	libcuda.so (libc6) => /usr/lib/i386-linux-gnu/libcuda.so
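
The listing above shows two x86-64 providers for libcuda.so (one under /usr/local/cuda-10.1, one under /usr/lib/x86_64-linux-gnu) and two for libcudart.so. A small sketch that flags such duplicates in ldconfig -p output, assuming the usual "name (flags) => path" line format; the function names are hypothetical:

```python
import re
from collections import defaultdict

# Matches lines such as: "libcuda.so.1 (libc6,x86-64) => /usr/lib/.../libcuda.so.1"
ENTRY = re.compile(r"^(\S+) \(([^)]*)\) => (\S+)$")


def providers(ldconfig_text: str) -> dict:
    """Group library paths by (soname, flags), preserving listing order."""
    grouped = defaultdict(list)
    for line in ldconfig_text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            soname, flags, path = match.groups()
            grouped[(soname, flags)].append(path)
    return dict(grouped)


def ambiguous_sonames(ldconfig_text: str) -> dict:
    """Keep only the (soname, flags) pairs that have more than one provider."""
    return {
        key: paths
        for key, paths in providers(ldconfig_text).items()
        if len(paths) > 1
    }


if __name__ == "__main__":
    sample = (
        "\tlibcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1\n"
        "\tlibcuda.so (libc6,x86-64) => /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcuda.so\n"
        "\tlibcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so\n"
    )
    for (soname, flags), paths in ambiguous_sonames(sample).items():
        print(soname, flags, paths)
```

Duplicates are not necessarily a problem on their own; the sketch only points out where more than one file can satisfy the same library name.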

Thank you.

Interesting. Do you remember how (using which files and commands) you installed

  • the nvidia driver
  • the toolkit in /usr/local/cuda-11.8/

?

Hi @wtempel ,

Before the updates, we were using CUDA v10.1, Nvidia driver v470, and Cryosparc v4.0.

We first updated the Nvidia driver to v535, after which nvidia-smi showed a proper GPU readout.

We then updated CryoSPARC to v4.5, but got the “PID terminated unexpectedly” error.

Although CryoSPARC v4.4+ comes with CUDA 11.8 bundled, we wondered whether installing CUDA 11.8 on the base system would resolve the GPU error.

We followed the instructions to update CUDA:

During installation, Nvidia driver 520.61.05 automatically replaced 535, but the problem has not been solved.

That’s how we got here. Thank you for the help.

@Minghao May I ask

  1. What version of CryoSPARC did you run before v4.5?
  2. What are the outputs of these commands
    apt list --installed | grep -i -e NVIDIA -e CUDA -e NSIGHT
    dpkg -S /usr/local/cuda-11.8
    ls -l /usr/lib/x86_64-linux-gnu/libcuda.so.1
    dpkg -S /usr/lib/x86_64-linux-gnu/libcuda.so.1
    
    ?
  3. How (commands and files) you installed nvidia driver 535, which you mentioned in