CryoSPARC container: /app/cryosparc_worker/bin/cryosparcw connect can't find libcuda

orenshaniathuji · May 2, 2024, 7:01am

Hi All,

I am running CryoSPARC from the 4.3.1 container (using Apptainer, not Docker), and I get an error when running cryosparcw connect:

Apptainer> /app/cryosparc_worker/bin/cryosparcw connect --worker localhost --master localhost --gpu none
 ---------------------------------------------------------------
  CRYOSPARC CONNECT --------------------------------------------
 ---------------------------------------------------------------
  Attempting to register worker localhost to command localhost:39002
  Connecting as unix user oshani
  Will register using ssh string: oshani@localhost
  If this is incorrect, you should re-run this command with the flag --sshstr <ssh string> 
 ---------------------------------------------------------------
  Connected to master.
 ---------------------------------------------------------------
  Current connected workers:
 ---------------------------------------------------------------
  Autodetecting available GPUs...
Traceback (most recent call last):
  File "bin/connect.py", line 221, in <module>
    gpu_devidxs = check_gpus()
  File "bin/connect.py", line 91, in check_gpus
    num_devs = print_gpu_list()
  File "bin/connect.py", line 23, in print_gpu_list
    import pycuda.driver as cudrv
  File "/app/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/pycuda/driver.py", line 62, in <module>
    from pycuda._driver import *  # noqa
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

It looks like the problem is with LD_LIBRARY_PATH, because cryosparcw env shows: LD_LIBRARY_PATH=/usr/local/cuda/lib64:…

while libcuda is in /usr/local/cuda-11.4/compat/libcuda.so.1
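To confirm the mismatch, one could list every copy of libcuda.so.1 in the container and check whether its directory is on LD_LIBRARY_PATH. This is only a rough sketch: the helper name in_ld_library_path is made up for illustration, and the directories searched are an assumption about where the library might live.

```shell
#!/bin/sh
# Hypothetical helper (not part of CryoSPARC): succeed if the directory
# given as $1 is one of the entries in LD_LIBRARY_PATH.
in_ld_library_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed search roots; adjust to the layout of your container.
find /usr/local /usr/lib /usr/lib64 -name 'libcuda.so.1' 2>/dev/null |
while IFS= read -r lib; do
    dir=$(dirname "$lib")
    if in_ld_library_path "$dir"; then
        echo "$lib: directory is on LD_LIBRARY_PATH"
    else
        echo "$lib: directory is NOT on LD_LIBRARY_PATH"
    fi
done
```

Running it with the environment printed by cryosparcw env should show whether /usr/local/cuda-11.4/compat is missing from the search path. Directories known to the system loader cache are not reported by this check, so a "NOT" result does not always mean the library is unreachable.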

So I guess that I can work around this somehow, but is there an official fix for the problem?

Many thanks

Oren

wtempel · May 7, 2024, 7:30pm

Unfortunately, there is no “official” support for running CryoSPARC in a container.
In case it helps: in a non-container, Ubuntu-based installation that I use, the file

/usr/lib/x86_64-linux-gnu/libcuda.so.1

is provided independently of the CUDA toolkit by the package libnvidia-compute-550. I am not sure whether that package was installed as part of the NVIDIA driver installation.
What are the outputs of the commands

nvidia-smi
ls /dev/nvidia?

inside the container?

If you are confident that LD_LIBRARY_PATH is the only issue, you may try adding this line to cryosparc_worker/config.sh:

export LD_LIBRARY_PATH=/usr/local/cuda-11.4/compat

However, this may lead to other unforeseen issues.
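One such issue is that the line, as written, assigns LD_LIBRARY_PATH outright instead of extending it, so any entries already set at that point are dropped. A minimal sketch of a variant that prepends the directory and keeps the existing entries follows. The function name prepend_ld_library_path is hypothetical, and the path is the one reported in the original post.

```shell
#!/bin/sh
# Hypothetical helper (not part of CryoSPARC): prepend the directory in $1
# to LD_LIBRARY_PATH unless it is already listed, keeping existing entries.
prepend_ld_library_path() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Path taken from the original post; adjust it to your CUDA version.
prepend_ld_library_path /usr/local/cuda-11.4/compat
```

Whether prepending or appending is better depends on which copy of a library should win if several directories provide it. Prepending makes the compat directory take priority.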

orenshaniathuji · July 22, 2024, 8:11am

OK, it turns out that the problem is simply that the nodejs root path has to be set. So the script used to start CryoSPARC from inside the container has to include:

export PATH=$PATH:/app/cryosparc_master/bin:/app/cryosparc_worker/bin:/app/cryosparc_master/cryosparc_app/nodejs/bin

...

npm config set prefix /app/cryosparc_master/cryosparc_app/nodejs

cryosparcm start