Wrong CUDA version (fixed)

Fixed. Apparently the skcuda package bundled with the CryoSPARC worker does not work with CUDA 10.x. In this file:

cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cusolver.py

There is a list of library versions to load:

_version_list = [9.2, 9.1, 9.0, 8.0, 7.5, 7.0]

So I switched to CUDA 9.2, and now motion correction works.

The problem description…

I’ve installed CryoSPARC on our Linux cluster. I tried to install it with the latest version of the CUDA libraries that we have installed, which is version 10.1. Going through the T20S tutorial, when I run a motion correction job, it fails with this error in job.log:

AttributeError: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcusolver.so: undefined symbol: cusolverDnCreateSyevjInfo

which indicates that it is trying to use CUDA 8.0 instead of 10.1.
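For anyone who wants to check a particular library file directly: ctypes raises an AttributeError like the one above when the requested symbol is missing from the library it loaded. A minimal sketch of such a check follows. The helper names `has_symbol` and `check_library` are hypothetical, and the path in the usage comment is simply the one from the error message.

```python
import ctypes


def has_symbol(lib, symbol):
    """Return True if `lib` exposes `symbol`.

    For a ctypes library handle, a failed lookup raises AttributeError,
    which is the same exception type seen in job.log.
    """
    try:
        getattr(lib, symbol)
    except AttributeError:
        return False
    return True


def check_library(path, symbol="cusolverDnCreateSyevjInfo"):
    """Load the shared library at `path` and report whether it exports `symbol`."""
    lib = ctypes.CDLL(path)  # raises OSError if the file cannot be loaded
    return has_symbol(lib, symbol)


# Example usage on the cluster (path taken from the error message):
# check_library("/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcusolver.so")
```

Running `check_library` against each installed libcusolver should show which copies export the function that the motion correction job needs.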

In particular, I installed the worker like this:

module load cudatoolkit/10.1
module load cudnn/cuda-10.1
./install.sh --license $LICENSE_ID --cudapath /usr/local/cuda-10.1

This puts the following in the PATH and LD_LIBRARY_PATH of the cryosparc user:

PATH...
/usr/local/cuda-10.1/bin
LD_LIBRARY_PATH...
/usr/local/cudnn/cuda-10.1/7.5.0/lib64
/usr/local/cuda-10.1/lib64

and /usr/local/cuda is linked as follows:

/usr/local/cuda -> cuda-10.1
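One way to see which libcusolver files these settings actually expose is to scan each LD_LIBRARY_PATH directory. The sketch below does that. The function name `find_cusolver` is hypothetical, and the scan covers only LD_LIBRARY_PATH; the dynamic loader can also find libraries in other places (for example the ldconfig cache), so an empty or clean result here does not rule out another copy being picked up.

```python
import glob
import os


def find_cusolver(search_path=None, pattern="libcusolver.so*"):
    """Map each directory on an os.pathsep-separated path to the matching files in it."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    found = {}
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        matches = sorted(glob.glob(os.path.join(directory, pattern)))
        if matches:
            found[directory] = [os.path.basename(m) for m in matches]
    return found


if __name__ == "__main__":
    # Print every directory on LD_LIBRARY_PATH that holds a libcusolver file.
    for directory, files in find_cusolver().items():
        print(directory, files)
```

Running this as the cryosparc user, inside the same environment the cluster job sees, is more informative than running it in an interactive shell, since the two environments may differ.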

I also have this in cluster_script.sh (which I’ve installed with “cryosparcm cluster connect”):

module load cudatoolkit/10.1
module load cudnn/cuda-10.1

What could be causing the job to try to use 8.0?

Any help appreciated,
Matthew Cahn