Fixed. Apparently it does not work with CUDA 10.x. In this file:
cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cusolver.py
There is a list of library versions to load:
_version_list = [9.2, 9.1, 9.0, 8.0, 7.5, 7.0]
So I switched to CUDA 9.2 and now motion correction works.
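As I understand it, a list like this is used to build the versioned library names that get tried in order. The sketch below is not scikit-cuda's actual code: `candidate_names`, `load_first`, and the injectable `loader` argument are hypothetical names I made up for illustration. It only shows why a release that is missing from the list (10.1 here) would never be requested by its versioned name.

```python
import ctypes


def candidate_names(version_list, base="libcusolver.so"):
    """Build one versioned file name per listed release (assumed naming scheme)."""
    return ["%s.%s" % (base, v) for v in version_list]


def load_first(names, loader=ctypes.cdll.LoadLibrary):
    """Return (name, handle) for the first name that loads; raise if none do."""
    for name in names:
        try:
            return name, loader(name)
        except OSError:
            # This name could not be found or loaded; try the next one.
            continue
    raise OSError("no candidate library could be loaded")


# The list from cusolver.py as quoted above; note that 10.1 is not in it.
_version_list = [9.2, 9.1, 9.0, 8.0, 7.5, 7.0]
names = candidate_names(_version_list)
```

With only CUDA 10.1 on the library path, none of these versioned names would match, so whatever older copy the system can still find is the one that gets used.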
The problem description…
I’ve installed CryoSPARC on our Linux cluster. I tried to install it with the latest version of the CUDA libraries that we have installed, which is version 10.1. Going through the T20S tutorial, when I run a motion correction job, it fails with this error in job.log:
AttributeError: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcusolver.so: undefined symbol: cusolverDnCreateSyevjInfo
which indicates that it is trying to use CUDA 8.0 instead of 10.1.
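A quick way to confirm that a given libcusolver copy really lacks the symbol is to load it with ctypes and look the name up. This is a minimal sketch; `has_symbol` is a hypothetical helper of mine, and the path in the usage comment is simply the one from the error message.

```python
import ctypes


def has_symbol(lib_path, symbol):
    """Return True/False if the library loads, or None if it cannot be loaded."""
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return None
    # ctypes raises AttributeError for an undefined symbol, so hasattr is enough.
    return hasattr(lib, symbol)


# Example usage (path taken from the error in job.log):
# has_symbol("/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcusolver.so",
#            "cusolverDnCreateSyevjInfo")
```

Running the same check against the 10.1 copy under /usr/local/cuda-10.1/lib64 would show whether that library exports the symbol.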
In particular, I installed the worker like this:
module load cudatoolkit/10.1
module load cudnn/cuda-10.1
./install.sh --license $LICENSE_ID --cudapath /usr/local/cuda-10.1
This puts the following in the PATH and LD_LIBRARY_PATH of the cryosparc user:
PATH...
/usr/local/cuda-10.1/bin
LD_LIBRARY_PATH...
/usr/local/cudnn/cuda-10.1/7.5.0/lib64
/usr/local/cuda-10.1/lib64
and /usr/local/cuda is linked as follows:
/usr/local/cuda -> cuda-10.1
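To see which libcusolver copies are actually visible through these settings, something like the following can be run as the cryosparc user. It is a minimal sketch, and `find_cusolver` is a hypothetical helper name. It only inspects the directories in LD_LIBRARY_PATH; the dynamic linker also consults other locations (for example the ld.so cache), which this does not cover.

```python
import glob
import os


def find_cusolver(search_path):
    """Return (directory, resolved file) pairs for each libcusolver.so* found."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        pattern = os.path.join(directory, "libcusolver.so*")
        for path in sorted(glob.glob(pattern)):
            # Resolve symlinks so the real versioned file is shown.
            hits.append((directory, os.path.realpath(path)))
    return hits


if __name__ == "__main__":
    for directory, target in find_cusolver(os.environ.get("LD_LIBRARY_PATH", "")):
        print(directory, "->", target)
```

If a cuda-8.0 directory shows up here, or nothing from cuda-10.1 does, that narrows down where the wrong library is coming from.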
I also have this in cluster_script.sh (which I’ve installed with “cryosparcm cluster connect”):
module load cudatoolkit/10.1
module load cudnn/cuda-10.1
What could be causing the job to try to use 8.0?
Any help appreciated,
Matthew Cahn