I’ve installed CryoSPARC 3.3.2 on RHEL 8.5 (really Springdale 8.5, but that’s derived from RHEL 8.5). I’m using cudatoolkit 11.1. When I run the Patch Motion Correction (multi) job from the tutorial, I get this:
pycuda.driver.CompileError: nvcc preprocessing of /tmp/tmpupyjujj0.cu failed
[command: nvcc --preprocess -arch sm_80 -I/projects/MOLBIO/local/cryosparc-della-test-2/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/cuda /tmp/tmpupyjujj0.cu --compiler-options -P]
[stderr:
b'cc1plus: fatal error: cuda_runtime.h: No such file or directory\ncompilation terminated.\n']
I tried adding several things to cluster_script.sh, one at a time, but none of them helped.
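The stderr above says the host compiler could not find cuda_runtime.h. A quick check that may help narrow this down is to see whether that header exists under the toolkit that nvcc resolves to. This is only a sketch: it assumes the conventional layout where nvcc sits in <toolkit>/bin and the headers in <toolkit>/include, which may not hold for every packaging of the toolkit. The function names are made up for illustration.

```python
import os
import shutil
from typing import Optional


def toolkit_root_from_nvcc(nvcc_path: str) -> str:
    """Guess the toolkit root from an nvcc path, assuming <root>/bin/nvcc."""
    return os.path.dirname(os.path.dirname(os.path.realpath(nvcc_path)))


def find_cuda_runtime_header(toolkit_root: str) -> Optional[str]:
    """Return the path to <root>/include/cuda_runtime.h, or None if absent."""
    candidate = os.path.join(toolkit_root, "include", "cuda_runtime.h")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")  # the nvcc that the current PATH resolves to
    if nvcc is None:
        print("nvcc not found on PATH")
    else:
        root = toolkit_root_from_nvcc(nvcc)
        header = find_cuda_runtime_header(root)
        print(f"nvcc: {nvcc}")
        print(f"toolkit root (assumed): {root}")
        print(f"cuda_runtime.h: {header or 'NOT FOUND'}")
```

Running it inside a cluster job, rather than on the login node, shows what the job itself sees.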
I am still unclear about the root cause of this particular problem.
Your best bet may be to keep the environment at the time of installation as similar as possible to the environment at the time of running a cryoSPARC job.
The critical parts are:
the NVIDIA driver (controlled by the root user)
the CUDA toolkit (potentially controlled by a non-root user; see the --toolkit, --toolkitpath= and --defaultroot= options of the CUDA installer)
installation/(re-)configuration of the cryoSPARC worker package (when one runs cryosparc_worker/install.sh .. or cryosparc_worker/bin/cryosparcw newcuda ..)
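To see which driver and toolkit a given node actually has, something like the following could be run on each node. It is a rough sketch: it assumes nvcc and nvidia-smi are on PATH and that `nvcc --version` prints a line containing "release X.Y". The helper names are hypothetical.

```python
import re
import shutil
import subprocess
from typing import List, Optional, Tuple


def parse_nvcc_release(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a 'release X.Y' string in nvcc --version output."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def run(cmd: List[str]) -> Optional[str]:
    """Run a command and return its stdout, or None if it is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    except (subprocess.CalledProcessError, OSError):
        return None


if __name__ == "__main__":
    nvcc_out = run(["nvcc", "--version"])
    smi_out = run(["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"])
    release = parse_nvcc_release(nvcc_out) if nvcc_out else None
    driver = smi_out.strip().splitlines()[0] if smi_out and smi_out.strip() else None
    print("toolkit release:", release or "nvcc unavailable")
    print("driver version:", driver or "nvidia-smi unavailable")
```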
In your situation, this may imply:
installing the CUDA toolkit and the cryoSPARC worker package as a cluster job.
sharing CUDA toolkit and cryoSPARC worker installation trees only between cluster nodes with “similar enough” (intentionally vague) NVIDIA drivers.
updating the CUDA toolkit whenever the NVIDIA driver has “significantly” (intentionally vague) changed.
in turn, running cryosparcw newcuda /path/to/new/cuda whenever the CUDA toolkit installation has changed.
ensuring the same compilers and libraries are available at installation/reconfiguration and cryoSPARC run time (to which you have already alluded).
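One way to check the last point is to record a few environment variables when the worker is installed and compare them with what a job sees. A minimal sketch follows. The variable list and the snapshot file names in the usage note are assumptions for illustration, not anything cryoSPARC itself defines.

```python
import json
import os
from typing import Dict, Iterable, Mapping, Tuple

# Variables that commonly affect which compiler and CUDA files are found.
# This list is an assumption; adjust it for your site.
KEYS = ("PATH", "LD_LIBRARY_PATH", "CPATH", "CUDA_HOME")


def snapshot(env: Mapping[str, str], keys: Iterable[str] = KEYS) -> Dict[str, str]:
    """Pick the selected variables out of env, using '' for unset ones."""
    return {key: env.get(key, "") for key in keys}


def diff_snapshots(a: Mapping[str, str], b: Mapping[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return {name: (value_in_a, value_in_b)} for every variable that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n, ""), b.get(n, "")) for n in names if a.get(n, "") != b.get(n, "")}


def save_snapshot(path: str, keys: Iterable[str] = KEYS) -> None:
    """Write the current process environment snapshot to a JSON file."""
    with open(path, "w") as handle:
        json.dump(snapshot(os.environ, keys), handle, indent=2)


def load_snapshot(path: str) -> Dict[str, str]:
    """Read a snapshot previously written by save_snapshot."""
    with open(path) as handle:
        return json.load(handle)
```

For example, save_snapshot("install_env.json") could be called from the shell session that runs install.sh, and save_snapshot("job_env.json") from within a cluster job; diff_snapshots(load_snapshot("install_env.json"), load_snapshot("job_env.json")) then lists whatever differs.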
Although this problem persisted through version 4.1.0, in 4.1.1 it appears to be resolved. I ran the update to 4.1.1, and I’m able to run a patch motion correction job without having to copy the header files the way I did before.