I have been trying to test my cryosparc install with the extensive workflow for T20S. The import worked fine, but the Patch Motion job is giving me the following error:
[CPU: 197.5 MB] Traceback (most recent call last):
File "cryosparc2_worker/cryosparc2_compute/run.py", line 85, in cryosparc2_compute.run.main
File "cryosparc2_master/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 363, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi
AssertionError: Child process with PID 9312 has terminated unexpectedly!
I have followed previous threads with similar errors, which seem to have been fixed by updating to a more recent version of cryosparc or by applying patches. I have all of these up to date (Current version: v2.15.0+200728) but I am still getting this error.
For some additional information, I installed following the quick installation instructions for a single workstation. The workstation has an AMD CPU with 3x2080Ti. I have recently installed CUDA 10.0 in addition to CUDA 8.0 but have carefully installed cryosparc with the cuda path set to usr/local/cuda-10.0, so this shouldn’t be an issue. At one point I did accidentally install CUDA 11, as some instructions on the nvidia website were a little unclear, but as far as I’m aware I managed to purge it and autoremoved the other dependencies, so I don’t think this is the issue.
Another error which may or may not be related occurred after I tried to run a 2D classification of some imported particles:
ImportError: libcurand.so.8.0: cannot open shared object file: No such file or directory
This looks CUDA toolkit related: I have libcurand.so.10.0 as part of cuda-10.0, but something is still asking for the 8.0 version, and I’m not sure how to fix this either. I’m stumped and can’t get my install to work, help!
Hi @Lucy, assuming you have a complete CUDA 10 installation, you may just have to re-install the CUDA-specific dependencies in the cryosparc2_worker folder. Here’s how you do that:
Navigate to where you installed the cryosparc2_worker via command line
cd /path/to/cryosparc2_worker
Enter the following variables, changing the CUDA_PATH with the correct path (if it differs):
If you see any errors, please send over the output. If you see no errors, but the Patch Motion job still doesn’t work, I suggest you reinstall CUDA and retry the instructions above.
Thank you for your response and sorry for the delay. Shortly after following your advice Ubuntu completely crashed! I suspect it had something to do with CUDA, so I deleted CUDA 10 and managed to restore Ubuntu. I have now reinstalled CUDA 10 and followed your advice to export the variables and re-run install.sh.
The output from the installation looked fine apart from this when connecting the worker to the master:
ERROR: This hostname is already registered! Remove it first.
Please re-start your terminal shell to make the cryosparcm
command available.
When I try the patch motion I get exactly the same as before (see below). You suggested re-installing CUDA but I have just re-installed with no errors or problems. Maybe I should try CUDA 10.1?
[CPU: 197.9 MB] Traceback (most recent call last):
File "cryosparc2_worker/cryosparc2_compute/run.py", line 85, in cryosparc2_compute.run.main
File "cryosparc2_master/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 363, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi
AssertionError: Child process with PID 19439 has terminated unexpectedly!
Ok, so since I still think the problem lies with CUDA, I have updated my bashrc and symbolic links to all point toward CUDA-10.0 rather than cuda 8.
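One quick way to confirm what the shell now resolves to is to look at the release reported by nvcc and at the target of the CUDA symlink. This is only a minimal sketch: it assumes nvcc is on PATH and that the symlink lives at /usr/local/cuda, neither of which is stated in this thread, and the helper name cuda_release is made up for illustration.

```shell
# Print the toolkit release from `nvcc --version` output read on stdin,
# e.g. "Cuda compilation tools, release 10.0, V10.0.130" gives "10.0".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Usage (run in a fresh shell so the updated bashrc is picked up):
#   nvcc --version | cuda_release
#   readlink -f /usr/local/cuda     # assumed symlink location
```

If either command still reports 8.0, something in the environment is still pointing at the old toolkit.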
After doing this I re-ran the installer and I got this at the end:
Autodetecting available GPUs…
Traceback (most recent call last):
File "bin/connect.py", line 231, in <module>
gpu_devidxs = check_gpus()
File "bin/connect.py", line 105, in check_gpus
num_devs = print_gpu_list()
File "bin/connect.py", line 22, in print_gpu_list
import pycuda.driver as cudrv
File "/home/lucytroman/Software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/pycuda/driver.py", line 5, in <module>
from pycuda._driver import * # noqa
ImportError: libcurand.so.8.0: cannot open shared object file: No such file or directory
So clearly there is something looking for CUDA 8 associated software still as libcurand.so.8.0 is from the CUDA 8.0 toolkit.
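To see exactly which libraries the compiled pycuda module was linked against, ldd can be run on it. The sketch below is hedged: the helper name missing_libs is hypothetical, and the module file name (_driver.so inside the pycuda directory from the traceback) is an assumption based on the import line, so a glob is used in the usage note.

```shell
# Print the names of shared libraries that `ldd` reports as unresolved.
# Reads `ldd` output on stdin and prints one library name per line.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage (directory taken from the traceback; file name is an assumption):
#   ldd /home/lucytroman/Software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/pycuda/_driver*.so | missing_libs
```

If this prints libcurand.so.8.0, the pycuda binary itself was built against CUDA 8, regardless of what the current environment variables say.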
I noticed someone else had a similar problem here: https://github.com/cryoem-uoft/cryosparc-issues/issues/187
This was solved by clearing the pip cache: rm -rf ~/.cache/pip
(I didn’t want to delete something important, so instead I moved the current pip directory to oldpip and made a new pip directory in its place.)
However, when I repeat install.sh as before I get the same error as above. I had previously installed cryosparc with CUDA 8 and clearly it has stored some paths based on this. How do I undo this? Should I try a completely fresh install of cryosparc? What command would I use to make sure I fully removed it?
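Moving the cache aside rather than deleting it could look roughly like this. It is a sketch only: the function name stash_pip_cache and the timestamped backup name are invented for illustration, and ~/.cache/pip is the cache location given in the linked issue.

```shell
# Move a pip cache directory aside instead of deleting it, then recreate it empty.
# $1: cache directory (defaults to ~/.cache/pip). Prints the backup location.
stash_pip_cache() {
    cache_dir="${1:-$HOME/.cache/pip}"
    backup_dir="${cache_dir}.old.$(date +%Y%m%d%H%M%S)"   # hypothetical naming scheme
    [ -d "$cache_dir" ] || { echo "no cache at $cache_dir" >&2; return 1; }
    mv "$cache_dir" "$backup_dir" && mkdir -p "$cache_dir" && echo "$backup_dir"
}

# Usage:
#   stash_pip_cache            # stashes ~/.cache/pip
```

The old cache can be moved back or removed once everything works again.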
pycuda must have stored a path to CUDA 8 from a previous install, despite all paths and configurations pointing it toward CUDA 10.0. To summarise from the previous thread:
a. Navigate to cryosparc2_worker
b. Execute eval $(bin/cryosparcw env)
c. Execute pip uninstall pycuda # This will tell you which version it is uninstalling (e.g. 2019.1)
d. Execute pip install pycuda==2019.1 --no-cache-dir # Replace 2019.1 with whichever version it just uninstalled for your cryosparc version. This requires an internet connection
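The same steps can be wrapped in a small function. This is a sketch under stated assumptions, not an official cryosparc procedure: the name reinstall_pycuda is hypothetical, pip show is swapped in to read the installed version before uninstalling (instead of reading it off the interactive prompt), and -y is added so pip uninstall does not wait for confirmation.

```shell
# Reinstall pycuda inside the cryosparc worker environment without using the pip cache.
# $1: path to the cryosparc2_worker directory.
# The body runs in a subshell, so the worker environment does not leak into the caller.
reinstall_pycuda() (
    worker_dir="$1"
    cd "$worker_dir" || exit 1
    eval "$(bin/cryosparcw env)"
    # Record the installed version so the same one is reinstalled.
    version=$(pip show pycuda | awk '/^Version:/ { print $2 }')
    [ -n "$version" ] || { echo "pycuda is not installed in this environment" >&2; exit 1; }
    pip uninstall -y pycuda
    pip install "pycuda==$version" --no-cache-dir   # needs an internet connection
)

# Usage:
#   reinstall_pycuda /path/to/cryosparc2_worker
```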