Error while trying patch motion correction

Hi, I’m not that familiar with Linux, so any help with this would be much appreciated.
I recently installed CryoSPARC on a new computer. Importing movies seems to work fine, but when I try to run patch motion correction, I get this error:
[CPU: 263.2 MB]
Error occurred while processing J1/imported/007453813011304313375_22aug05b_grid2_00003gr_00003sq_00002hl_00002es.frames.tif
Traceback (most recent call last):
  File "/home/keck/cryosparc/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 60, in exec
    return self.process(item)
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 177, in cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 180, in cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 182, in cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/patchmotion.py", line 255, in cryosparc_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/patchmotion.py", line 668, in cryosparc_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/cuda_kernels.py", line 811, in cryosparc_compute.jobs.motioncorrection.cuda_kernels.do_unbend_gpu
  File "cryosparc_master/cryosparc_compute/engine/cuda_core.py", line 416, in cryosparc_compute.engine.cuda_core.context_dependent_memoize.wrapper
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/cuda_kernels.py", line 797, in cryosparc_compute.jobs.motioncorrection.cuda_kernels.get_unbend_gpu
  File "/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/compiler.py", line 291, in __init__
    arch, code, cache_dir, include_dirs)
  File "/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/compiler.py", line 254, in compile
    return compile_plain(source, options, keep, nvcc, cache_dir, target)
  File "/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/compiler.py", line 137, in compile_plain
    stderr=stderr.decode("utf-8", "replace"))
pycuda.driver.CompileError: nvcc compilation of /tmp/tmp_vxzeutl/kernel.cu failed
[command: nvcc --cubin -arch sm_86 -I/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/cuda kernel.cu]
[stderr:
kernel.cu(42): error: texture is not a template

kernel.cu(265): error: no instance of overloaded function "tex2D" matches the argument list
argument types are: (, float, float)

2 errors detected in the compilation of "kernel.cu".
]

Marking J1/imported/007453813011304313375_22aug05b_grid2_00003gr_00003sq_00002hl_00002es.frames.tif as incomplete and continuing…

Any help is appreciated!

Welcome to the forum, @stjiafle. Could you please email us the job reports for this job and for the upstream (J1) import job?

Hi! Thank you for the reply, I will email them shortly with more information!

After seeing the reports you sent us by email, I suspect that CRYOSPARC_CUDA_PATH points to a version of the CUDA toolkit (10.1) that does not support your GPU device (RTX A5000; see “NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation”, post #2 by ptrblck, PyTorch Forums). As a resolution, you may try:

  1. update to CryoSPARC v4.1.1
    and then
  2. install 3DFlex dependencies, which, as a side effect, makes available a version > 10 of the CUDA toolkit
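
One way to check the suspicion above is to compare a toolkit release number against the first release able to target sm_86. This is only a sketch: it assumes that minimum is CUDA 11.1, and the function name `toolkit_supports_sm86` is made up for illustration rather than part of CryoSPARC.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a CUDA toolkit release given as "X.Y"
# is new enough to compile for sm_86. Assumes 11.1 is the minimum release.
toolkit_supports_sm86() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # first component of the remainder
    [ "$major" -gt 11 ] || { [ "$major" -eq 11 ] && [ "$minor" -ge 1 ]; }
}

toolkit_supports_sm86 "10.1" && echo "10.1: ok" || echo "10.1: too old for sm_86"
toolkit_supports_sm86 "11.7" && echo "11.7: ok" || echo "11.7: too old for sm_86"
```

The input must be in X.Y form; a bare major number such as "11" is not handled.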

Hi!
I’m getting these errors

error: command '/usr/bin/gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> pycuda

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
check_install_deps.sh: 66: ERROR: installing python failed.
Failed to update keck-cryo2! Skipping...
 -------------------------------------------------
 ---------------------------------------------------
 Done updating all worker nodes.
 If any nodes failed to update, you can manually update them.
 Cluster worker installations must be manually updated.
 
 To update manually, copy the cryosparc_worker.tar.gz file into the
 cryosparc worker installation directory, and then run 
    $ bin/cryosparcw update 
 from inside the worker installation directory.

and when I tried to update the worker manually, it said it was already up to date:

~/cryosparc/cryosparc_worker$ bin/cryosparcw update
Updating... checking versions
Current version v4.1.1 - New version v4.1.1 
Already up to date

but installing 3DFlex dependencies gives me this error

Restarting my job still gives the same error, and nvcc --version still shows CUDA 10.
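
A plain `nvcc --version` reports whichever nvcc is first on the login shell's PATH, which may differ from the one the worker environment uses. The sketch below pulls the release number out of `nvcc --version` style output so the two can be compared. The function name `nvcc_release` is hypothetical, and the sample line only mimics the usual output format.

```shell
#!/bin/sh
# Hypothetical helper: extract the "release X.Y" number from text in the
# format printed by `nvcc --version`, read from standard input.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Sample line for illustration; on a real worker one would instead pipe in
# the output of `nvcc --version` or `cryosparcw call nvcc --version`.
echo "Cuda compilation tools, release 10.1, V10.1.243" | nvcc_release
```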

Notwithstanding the “PyTorch not installed correctly, or NVIDIA GPU not detected.” warning, motion correction and other CryoSPARC GPU jobs may still work.
Have you tried the motion correction job again?

I did, and I’m still getting the same error.

I am sorry this is still not working for you.
Please can you email us the job report for this attempt also, along with the output of the command
cryosparcw call /usr/bin/env
Thanks.

Thank you for sending the logs.
Please can you post the output of these commands

/home/keck/cryosparc/cryosparc_worker/bin/cryosparcw call which nvcc
echo "" | /home/keck/cryosparc/cryosparc_worker/bin/cryosparcw call nvcc -v -E -
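
When several toolkits are installed, it can help to see every nvcc on the search path rather than only the first one that `which` reports. The following is a minimal sketch of that idea; `list_nvcc_on_path` is a hypothetical name, not a CryoSPARC or CUDA command.

```shell
#!/bin/sh
# Hypothetical helper: print every executable nvcc found on a PATH-style
# string, in lookup order. The first line printed corresponds to the nvcc
# that `which nvcc` would report for that PATH.
list_nvcc_on_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        # An empty PATH entry means the current directory.
        [ -x "${dir:-.}/nvcc" ] && echo "${dir:-.}/nvcc"
    done
    IFS=$old_ifs
}

list_nvcc_on_path "$PATH"
```

To see the worker environment's view rather than the login shell's, one option would be to save this as a script and run it through `cryosparcw call`, in the same way as the commands above.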

Hi,
this is the output of those commands

keck@keck-cryo2:~$ /home/keck/cryosparc/cryosparc_worker/bin/cryosparcw call which nvcc
/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/nvcc
keck@keck-cryo2:~$ echo "" | /home/keck/cryosparc/cryosparc_worker/bin/cryosparcw call nvcc -v -E -
#$ _NVVM_BRANCH_=nvvm
#$ _SPACE_= 
#$ _CUDART_=cudart
#$ _HERE_=/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin
#$ _THERE_=/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_SIZE_=64
#$ TOP=/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/..
#$ NVVMIR_LIBRARY_DIR=/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/../nvvm/libdevice
#$ LD_LIBRARY_PATH=/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/../lib:/usr/local/cuda/lib64:/home/keck/cryosparc/cryosparc_worker/deps/external/cudnn/lib
#$ PATH=/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/../nvvm/bin:/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/usr/local/cuda/bin:/home/keck/cryosparc/cryosparc_worker/bin:/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/home/keck/cryosparc/cryosparc_worker/deps/anaconda/condabin:/home/keck/cryosparc/cryosparc_master/bin:/home/keck/cryosparc/cryosparc_master/bin:/home/keck/cryosparc/cryosparc_master/bin:/home/keck/cryosparc/cryosparc_master/bin:/home/keck/cryosparc/cryosparc_master/bin:/home/keck/cryosparc/cryosparc_master/bin:/home/keck/cryosparc/cryosparc_master/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin
#$ INCLUDES="-I/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/../include"  
#$ LIBRARIES=  "-L/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/../lib/stubs" "-L/home/keck/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/../lib"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
nvcc fatal   : Don't know what to do with '/tmp/tmpxft_0000df21_00000000-1_stdin'

I was able to get the patch motion correction job running after doing the following:
  1. Updated Ubuntu from 20.04 to 22.04
  2. Purged CUDA
  3. Reinstalled the NVIDIA CUDA toolkit
  4. Reinstalled the driver
  5. Installed NVIDIA utils, which in turn removed the NVIDIA CUDA toolkit
(If I try to reinstall the NVIDIA CUDA toolkit, it says it will remove NVIDIA utils. I don’t know what to do at this point.)
Right now, if I run the command

nvcc --version

it says:

Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit

What do you think? Will I have problems running my jobs in the future?

If CryoSPARC motion correction is working, I would hold off on installing a “system-wide” CUDA toolkit.
Should you require a CUDA toolkit for purposes other than CryoSPARC processing, consider installing the toolkit as a non-root user, under a custom path. This can be achieved through a “runfile” installation using the --toolkit, --toolkitpath=, and --defaultroot= options.
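
As a rough illustration of such an installation, the sketch below only prints a candidate command instead of running it. The runfile name and install prefix are placeholders, `print_toolkit_install_cmd` is a hypothetical helper, and the added `--silent` flag (for a non-interactive run) is an assumption beyond the options named above.

```shell
#!/bin/sh
# Dry run: print (do not execute) a runfile command that installs only the
# CUDA toolkit under a user-owned prefix. Both arguments are placeholders.
print_toolkit_install_cmd() {
    runfile=$1   # path to the downloaded runfile installer (hypothetical)
    prefix=$2    # custom install path writable by the non-root user
    echo "sh $runfile --silent --toolkit --toolkitpath=$prefix --defaultroot=$prefix"
}

print_toolkit_install_cmd ./cuda_linux.run "$HOME/cuda-toolkit"
```

Printing the command first makes it easy to confirm the target path before anything is written to disk.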
