Extract from Micrographs cufftExecFailed error

Bruk · August 29, 2018, 6:22pm

Hi,

All of my 2D classification jobs are failing with the following error:

Traceback (most recent call last):
File "cryosparc2_worker/cryosparc2_compute/run.py", line 69, in cryosparc2_compute.run.main
File "cryosparc2_compute/jobs/jobregister.py", line 260, in get_run_function
runmod = importlib.import_module("…"+modname, name)
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "cryosparc2_worker/cryosparc2_compute/jobs/class2D/run.py", line 15, in init cryosparc2_compute.jobs.class2D.run
File "cryosparc2_compute/engine/__init__.py", line 8, in <module>
from engine import *
File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 11, in init cryosparc2_compute.engine.engine
File "cryosparc2_worker/cryosparc2_compute/engine/gfourier.py", line 6, in init cryosparc2_compute.engine.gfourier
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 13, in <module>
from . import cufft
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 238, in <module>
_libcufft.cufftSetCompatibilityMode.restype = int
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/ctypes/__init__.py", line 379, in __getattr__
func = self.__getitem__(name)
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/ctypes/__init__.py", line 384, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/local/cuda/lib64/libcufft.so: undefined symbol: cufftSetCompatibilityMode

stephan · October 12, 2018, 4:25pm

Hey @Bruk,

Can you switch to CUDA 8.0 and reinstall your worker node with it? See the related thread "AttributeError: undefined symbol: cufftSetCompatibilityMode (V2)".
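
Editor's note: the AttributeError in the traceback above means that the libcufft.so found at import time does not export cufftSetCompatibilityMode, which skcuda references when it is loaded. A minimal sketch for checking a given CUDA install before reinstalling the worker is shown here. The helper name library_exports is hypothetical (it is not part of cryoSPARC or skcuda), and the path is simply the one reported in the traceback, so adjust it for your system.

```python
import ctypes


def library_exports(lib_path, symbol):
    """Report whether the shared library at lib_path exports symbol.

    Returns True or False when the library loads, and None when the
    library itself cannot be loaded (for example, a wrong path).
    Hypothetical helper for illustration only.
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return None
    # ctypes resolves symbols lazily on attribute access and raises
    # AttributeError for a missing one; hasattr() turns that into False.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Path copied from the traceback above; change it to match your install.
    path = "/usr/local/cuda/lib64/libcufft.so"
    result = library_exports(path, "cufftSetCompatibilityMode")
    print("{}: {}".format(path, result))
```

A result of False reproduces the condition behind the AttributeError, True means this particular import should succeed, and None means the library was not found at that path.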

Bruk · October 16, 2018, 6:07pm

Hi,

I had already fixed the 2D classification problem by installing the worker node with CUDA 9.1. However, I now get a similar error in "Extract from Micrographs" jobs (see below), and reinstalling the worker node with CUDA 8.0 does not help.

Starting multithreaded pipeline …

Started pipeline

GPU 0 using a batch size of 1024

– 0.0: processing J1/imported/Frame2_0041.mrc

– 0.1: processing J1/imported/Frame2_0042.mrc

Traceback (most recent call last):
File "cryosparc2_compute/jobs/runcommon.py", line 747, in run_with_except_hook
run_old(*args, **kw)
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "cryosparc2_compute/jobs/pipeline.py", line 53, in stage_target
work = processor.process(item)
File "cryosparc2_compute/jobs/extract/run.py", line 268, in process
cuda_dev = self.cuda_dev, ET=self.ET, timer=timer, mic_idx=mic_idx, batch_size = self.batch_size)
File "cryosparc2_compute/jobs/extract/extraction_gpu.py", line 208, in do_extract_particles_single_mic_gpu
skcuda.fft.ifft( ET.output_f_gpu[:batch_size], ET.output_gpu[:batch_size], ifft_plan ,scale=True)
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 298, in ifft
return _fft(x_gpu, y_gpu, plan, cufft.CUFFT_INVERSE, y_gpu.size/plan.batch)
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 198, in _fft
int(y_gpu.gpudata))
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 301, in cufftExecC2R
cufftCheckStatus(status)
File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 110, in cufftCheckStatus
raise cufftExceptions[status]
cufftExecFailed
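
Editor's note: the bare cufftExecFailed at the end of the traceback is the exception that skcuda raises when a cuFFT call returns a non-zero status. Its name matches CUFFT_EXEC_FAILED (status 6 in the cufftResult enum of cufft.h), which NVIDIA describes as cuFFT failing to execute an FFT on the GPU. The sketch below only illustrates the status-to-exception pattern visible in the traceback. CufftError and check_status are hypothetical names and this is not skcuda's actual code.

```python
# Status codes as defined by the cufftResult enum in cufft.h.
CUFFT_STATUS_NAMES = {
    0: "CUFFT_SUCCESS",
    1: "CUFFT_INVALID_PLAN",
    2: "CUFFT_ALLOC_FAILED",
    3: "CUFFT_INVALID_TYPE",
    4: "CUFFT_INVALID_VALUE",
    5: "CUFFT_INTERNAL_ERROR",
    6: "CUFFT_EXEC_FAILED",
    7: "CUFFT_SETUP_FAILED",
    8: "CUFFT_INVALID_SIZE",
    9: "CUFFT_UNALIGNED_DATA",
}


class CufftError(Exception):
    """Hypothetical exception type used only in this sketch."""


def check_status(status):
    """Raise CufftError for any non-zero cuFFT status code."""
    if status != 0:
        name = CUFFT_STATUS_NAMES.get(status, "unknown status")
        raise CufftError("{} (status {})".format(name, status))


if __name__ == "__main__":
    check_status(0)  # success: returns without raising
    try:
        check_status(6)
    except CufftError as exc:
        print(exc)
```

The status alone says that execution failed on the GPU, not why, which is consistent with the cause later being traced to a kernel bug and not to the CUDA version.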

stephan · October 16, 2018, 9:46pm

Hi @Bruk,

We are currently working on a fix for this issue. For the time being, you can run the job in CPU-only mode by specifying “0” in the “Number of GPUs to parallelize (0 for CPU-only)” option.

ZhijieLi · November 18, 2018, 8:54pm

Hi,

I just got the same error with cryoSPARC v2.4.2 (I have only ever installed CUDA 8.0).

The error seems to depend on the specified box size: with a box size of 128 it occurs, and with 160 it goes away.

My error message:
Traceback (most recent call last):
File "cryosparc2_compute/jobs/runcommon.py", line 747, in run_with_except_hook
run_old(*args, **kw)
File "/home/local/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "cryosparc2_compute/jobs/pipeline.py", line 53, in stage_target
work = processor.process(item)
File "cryosparc2_compute/jobs/extract/run.py", line 268, in process
cuda_dev = self.cuda_dev, ET=self.ET, timer=timer, mic_idx=mic_idx, batch_size = self.batch_size)
File "cryosparc2_compute/jobs/extract/extraction_gpu.py", line 208, in do_extract_particles_single_mic_gpu
skcuda.fft.ifft( ET.output_f_gpu[:batch_size], ET.output_gpu[:batch_size], ifft_plan ,scale=True)
File "/home/local/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 298, in ifft
return _fft(x_gpu, y_gpu, plan, cufft.CUFFT_INVERSE, y_gpu.size/plan.batch)
File "/home/local/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 198, in _fft
int(y_gpu.gpudata))
File "/home/local/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 301, in cufftExecC2R
cufftCheckStatus(status)
File "/home/local/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 110, in cufftCheckStatus
raise cufftExceptions[status]
cufftExecFailed

stephan · December 10, 2018, 9:43pm

Hi @ZhijieLi,

Do you still get this failure?

stephan · December 21, 2018, 5:38pm

Hi Everyone,

It turns out there was a bug in one of our CUDA kernels, which we have now fixed. We will soon release a new version of cryoSPARC that includes this fix, and I will update this post with the details once it is out.

stephan · January 11, 2019, 5:55pm

Hi @ZhijieLi, @Bruk,

The fix for this bug has been released in v2.5.

Thanks!

ZhijieLi · January 11, 2019, 6:45pm

Hi Stephan,
Thanks! Extracting with 256 pix now works!
Zhijie