skcuda.cufft.cufftAllocFailed error in Extract Particles

Hi all,

I always get an error like the one below when extracting particles from micrographs. I don’t think I am running out of GPU memory, because my GPU is an RTX 8000 and the memory was not used up. Is there some other problem? Could you give me some advice? Thanks very much.

[CPU: 1.62 GB]   Traceback (most recent call last):
  File "/home/spider/software/cryospooarc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 1726, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/spider/software/cryospooarc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/spider/software/cryospooarc/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 86, in stage_target
    work = processor.exec(item)
  File "/home/spider/software/cryospooarc/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 43, in exec
    return self.process(item)
  File "/home/spider/software/cryospooarc/cryosparc_worker/cryosparc_compute/jobs/extract/run.py", line 469, in process
    update_alignments3D=update_alignments3D)
  File "/home/spider/software/cryospooarc/cryosparc_worker/cryosparc_compute/jobs/extract/extraction_gpu.py", line 149, in do_extract_particles_single_mic_gpu
    stream=stream)
  File "/home/spider/software/cryospooarc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/fft.py", line 127, in __init__
    onembed, ostride, odist, self.fft_type, self.batch)
  File "/home/spider/software/cryospooarc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cufft.py", line 742, in cufftMakePlanMany
    cufftCheckStatus(status)
  File "/home/spider/software/cryospooarc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cufft.py", line 117, in cufftCheckStatus
    raise e
skcuda.cufft.cufftAllocFailed

Hi @wonderful, could you please run the nvidia-smi command on the machine where you’re seeing this issue and send me the output?

I have seen this message before, but I didn’t get a snapshot. I did read the output of this command, and I found that one GPU stops working while the others keep working. I don’t know why a GPU sometimes stops working after the job has been running for a while.
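
Since the failure only shows up after the job has been running for a while, one way to catch it is to log nvidia-smi snapshots to a file while the job runs. Below is a minimal sketch, assuming nvidia-smi is on the PATH of the worker machine; the helper names (parse_gpu_csv, query_gpus, log_gpus) and the log path are made up for illustration and are not part of cryoSPARC.

```python
import csv
import io
import subprocess
import time

# Properties requested from nvidia-smi via --query-gpu.
FIELDS = ["timestamp", "index", "name", "memory.used", "memory.total", "utilization.gpu"]


def parse_gpu_csv(text):
    """Turn nvidia-smi CSV output (no header, no units) into a list of dicts."""
    rows = []
    for values in csv.reader(io.StringIO(text), skipinitialspace=True):
        # Skip blank or malformed lines instead of failing on them.
        if len(values) == len(FIELDS):
            rows.append(dict(zip(FIELDS, (v.strip() for v in values))))
    return rows


def query_gpus():
    """Run nvidia-smi once and return one dict per visible GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


def log_gpus(path, interval_s=5.0, samples=None):
    """Append one line per GPU to `path` every `interval_s` seconds.

    Hypothetical helper: runs until interrupted when `samples` is None.
    """
    taken = 0
    while samples is None or taken < samples:
        with open(path, "a") as fh:
            for row in query_gpus():
                fh.write(",".join(row[f] for f in FIELDS) + "\n")
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_s)
```

Calling log_gpus("gpu_log.csv") in a separate terminal before starting Extract Particles should leave a record of memory use (in MiB) per GPU up to the moment the job fails, which can then be posted in place of a live snapshot.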

I’m trying to see what version of CUDA you are running. If you have CUDA 11, try switching to CUDA 10 instead and see if that fixes it.
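
If it helps, the installed CUDA toolkit version can be read from the output of nvcc --version. Here is a rough sketch, assuming nvcc is on the PATH; note that the nvcc found on the PATH is not necessarily the toolkit the cryoSPARC worker was configured with, so you may need to pass the full path to that toolkit's nvcc. The function names are made up for illustration.

```python
import re
import subprocess


def parse_nvcc_release(text):
    """Return (major, minor) from `nvcc --version` output, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_version(nvcc="nvcc"):
    """Run nvcc and return the CUDA toolkit release as (major, minor).

    `nvcc` can be a full path to the binary of a specific toolkit install.
    """
    out = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=True
    ).stdout
    return parse_nvcc_release(out)
```

The "CUDA Version" shown in the nvidia-smi header is the highest version the driver supports, not the installed toolkit, so the two numbers can differ.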