Error during homogeneous refinement

  File "cryosparc2_compute/jobs/runcommon.py", line 747, in run_with_except_hook
    run_old(*args, **kw)
  File "cryosparc2_worker/cryosparc2_compute/engine/cuda_core.py", line 101, in cryosparc2_compute.engine.cuda_core.GPUThread.run
  File "cryosparc2_worker/cryosparc2_compute/engine/cuda_core.py", line 102, in cryosparc2_compute.engine.cuda_core.GPUThread.run
  File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 987, in cryosparc2_compute.engine.engine.process.work
  File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 107, in cryosparc2_compute.engine.engine.EngineThread.load_image_data_gpu
  File "cryosparc2_worker/cryosparc2_compute/engine/gfourier.py", line 33, in cryosparc2_compute.engine.gfourier.fft2_on_gpu_inplace
  File "/data/CRYOSPARC/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 126, in __init__
    onembed, ostride, odist, self.fft_type, self.batch)
  File "/data/CRYOSPARC/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 741, in cufftMakePlanMany
    cufftCheckStatus(status)
  File "/data/CRYOSPARC/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 116, in cufftCheckStatus
    raise e
cufftAllocFailed

Any idea what might be causing this?
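
For reference, the failing call is the cuFFT plan creation inside skcuda, and cufftAllocFailed means the plan could not allocate its GPU workspace. Below is a minimal standalone sketch of that same kind of call (the 600×600 image size matches my box size, but the batch size of 1024 is just a placeholder, not what cryoSPARC actually uses), together with a free-memory readout, in case that helps narrow things down:

```python
import numpy as np
import pycuda.autoinit          # creates a CUDA context on GPU 0
import pycuda.driver as cuda
import skcuda.fft as cu_fft

# How much GPU memory is free before we ask cuFFT for a plan?
free_b, total_b = cuda.mem_get_info()
print("GPU memory: %.2f GB free of %.2f GB" % (free_b / 1e9, total_b / 1e9))

# Batched 2D real-to-complex plan, similar in spirit to what the engine
# builds in gfourier.fft2_on_gpu_inplace. cuFFT typically needs workspace
# on the order of the transform data itself; if that does not fit in the
# remaining memory, plan creation raises cufftAllocFailed.
plan = cu_fft.Plan((600, 600), np.float32, np.complex64, batch=1024)

free_b, _ = cuda.mem_get_info()
print("GPU memory after plan creation: %.2f GB free" % (free_b / 1e9))
```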

Hi @Omid,

Can you give us more information? The GPU you are using, the CUDA version, and the parameters of your refinement job (box size, symmetry, etc.) would be helpful!

Hi @stephan

I am running a GPU workstation with 4 RTX 2080 cards and CUDA 10, using cryoSPARC v2.4.6. The refinement uses C4 symmetry and a box size of 600; I left all other parameters at their defaults.

Hi @Omid,

Is it possible for you to update cryoSPARC to the latest version?
Also, can you try a smaller box size (320, 400, 448, or 512) to see if that's the problem?
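
In the meantime, it can help to watch how much GPU memory the job actually uses as it runs. A minimal watcher that just polls nvidia-smi once a second (nothing cryoSPARC-specific, and the one-second interval is arbitrary):

```python
import subprocess
import time

# Poll nvidia-smi once per second and print per-GPU memory use, so you can
# see how close the refinement gets to the card's memory limit before it fails.
while True:
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,memory.used,memory.total",
        "--format=csv,noheader",
    ])
    print("%s\n%s" % (time.strftime("%H:%M:%S"), out.decode().strip()))
    time.sleep(1)
```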

Hi @stephan,

I think I am running the latest version, but I can double-check when I get back from a conference; I only installed cryoSPARC on this workstation a little over a month ago. Using a smaller box size does work, but I don't understand why a box size of 600 is too big for an 11 GB GPU. I can also see that each iteration is much better with the larger box size, yet the refinement stops after 3 iterations with the error above or an out-of-memory error. It must be some other issue, but it is beyond my expertise.

After some troubleshooting, the problem seems to occur during dynamic masking. Going to a box size of 546 now crashes the refinement at an earlier stage, but a box size of 520 is fine.
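
For anyone who hits the same thing, here is the rough arithmetic that convinced me the larger boxes simply run out of headroom. It only counts single-precision 3D volumes of the box size; exactly how many such buffers the refinement (and dynamic masking in particular) keeps resident on the GPU is my assumption, not something taken from the cryoSPARC code:

```python
GPU_MEM_GB = 11.0  # reported memory of the card

for box in (520, 546, 600):
    # One real-space volume in single precision (4 bytes per voxel).
    real_gb = box ** 3 * 4 / 1e9
    # One complex single-precision volume (e.g. a Fourier-space map).
    cplx_gb = box ** 3 * 8 / 1e9
    # Hypothetical: assume a handful of such volumes live on the GPU at once
    # (half-maps, mask, FFT scratch, ...); 5 is a guess, not a measured count.
    est_gb = 5 * cplx_gb
    print("box %d: real %.2f GB, complex %.2f GB, ~%.1f GB if 5 volumes "
          "are resident (GPU has %.0f GB)" % (box, real_gb, cplx_gb, est_gb, GPU_MEM_GB))
```

On those assumptions a 600 box already approaches the card's capacity before cuFFT workspace and the particle image batches are counted, while 520 leaves a few GB of slack, which would be consistent with the crashes starting during dynamic masking.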