Homogeneous refinement cufftAllocFailed

Shinpei · May 30, 2020, 3:15pm

Hi cryoSPARC team,

My jobs ran successfully up to the homogeneous refinement step, where cryoSPARC showed an error message, but I can’t find what I did wrong.

Another discussion said that cufftAllocFailed means the GPU ran out of memory. Our GPU has only 8 GB, which is indeed lower than you recommend, but I would like to know why the job failed instead of just running more slowly.

Or was the failure caused by something else?

Here is some information about our system, followed by the error message:

• cryoSPARC version: v2.15.0
• CUDA version: 9.2
• OS: Ubuntu (Linux Mint)
• Jobs: P2 J16

• CPU: Intel® Core™ i7-8700 CPU @ 3.20GHz
• GPU: Nvidia GeForce GTX 1070 Ti
• HDD: 1 TB
• SSD: X
• RAM: DDR4, 64 GB

License is valid.

Launching job on lane default target yulab-i7-server ...

Running job on master node hostname yulab-i7-server

[CPU: 98.8 MB]   Project P2 Job J16 Started

[CPU: 98.9 MB]   Master running v2.15.0, worker running v2.15.0

[CPU: 99.1 MB]   Running on lane default

[CPU: 99.1 MB]   Resources allocated: 

[CPU: 99.1 MB]     Worker:  yulab-i7-server

[CPU: 99.1 MB]     CPU   :  [0, 1, 2, 3]

[CPU: 99.1 MB]     GPU   :  [0]

[CPU: 99.1 MB]     RAM   :  [0, 1, 2]

[CPU: 99.1 MB]     SSD   :  False

[CPU: 99.1 MB]   --------------------------------------------------------------

[CPU: 99.1 MB]   Importing job module for job type homo_refine...

[CPU: 206.3 MB]  Job ready to run

[CPU: 206.3 MB]  ***************************************************************

[CPU: 267.7 MB]  Using random seed of 1268114094

[CPU: 267.7 MB]  Loading a ParticleStack with 34835 items...

[CPU: 303.5 MB]    Done.

[CPU: 303.5 MB]  Windowing particles

[CPU: 311.3 MB]    Done.

[CPU: 314.6 MB]  ====== Refinement ======

[CPU: 314.6 MB]    Refining Structure with volume size 640.

[CPU: 5.45 GB]     Starting at initial resolution 30.000A (radwn 18.560). 

[CPU: 8.40 GB]     Estimating scale of initial reference. 

[CPU: 8.69 GB]     Rescaling initial reference by a factor of 1.095 

[CPU: 8.69 GB]     Estimating scale of initial reference. 

[CPU: 8.69 GB]     Rescaling initial reference by a factor of 0.996 

[CPU: 8.69 GB]     Estimating scale of initial reference. 
[CPU: 8.69 GB]     Rescaling initial reference by a factor of 0.994

[CPU: 17.80 GB]  Traceback (most recent call last):
  File "cryosparc2_compute/jobs/runcommon.py", line 1685, in run_with_except_hook
    run_old(*args, **kw)
  File "cryosparc2_worker/cryosparc2_compute/engine/cuda_core.py", line 110, in cryosparc2_compute.engine.cuda_core.GPUThread.run
  File "cryosparc2_worker/cryosparc2_compute/engine/cuda_core.py", line 111, in cryosparc2_compute.engine.cuda_core.GPUThread.run
  File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 991, in cryosparc2_compute.engine.engine.process.work
  File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 109, in cryosparc2_compute.engine.engine.EngineThread.load_image_data_gpu
  File "cryosparc2_worker/cryosparc2_compute/engine/gfourier.py", line 33, in cryosparc2_compute.engine.gfourier.fft2_on_gpu_inplace
  File "/home/yirbigbird/cryosparc_user/software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 127, in __init__
    onembed, ostride, odist, self.fft_type, self.batch)
  File "/home/yirbigbird/cryosparc_user/software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 742, in cufftMakePlanMany
    cufftCheckStatus(status)
  File "/home/yirbigbird/cryosparc_user/software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 117, in cufftCheckStatus
    raise e
cufftAllocFailed

I am looking forward to your reply.

Feng10 · June 1, 2020, 5:31am

Hi @Shinpei,

cufftAllocFailed looks like a memory issue. Your GTX 1070 Ti has only 8 GB of memory, which I guess is not enough for this job. Maybe using fewer particles, a smaller box size, or more binning would help?

Best,
Feng10
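
To make the binning suggestion concrete, here is a minimal sketch of the trade-off: cropping the box to a smaller size at the same field of view enlarges the pixel size, and the best attainable resolution (Nyquist) is twice the pixel size. The pixel size of 1.0 Å used below is purely hypothetical, since the thread does not state the real value, and the function names are made up for illustration.

```python
def binned_pixel_size(pixel_size_a, old_box, new_box):
    """Pixel size (in Angstrom) after downsampling old_box -> new_box
    while keeping the same physical field of view."""
    return pixel_size_a * old_box / new_box


def nyquist_resolution(pixel_size_a):
    """Best attainable resolution (in Angstrom) for a given pixel size."""
    return 2.0 * pixel_size_a


# Hypothetical starting pixel size; the real value is not given in the thread.
start_pixel_size = 1.0

for new_box in (640, 480, 320):
    px = binned_pixel_size(start_pixel_size, 640, new_box)
    print(f"box {new_box}: {px:.3f} A/px, Nyquist {nyquist_resolution(px):.3f} A")
```

If the resolution expected for the dataset is well below the Nyquist limit of the binned data, binning costs nothing in map quality while shrinking every array the job has to hold.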

stephan · June 9, 2020, 2:24pm

Hi @Shinpei,

It seems that a box size of 640 px can’t fit on the 8 GB GPU. As @Feng10 suggested, could you try reconstructing this volume with a smaller box size?
Also, this job is not built to fall back to a slower code path when it runs out of memory. Sorry!
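
As a rough back-of-the-envelope check of why a 640 px box is tight on 8 GB, the sketch below computes the size of a single cubic volume in single precision. It is only an estimate: how many such buffers cryoSPARC actually keeps on the GPU is not stated in this thread, and the helper name `volume_bytes` is made up for illustration.

```python
GIB = 1024 ** 3  # bytes per GiB


def volume_bytes(box_size, bytes_per_voxel):
    """Bytes needed for one cubic volume of box_size^3 voxels."""
    return box_size ** 3 * bytes_per_voxel


for box in (640, 512, 320):
    real = volume_bytes(box, 4)   # float32, real-space volume
    cplx = volume_bytes(box, 8)   # complex64, Fourier-space volume
    print(f"box {box}: float32 {real / GIB:.2f} GiB, "
          f"complex64 {cplx / GIB:.2f} GiB")
```

Memory scales with the cube of the box size, so halving the box from 640 to 320 cuts each volume by a factor of 8. A single 640 px volume is already close to 1 GiB in float32 and about twice that in complex64, so a handful of working copies plus FFT plan workspace could plausibly exhaust an 8 GB card.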

Shinpei · June 12, 2020, 4:02pm

Hi @stephan @Feng10 ,

Thanks for your suggestion and help!

I used a smaller box size and the job completed.
We will upgrade our hardware, which should improve the situation considerably.

Again, thank you for your replies.