Heterogeneous refinement fails with cufftAllocFailed error

Hi,

I am getting an error in heterogeneous refinement.
Other jobs (NU-refinement, homogeneous refinement) run without any problems.

[CPU: 7.67 GB]   Traceback (most recent call last):
  File "cryosparc2_compute/jobs/runcommon.py", line 1685, in run_with_except_hook
    run_old(*args, **kw)
  File "cryosparc2_worker/cryosparc2_compute/engine/cuda_core.py", line 110, in cryosparc2_compute.engine.cuda_core.GPUThread.run
  File "cryosparc2_worker/cryosparc2_compute/engine/cuda_core.py", line 111, in cryosparc2_compute.engine.cuda_core.GPUThread.run
  File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 991, in cryosparc2_compute.engine.engine.process.work
  File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 109, in cryosparc2_compute.engine.engine.EngineThread.load_image_data_gpu
  File "cryosparc2_worker/cryosparc2_compute/engine/gfourier.py", line 33, in cryosparc2_compute.engine.gfourier.fft2_on_gpu_inplace
  File "/home/cryosparc_user/cryosparc2-40000/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 127, in __init__
    onembed, ostride, odist, self.fft_type, self.batch)
  File "/home/cryosparc_user/cryosparc2-40000/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 742, in cufftMakePlanMany
    cufftCheckStatus(status)
  File "/home/cryosparc_user/cryosparc2-40000/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 117, in cufftCheckStatus
    raise e
cufftAllocFailed

And this is the job metadata:

  "last_exported": "2020-07-08T08:34:43.401Z",
  "queued_to_hostname": false,
  "queued_to_gpu": false,
  "no_check_inputs_ready": false,
  "num_tokens": 1,
  "job_sig": "18960100982852638736897948817806432518007455997845772495145924392144658910508188792244347481040487282790693089305083206744700499961214847646295490689138477244133550298171021657604570455425491255603469262140089135582372593715871919356527718316079778323970782432751360767905444012670867751235141597927927116527819044675468924162744508020147336741396405480808909664982366292909865808710132638331751987985275422317972190490949623944052400744872184701017850807391589472904098917068908109353636963693506296647223047105073030068537080834575746637725084851025560707300013924381696832014358093587714084003258883609686211108239",
  "tokens_acquired_at": 1594197313.790468,
  "status_num": 40,
  "progress": []
}

Can someone tell me why this error happened?
Thank you!

Also, I checked that heterogeneous refinement with up to 6 classes runs without problems.
The error seems to appear with 7 or more classes.

Hi @KSJ, GPU memory use for Heterogeneous refinement increases linearly as you add more classes. This error happens when cryoSPARC can’t allocate the memory it needs on the GPU to perform the full refinement.

We’re always working on improving memory use in cryoSPARC, so you may be able to use more classes in future versions of cryoSPARC. Until then, the only thing I can suggest is using fewer classes (as you’ve already done) or using a GPU with more memory.
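
As a rough illustration of the linear scaling described above, here is a minimal sketch of how one might estimate the largest class count that fits on a given GPU. It assumes memory use can be approximated as a fixed overhead plus a constant cost per class. The function names and all the numbers are hypothetical, chosen only for illustration; they are not measured from cryoSPARC and are not part of its API.

```python
def estimated_gpu_memory_mib(num_classes, base_mib, per_class_mib):
    """Assumed linear model: fixed overhead plus a constant cost per class."""
    return base_mib + num_classes * per_class_mib


def max_classes_that_fit(gpu_memory_mib, base_mib, per_class_mib):
    """Largest class count whose estimated footprint fits in GPU memory."""
    available = gpu_memory_mib - base_mib
    if available < 0:
        return 0
    return int(available // per_class_mib)


# Hypothetical values for illustration only (not cryoSPARC measurements).
GPU_MEMORY_MIB = 11000  # total memory on the card
BASE_MIB = 2000         # assumed fixed overhead
PER_CLASS_MIB = 1400    # assumed extra memory per class

limit = max_classes_that_fit(GPU_MEMORY_MIB, BASE_MIB, PER_CLASS_MIB)
print("Estimated maximum classes:", limit)  # prints 6 for these numbers
```

To get real values for your own setup, you could watch GPU memory with `nvidia-smi` while running jobs with two different class counts and fit the two parameters from those readings.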

Hope that helps,

Nick

@nfrasser

It was very helpful!

Thank you for the reply :blush: