pycuda._driver.MemoryError: cuMemHostAlloc failed: out of memory in v3.0

Hey @CleoShen,

Is there any way you can re-install cryoSPARC on an OS like Ubuntu? So far, that’s the most reliable way to get rid of this error. If that’s not possible, then we have a (potential) fix coming out in the next release, which should be very soon.


Thank you for your advice. A naive question: how do I manually update cryoSPARC once you release the new version?

Hi @CleoShen,

You can update to any version of cryoSPARC by running the command cryosparcm update --version=<cryoSPARC version>
More information here: https://guide.cryosparc.com/setup-configuration-and-management/software-updates
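As a sketch only (the version string below is an example, not a recommendation; substitute the release you actually want, and run the command from the cryoSPARC master node where cryosparcm is on the PATH):

```shell
# Example only: substitute the version you actually want to install.
VERSION="v3.3.1"

# Compose the update command; on a real install you would run it directly
# from the cryoSPARC master node rather than just echoing it.
CMD="cryosparcm update --version=${VERSION}"
echo "$CMD"
```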

Hi Stephan and others in this group,
I tested v3.3.1 with a small T20S data set on a workstation (CentOS 7.5) containing 4 RTX 3090 GPUs. The test went well. But when I imported a larger set of particles from a Relion job (~370,000 with a box of 432 x 432), 2D classification failed after 4 runs with “pycuda._driver.MemoryError: cuMemHostAlloc failed: out of memory”. Each GPU has 24 GB of memory, with Driver Version 460.27.04 and CUDA Version 11.2. GPUs 1-3 were used and memory usage was relatively low most of the time (less than 10 GB).
I wonder if you have ever encountered the same error. GPU0 was not used because it runs the X server.
Thanks for your attention.
Qiu-Xing

I’m also on v3.3.1 +220315 on CentOS 7 and am now continuously getting “pycuda._driver.MemoryError: cuMemHostAlloc failed: out of memory” on small jobs (class2D, ab initio, and homo refine with binned particles). These jobs may complete normally when cloned. nvidia-smi indicates less than 1 GB of memory used at the time of failure, with fan at 47%, temperature at 52 °C, and power at 123/350 W for a 3080 Ti.

I’ve been running stably on 3.3.1 +220315 for months and the failure rate has increased a lot recently despite no update to software or drivers.

@user123 @jiangq9992003
cuMemHostAlloc relates to host (rather than GPU) memory.
Do your cryosparc_worker/config.sh files define
export CRYOSPARC_NO_PAGELOCK=true?
Please see CUDA memory error during 2D classification - #9 by spunjani for a related discussion.
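For reference, a minimal sketch of that workaround (the CONFIG path below is an assumption based on a standard worker layout; point it at your actual cryosparc_worker/config.sh and restart cryoSPARC afterwards):

```shell
# Sketch: add the page-lock workaround to the worker config if it is not
# already defined. The path is an assumption; adjust for your install.
CONFIG="cryosparc_worker/config.sh"
mkdir -p "$(dirname "$CONFIG")" && touch "$CONFIG"   # for this demo only

# Append the export line only if CRYOSPARC_NO_PAGELOCK is not mentioned yet.
grep -q 'CRYOSPARC_NO_PAGELOCK' "$CONFIG" || \
    echo 'export CRYOSPARC_NO_PAGELOCK=true' >> "$CONFIG"
```

With this set, cryoSPARC avoids page-locked (pinned) host allocations via cuMemHostAlloc, which is the call failing in the errors above.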


Hi,
I would like to know if someone found a solution for this ‘out of memory’ issue.
I have encountered the same error during ab initio reconstruction with v4.2.1 (CUDA 11.3, CentOS 7).
I have ~3 million particles with a box size of 300.
Thanks!

Please post the text of the error message(s) and traceback(s) from the Event Log
and job log (Metadata|Log).
If the error you observed was precisely cuMemHostAlloc, please also see the CRYOSPARC_NO_PAGELOCK suggestion earlier in this thread.

@wtempel
Thank you so much for your answer.
Adding export CRYOSPARC_NO_PAGELOCK=true in cryosparc_worker/config.sh worked for me; it is now running without errors.