3dva: cuda_error_out_of_memory

Hi, I have been getting an out of memory error when running 3DVA:
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY

The job runs through the two initial reconstruction steps without issue; the error appears at the start of iteration 0.

I updated all drivers and upgraded to the latest CryoSPARC hoping it would solve the issue, but it did not help. I have two RTX 2070 GPUs. I don’t get this error on a newer machine that has two RTX 3090 GPUs installed.

Is there anything I can do about this, or does the RTX 2070 simply not have enough memory to run 3DVA? All other jobs run without issue (patch motion correction, patch CTF, NU refinement, etc.).

thanks!
Ben

8 GB of VRAM really struggles with complex jobs like 3DVA and 3D Flex. If the box size is small enough it might be OK, but I’m very close to decommissioning a long-serving quad-RTX 2080 box because 8 GB just isn’t enough for the more demanding tasks now. :frowning: The 2070 will have the same problem.

Yes, that’s what I thought :expressionless: I guess we will have to think about that too.
Thanks for your input! :slight_smile:

Hi again,
I spoke too soon. Actually, this dataset also crashed when I switched to our newer machine with the RTX 3090s.

It at least made it to iteration 3 before crashing. The only error message is: “Job is unresponsive - no heartbeat received in 180 seconds.” And it actually brings down CryoSPARC entirely; I need to reboot the machine before I can start CryoSPARC again.

I thought the newer machine would be fine, since I ran 3DVA on a different dataset of the same protein complex without any issue. The only differences between the two datasets are that the current dataset has a substrate bound (and that I am now using CryoSPARC v4.4 instead of v4.3). Otherwise the two datasets are very similar (similar data collection parameters, extraction box size, pixel size, etc.).

Does anyone know what the problem could be, or how I could troubleshoot this?
thanks!
Ben

What is your box size? Even on 3090s, 3D-VA will often run out of memory with box sizes >300 px (400 px is sometimes OK, but it’s a stretch). Have you tried downsampling your particles?

The box is 500… I thought about downsampling, but a 500 px box ran fine on a previous dataset with ~175,000 particles, and this dataset has only ~80,000, so I didn’t think the 500 px box would be a problem…

3D-VA with a 500 px box will regularly crash on a 3090 - I would suggest downsampling to 300 px.
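For reference, downsampling by Fourier cropping also changes the effective pixel size (and therefore the Nyquist limit) by the ratio of the box sizes. A minimal sketch of that arithmetic - the 1.0 Å/px starting pixel size below is a hypothetical value, not taken from this thread:

```python
def downsampled_apix(apix_old, n_old, n_new):
    """Effective pixel size after Fourier-cropping a box from n_old to n_new px."""
    return apix_old * n_old / n_new

# Hypothetical example: original pixel size 1.0 A/px, cropping 500 -> 300 px.
apix_new = downsampled_apix(1.0, 500, 300)
print(f"new pixel size: {apix_new:.3f} A/px, Nyquist: {2 * apix_new:.3f} A")
```

So a 500 → 300 crop at 1.0 Å/px gives roughly 1.67 Å/px, limiting the usable resolution to about 3.3 Å - usually plenty for exploring variability.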

ok thanks. I will try that :slight_smile:
But why didn’t it crash with the previous dataset at a 500 px box size, then? I ran many 3DVA jobs fiddling with the parameters, and they never crashed at 500 px before…

Also, should I rerun the consensus refinement with the downsampled particles before running 3DVA, or is it safe to feed the downsampled particles directly into the 3DVA job?

You can just downsample and run 3DVA directly; there is no need to rerun the consensus refinement.

Not sure why it didn’t crash before - maybe the number of components? - but in our hands it is not stable above a 300 px box size.
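A rough back-of-envelope of why box size dominates here: the volumes 3DVA works with grow as the cube of the box size, so 500 px vs. 300 px is a (500/300)³ ≈ 4.6× increase in per-volume memory before any per-particle workspace or FFT buffers. The sketch below is only that cubic-scaling arithmetic - the complex64 voxel size and “consensus + K components” count are assumptions, not CryoSPARC’s actual allocation pattern, so treat the numbers as a lower bound:

```python
def volume_gib(box, bytes_per_voxel=8):
    """Memory for one box^3 volume (complex64 assumed), in GiB."""
    return box**3 * bytes_per_voxel / 2**30

def rough_3dva_gib(box, n_components):
    # Consensus map + K variability components; ignores FFT scratch space,
    # per-particle workspace, and framework overhead, so this is a floor,
    # not a real allocation estimate.
    return (1 + n_components) * volume_gib(box)

for box in (300, 400, 500):
    print(f"{box} px: >= {rough_3dva_gib(box, n_components=3):.2f} GiB")
```

The absolute numbers undercount real usage by a lot; the point is the scaling - the same job that fits comfortably at 300 px needs several times the memory at 500 px.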

OK, thanks a lot :slight_smile: I will do that then.
No, same number of components… strange. I guess it was my lucky day back then.