Hi, I have been getting an out of memory error when running 3DVA:
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY
The job runs through the 2 initial reconstruction steps without issue; the error starts at the beginning of iteration 0.
I updated all drivers and upgraded to the latest cryoSPARC hoping it would solve the issue, but it did not help. I have 2 RTX 2070 GPUs. I don’t get this error on a newer machine that has 2 RTX 3090 GPUs installed.
Is there anything I can do about this, or does the RTX 2070 simply not have enough memory to run 3DVA? All other jobs run without issues (patch motion and CTF, NU refinements, etc.).
8 GB of VRAM really struggles with complex jobs like 3DVA and 3D Flex. If the box size is small enough it might be OK, but I’m very close to decommissioning a long-serving quad-RTX 2080 box because 8 GB just isn’t enough for the more complex tasks now. The 2070 will suffer the same problem.
Hi again,
I spoke too soon. This sample actually crashed on our newer machine with the RTX 3090s as well.
It at least made it to iteration 3 before crashing. The only error message is: “Job is unresponsive - no heartbeat received in 180 seconds.” And it completely crashes cryoSPARC; I need to reboot the machine to start cryoSPARC again.
I thought the newer machine would be fine, since I ran 3DVA on a different dataset of the same protein complex without any issue. The only differences between the 2 datasets are that the current dataset has a substrate bound (and I am using cryoSPARC v4.4 instead of v4.3). Otherwise the 2 datasets are very similar (similar data collection parameters, extraction box size, pixel size, etc.).
Does anyone know what the problem could be, or how I could troubleshoot this?
thanks!
Ben
What is your box size? Even on 3090s, 3DVA will often run out of memory with box sizes >300 px (400 px is sometimes OK, but a stretch). Have you tried downsampling your particles?
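For intuition on why downsampling helps so much: the memory for each volume grows with the cube of the box size, and 3DVA holds several volumes (the consensus map plus one per variability component) along with FFT buffers and particle batches. The sketch below is a back-of-envelope estimate only; the function names, the assumption of float32 voxels, and the "consensus + 3 components" count are illustrative, not cryoSPARC's actual memory accounting, which is considerably larger in practice.

```python
# Rough, illustrative GPU-memory estimate for the volumes held by a 3DVA-style job.
# Real usage also includes FFT work buffers, particle batches, and framework
# overhead, so actual consumption is substantially higher than this lower bound.

def volume_bytes(box: int, dtype_bytes: int = 4) -> int:
    """Memory for one real-space volume of box**3 float32 voxels."""
    return box ** 3 * dtype_bytes

def vva_volume_estimate_gb(box: int, n_modes: int = 3) -> float:
    """Lower bound: one consensus map plus one map per variability mode."""
    n_volumes = 1 + n_modes
    return n_volumes * volume_bytes(box) / 1024 ** 3

if __name__ == "__main__":
    for box in (500, 400, 300, 250):
        print(f"box {box}: ~{vva_volume_estimate_gb(box):.2f} GB "
              f"for {1 + 3} volumes (volumes only)")
```

Because of the cubic scaling, Fourier-cropping a 500 px box to 250 px cuts per-volume memory by a factor of 8, which is why jobs that fail at 500 px often run comfortably after downsampling.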
The box is 500… I thought about downsampling, but a 500 box size ran fine on a previous dataset with ~175,000 particles, and this dataset only has ~80,000 particles, so I didn’t think the 500 box size would be a problem…
ok thanks. I will try that
But why didn’t it crash with the previous dataset at a 500 box size then? I ran many 3DVA jobs fiddling with the parameters, and the runs never crashed at a 500 box size before…
Also, should I rerun the consensus refinement with the downsampled particles before running 3DVA, or is it safe to feed the downsampled particles directly into the 3DVA job?