Memory error again


#1

Hi, I am getting a “memory error” again; I posted a similar error back on Dec. 18th. This time I ran a job with 550 box size images; the 600 box job failed after 0 iterations. The 550 box job ran 9 iterations last night and failed on the 10th. I restarted it with the last calculated map (I had to import it as a 3D volume); this time it ran 5 iterations successfully and then failed again with this error:
Done iteration 5 in 5034.457s. Total time so far 9776.073s

-- Iteration 6

Using Full Dataset (split 14232 in A, 14231 in B)

Using Max Alignment Radius 205.958 (3.088A)

Using dynamic mask.

-- DEV 0 THR 0 NUM 7231 TOTAL 2213.7991 ELAPSED 4429.5787 --

Processed 28463.000 images in 4433.721s.

Computing FSCs…

Done in 282.455s

Optimizing FSC Mask…
Traceback (most recent call last):
  File "cryosparc2_worker/cryosparc2_compute/run.py", line 78, in cryosparc2_compute.run.main
  File "cryosparc2_worker/cryosparc2_compute/jobs/refine/run.py", line 390, in cryosparc2_compute.jobs.refine.run.run_homo_refine
  File "cryosparc2_compute/sigproc.py", line 909, in find_best_fsc
    tight_near_ang=near, tight_far_ang=far, initmask=initmask)
  File "cryosparc2_compute/sigproc.py", line 885, in compute_all_fscs
    radwns, fsc_true, fsc_noisesub = noise_sub_fsc(rMA, rMB, mask, radwn_noisesub_start, radwn_max)
  File "cryosparc2_compute/sigproc.py", line 762, in noise_sub_fsc
    fMBrand = fourier.fft(fourier.ifft(randomphases(fourier.fft(MB), radwn_ns)) * mask)
  File "cryosparc2_compute/sigproc.py", line 757, in randomphases
    return fM * mask + randphases * amp * (~mask)
MemoryError
I am using version 2.4.2 on CentOS 7 (64-bit, 64 GB RAM, M6000 GPU with 24 GB memory, CUDA 8, no SSD).
Could somebody tell me what went wrong? Is it possible to continue a job that failed at some iteration step?
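For context, the line that actually raises the error, return fM * mask + randphases * amp * (~mask), is the phase-randomization step of the noise-substitution FSC test: Fourier coefficients beyond a cutoff wavenumber get random phases while their amplitudes are preserved. Below is a minimal numpy sketch of that operation, not the cryoSPARC implementation; the function name randomize_phases, the toy box size N, and the variable names are illustrative assumptions. Every intermediate here is a full N^3 complex array, which is why this step is so hungry for system RAM at large box sizes.

import numpy as np

def randomize_phases(fM, radwn):
    # Keep Fourier coefficients inside radius radwn; randomize phases outside.
    N = fM.shape[0]
    freqs = np.fft.fftfreq(N) * N                      # integer wavenumbers
    kx, ky, kz = np.meshgrid(freqs, freqs, freqs, indexing='ij')
    keep = np.sqrt(kx**2 + ky**2 + kz**2) <= radwn     # low-resolution shells to preserve
    amp = np.abs(fM)                                   # amplitudes everywhere
    randphases = np.exp(2j * np.pi * np.random.rand(*fM.shape))
    # low-res shells untouched; high-res shells keep their amplitudes but get random phases
    return fM * keep + randphases * amp * (~keep)

# usage sketch: phase-randomize one half-map beyond wavenumber 30
N = 128                 # small box for illustration; the failing jobs used 550 and 600
MB = np.random.rand(N, N, N).astype(np.float32)
fMB_rand = randomize_phases(np.fft.fftn(MB), radwn=30)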


#2

Hi @mbs,

Sorry for the trouble with this - unfortunately the memory error is happening on the CPU (i.e. system RAM is running out). This should not really be happening (your box size is not unreasonably large), but there are several inefficiencies in our FSC computation code that waste memory. We are hoping to fix this (or accelerate it on the GPU), but for now the only solution is to run the job while ensuring that nothing else on the machine is using system RAM (no other cryoSPARC jobs, or jobs of any kind), or to add more RAM to the machine.
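For a rough sense of scale: a single complex-valued 600^3 Fourier volume in double precision is already about 3.2 GB, and the noise-substitution step holds several such intermediates at once, on top of whatever else the refinement keeps resident. A back-of-the-envelope sketch follows; the count of six simultaneous volumes is an illustrative assumption, not a figure taken from the cryoSPARC source.

# Rough peak-RAM estimate for the noise-substitution FSC step, assuming
# complex128 volumes (16 bytes per voxel); n_arrays is an assumed count of
# simultaneously live full-size intermediates.
def fsc_step_ram_gb(box_size, n_arrays=6, bytes_per_voxel=16):
    return box_size**3 * bytes_per_voxel * n_arrays / 1024**3

for box in (550, 600):
    print(f"box {box}^3: ~{fsc_step_ram_gb(box):.1f} GB for 6 complex volumes")
# box 550^3: ~14.9 GB for 6 complex volumes
# box 600^3: ~19.3 GB for 6 complex volumes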

Sorry!


#3

Hi @apunjani,
Thank you for the reply. I understand a little better now what is happening when the jobs fail. I used a different machine with more memory and could get further with the refinement, although not much. Could you give me an idea of how much memory you think I would need for a 600^3 refinement, for both uniform and non-uniform refinement? So far it looks like all the jobs fail at the FSC step. Also, no other jobs were running at the time on the machines I have used so far; I was the only user.
Thank you,
Michael