Micrograph denoising fails with low numbers of frames

Hi CryoSPARC team,

If "Output denoiser training data" is enabled during Patch Motion Correction on movies with a small number of frames (e.g. EMPIAR-10031), the job fails (nearly) silently: the only error in the job log is an unhelpful "Child process failed". dmesg contains:

[697553.165906] NVRM: sysmemConstruct_IMPL: *** Cannot allocate sysmem through fb heap
[697553.165922] NVRM: nvAssertOkFailedNoLog: Assertion failed: Out of memory [NV_ERR_NO_MEMORY] (0x00000051) returned from pRmApi->Alloc(pRmApi, device->session->handle, isSystemMemory ? device->handle : device->subhandle, &physHandle, isSystemMemory ? NV01_MEMORY_SYSTEM : NV01_MEMORY_LOCAL_USER, &memAllocParams, sizeof(memAllocParams)) @ nv_gpu_ops.c:4647
[697556.775391] NVRM: nvCheckOkFailedNoLog: Check failed: Out of memory [NV_ERR_NO_MEMORY] (0x00000051) returned from _memdescAllocInternal(pMemDesc) @ mem_desc.c:1353
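
For anyone checking whether a similar failure lines up with driver-side memory errors, here is a minimal sketch that pulls the NVRM out-of-memory lines out of saved `dmesg` text. The function name `find_nvrm_oom` and the regular expression are illustrative assumptions based only on the three lines quoted above; other driver versions may format these messages differently.

```python
import re

# Assumed line shape, taken from the lines quoted in this post:
# [697553.165922] NVRM: <function>: ... Out of memory [NV_ERR_NO_MEMORY] ...
NVRM_LINE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM:\s+(?P<msg>.*)$")


def find_nvrm_oom(dmesg_text):
    """Return (timestamp_seconds, message) for NVRM lines that mention NV_ERR_NO_MEMORY."""
    hits = []
    for line in dmesg_text.splitlines():
        match = NVRM_LINE.match(line.strip())
        # Keep only NVRM lines that carry the out-of-memory status code.
        if match and "NV_ERR_NO_MEMORY" in match.group("msg"):
            hits.append((float(match.group("ts")), match.group("msg")))
    return hits
```

The text can come from `dmesg` output saved to a file; on some systems reading the kernel log requires root.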

As soon as denoising output is disabled, Patch Motion Correction completes without issue.

I have not yet tested how few frames it takes to trigger this failure.

@rbs_sci Please can you email us the job report for the failed job and post the output of the command nvidia-smi on the worker where the job ran.
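
Alongside the plain `nvidia-smi` output requested above, a compact per-GPU memory summary can be handy when comparing workers. The following is only a sketch: the helper names `parse_gpu_csv` and `query_gpus` are made up for illustration, and it assumes `nvidia-smi` is on the `PATH` of the worker and supports the `--query-gpu` and `--format=csv,noheader` options.

```python
import csv
import io
import subprocess

# Query fields passed to nvidia-smi; the choice of fields is an assumption for this sketch.
FIELDS = ["name", "driver_version", "memory.total", "memory.used"]


def parse_gpu_csv(text):
    """Turn `nvidia-smi --query-gpu=... --format=csv,noheader` output into one dict per GPU."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    # Skip empty rows produced by blank lines.
    return [dict(zip(FIELDS, row)) for row in reader if row]


def query_gpus():
    """Run nvidia-smi on this machine and return the parsed per-GPU rows."""
    cmd = ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling `query_gpus()` on the worker where the job ran returns a list with one entry per GPU. It is not a substitute for the full `nvidia-smi` output, which also lists running processes.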

Urgh. Now I’m very confused.

Yesterday I cleared and rebuilt the failing job, and it ran successfully without denoiser training data. This morning, however, a newly created Patch Motion Correction job runs successfully while generating denoiser training data, with low memory mode both enabled and disabled.

Sorry, it seems something got muddled up during the “clear/re-queue”.

Thanks @rbs_sci for the update. Please let us know if/when you encounter this issue again.