cryoSPARC not requesting enough memory for job

On cryoSPARC 3.3.1, using SLURM. I submitted an extraction job at 512 box size with 2x A40 GPUs (48 GB VRAM each) to a node, and cryoSPARC set the batch size to 1024. The job hung and gave "Error 32: Broken Pipe" after a couple of micrographs.

I checked the job status with htop and noticed swap was full. I doubled the memory limit by changing `#SBATCH --mem={{ (ram_gb*1000)|int }}M` to `*2000` and got a few more micrographs in before hitting the same error. Doubling it again to `*4000`, the job ran successfully.
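For reference, the only change was to the `--mem` line of our `cluster_script.sh`. Roughly, it now looks like this (the other `#SBATCH` lines and Jinja variables are illustrative placeholders from a typical cryoSPARC cluster template, not our exact script):

```bash
#!/usr/bin/env bash
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --cpus-per-task={{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
# Workaround: request 4x the RAM cryoSPARC estimates (ram_gb is in GB, --mem is in MB)
#SBATCH --mem={{ (ram_gb*4000)|int }}M

{{ run_cmd }}
```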

It’d be nice if ram_gb were exposed in advanced options or something, or maybe let me set the batch size? The workaround unfortunately applies to every cryoSPARC job submitted through this cluster lane, and inflating the memory request across the board could cause issues for other, more memory-intensive jobs (e.g., requests that no longer fit on a node).

It doesn’t have to. May I recommend specifying a separate “highmem” cluster lane (after modifying the "name" value in cluster_info.json), so that only jobs sent to that lane get the inflated memory request?
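A minimal sketch of what that could look like (paths and command templates below are placeholders; copy the values from your working cluster_info.json, change only "name", and pair it with a cluster_script.sh that uses the larger memory multiplier):

```json
{
  "name": "highmem",
  "worker_bin_path": "/path/to/cryosparc_worker/bin/cryosparcw",
  "cache_path": "/scratch/cryosparc_cache",
  "send_cmd_tpl": "{{ command }}",
  "qsub_cmd_tpl": "sbatch {{ script_path_abs }}",
  "qstat_cmd_tpl": "squeue -j {{ cluster_job_id }}",
  "qdel_cmd_tpl": "scancel {{ cluster_job_id }}",
  "qinfo_cmd_tpl": "sinfo"
}
```

Then register the new lane from the directory containing both files:

```bash
cryosparcm cluster connect
```

Jobs submitted to the existing lane keep the normal memory request, and you point only the memory-hungry jobs at “highmem”.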


Didn’t think of that - thanks.