4.3.0 benchmark out-of-memory on slurm cluster

I tried to run the new benchmark task in a cluster (Slurm) environment and ran into an issue with memory limits.

By default, the job reserves 16 GB of memory and gets killed pretty quickly during the Cross Validation Benchmark step. At the time of the OOM kill, the job is using around 26 GB.

It seems that the job either does not reserve enough memory or reads the available memory of the whole cluster node instead of the memory allocated to the Slurm job.
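To illustrate the second possibility, here is a minimal Python sketch of the difference between node-wide memory and the memory granted to a Slurm job. It is not taken from the benchmark code. The helper names `node_total_mb` and `job_memory_limit_mb` are hypothetical, and it assumes that Slurm exports `SLURM_MEM_PER_NODE` or `SLURM_MEM_PER_CPU` (in MB) into the job environment, which depends on how the job was submitted.

```python
import os


def node_total_mb():
    """Physical memory of the whole node in MB (Linux only)."""
    # This is the figure a process sees if it ignores the scheduler.
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count // (1024 * 1024)


def job_memory_limit_mb(env, node_mb):
    """Memory the job may use in MB, preferring the Slurm allocation.

    Hypothetical helper: falls back to node_mb when no Slurm limit is set.
    """
    per_node = int(env.get("SLURM_MEM_PER_NODE", "0") or 0)
    if per_node > 0:
        return min(per_node, node_mb)

    per_cpu = int(env.get("SLURM_MEM_PER_CPU", "0") or 0)
    cpus = int(env.get("SLURM_CPUS_ON_NODE", "0") or 0)
    if per_cpu > 0 and cpus > 0:
        return min(per_cpu * cpus, node_mb)

    # Not running under Slurm, or no explicit memory request was made.
    return node_mb


if __name__ == "__main__":
    total = node_total_mb()
    limit = job_memory_limit_mb(os.environ, total)
    print(f"node total: {total} MB, usable by this job: {limit} MB")
```

If a program sizes its working set from the node total rather than the job limit, it can exceed a 16 GB reservation on a larger node and be killed, which would match the behaviour described above.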

I know how to tune the submission script to increase the requested memory, but I believe the job should reserve enough memory by default.

Hi @bsobol,

Thanks for reporting. We’ve taken note, and a fix for this will be available in the next release.

This has been resolved in v4.4, released today.
