Some jobs are being queued to the head node with the Slurm scheduler

Hello!

My lab has been successfully using cryoSPARC on an HPC cluster with a Slurm scheduler for a few months now. Recently, our university changed some rules for requesting memory and so on, and it came to our attention that several job types were being queued directly to the head node rather than to a worker node. This looks like intentional behavior for things like particle curation; however, it has been a bit of a problem for us with jobs like symmetry expansion on larger particle stacks. I’m trying to figure out if there’s a way to direct cryoSPARC to send these jobs to a worker node rather than leaving them on the head node/VM that’s running cryoSPARC. Our cluster has a light/short-term partition that is intended for this type of job as well.

Thank you,
Russell McFarland

Our cluster_info and cluster_script are as follows:

    "send_cmd_tpl": "{{ command }}",
    "qsub_cmd_tpl": "sbatch {{ script_path_abs }}",
    "qstat_cmd_tpl": "squeue -j {{ cluster_job_id }}",
    "qdel_cmd_tpl": "scancel {{ cluster_job_id }}",
    "qinfo_cmd_tpl": "sinfo --format='%.8N %.6D %.10P %.6T %.14C %.5c %.6z %.7m %.7G %.9d %20E'",
    "cache_path": "/home/exacloud/gscratch/lab/cryosparc_cache",
    "cache_quota_mb": 1000000,
    "cache_reserve_mb": 10000
    #!/bin/bash
    #SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
    #SBATCH --partition=gpu
    #SBATCH --account=reichowlab
    #SBATCH --output={{ job_log_path_abs }}
    #SBATCH --error={{ job_log_path_abs }}
    #SBATCH -N 1
    #SBATCH --qos=normal
    #SBATCH --mem=50G
    #SBATCH -n {{num_cpu}}
    #SBATCH --error={{ job_dir_abs }}/error.txt
    #SBATCH --gres=gpu:{{num_gpu}}
    #SBATCH --time=7-0

    {{ run_cmd }}
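For reference, this pair of files is registered as a scheduler lane using cryoSPARC’s cluster connect workflow; a minimal sketch, where the directory path is a placeholder for wherever the two files actually live:

    # Re-register the lane after editing cluster_info.json / cluster_script.sh.
    # /path/to/cluster_config is a placeholder, not our actual path.
    cd /path/to/cluster_config
    cryosparcm cluster connect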

Hi @RussellM,

Do you mean the login node rather than the head node? Running jobs on the same node that runs slurmctld is not commonly done, in my experience.

This seems more like a discussion you should be having with your HPC sysadmin. Your QoS, account associations, or partition routing might have changed.

That said, if you want to force Slurm to run a job on a specific node, try adding #SBATCH -w <NODENAME> to your script template. It is not an optimal solution, though, and it can fail if an incompatible partition or QoS is selected.
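For illustration, a minimal sketch of where that directive could sit in the template above; the node name is a placeholder, not a real node on your cluster:

    #!/bin/bash
    #SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
    #SBATCH --partition=gpu
    #SBATCH -w gpu-worker-01           # placeholder: pins the job to this specific node
    #SBATCH -n {{ num_cpu }}
    #SBATCH --gres=gpu:{{ num_gpu }}
    #SBATCH --time=7-0

    {{ run_cmd }}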

Cheers.

I’m a little naive about the setup and HPC in general, so I’ll try to clarify to the best of my understanding. The configuration I’m using has the slurmctld process running on a VM on our cluster. This was decided by the HPC sysadmin; I don’t really know the reasoning or the benefits, but by and large the configuration has worked for me.

Most jobs are queued up for GPU resources, as intended. However, a few job types that are more ‘clerical’ in nature, such as subset selection and symmetry expansion, are not submitted to the queue and instead run on the VM that has slurmctld running. This has been a problem specifically for symmetry expansion jobs, where I have run out of memory and the job crashed (high-order symmetries such as D6 and O with high particle counts, 100k–1M+). Currently my workaround is to divide the particle stacks and run them in multiple jobs, but the sysadmin has asked me to see if there is a better solution.

– Russell

As far as I understand, not all jobs in cryoSPARC are submitted to queues (or partitions, in the case of SLURM) through cryoSPARC lanes. On the cryoSPARC updates webpage there is the following information:

Updated the job queuing interface to clarify which nodes certain jobs are launched on. Interactive jobs are always launched on the master node. Import jobs are launched on the master node unless CRYOSPARC_DISABLE_IMPORT_ON_MASTER is set to true, in which case import jobs can run on any lane/node selected. All other job types launch on a lane/node the user selects.

Therefore you should check whether setting the CRYOSPARC_DISABLE_IMPORT_ON_MASTER variable is going to help. I thought there was another variable that disables all non-interactive jobs on the master and forces them to be queued to SLURM partitions, but now I cannot find it.
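If it helps, a minimal sketch of how that variable is typically enabled, assuming a standard master installation (the install path is a placeholder):

    # Add the flag to the master's config.sh and restart cryoSPARC so it takes effect.
    # /path/to/cryosparc_master is a placeholder for the actual install directory.
    echo 'export CRYOSPARC_DISABLE_IMPORT_ON_MASTER=true' >> /path/to/cryosparc_master/config.sh
    cryosparcm restart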

On our HPC resources at Cyfronet (Main page – Komputery Dużej Mocy w ACK CYFRONET AGH) we have a special partition on the cluster in which our cryoSPARC users can run long jobs (up to 14 days) for the cryoSPARC master and schedule other SLURM jobs from it. This gives users the possibility to submit the cryoSPARC master process with different resource requirements (up to a whole node) depending on their needs. Maybe it could also work in your HPC system?
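As a rough illustration of that setup, here is a sketch of a batch script that keeps the cryoSPARC master running inside a SLURM allocation; the partition name, resources, and shutdown handling are assumptions, not our exact configuration:

    #!/bin/bash
    #SBATCH --job-name=cryosparc_master
    #SBATCH --partition=long            # placeholder: a long-running (e.g. 14-day) partition
    #SBATCH --time=14-0
    #SBATCH -N 1
    #SBATCH -c 8
    #SBATCH --mem=64G

    # Stop the master cleanly when the allocation ends, then start it and hold the job open.
    # Interactive/master-side jobs then run on this compute node instead of a login node,
    # while worker jobs are still submitted to other partitions via sbatch.
    trap 'cryosparcm stop' EXIT
    cryosparcm start
    sleep infinity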

Thank you, I’ll definitely look into this and see whether something comparable can be done for symmetry expansion and similar jobs.

– Russell