Hi all,
Our SLURM configuration uses a Lua job submit script (loaded by SLURM as a job_submit plugin) that sometimes writes messages to stderr. This is standard SLURM functionality, see https://slurm.schedmd.com/job_submit_plugins.html
cryosparc appears to take the last word of the first line of the combined stdout+stderr output as the job ID. A typical submission looks like this:
bigcluster$ sbatch myscript.sh
Hello user, this is some information. <--- on stderr
Submitted batch job 17254227. <--- on stdout
As a result, cryosparc registered the string “information.” as the SLURM job ID.
This leaves the job in a state where it cannot be tracked (there is no valid job ID), and the user cannot kill or otherwise reset it.
cryosparcm cli "cli.update_job('Pxxx', 'Jyyy', {'status': 'failed'})"
helped here: it marked the affected jobs as failed, so they could be restarted once we had removed the informational message from the submit output.
It would be nice if only stdout were parsed here; this may be affecting other sites too.
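To illustrate the suggestion, here is a minimal Python sketch of stdout-only parsing. It is not cryosparc's actual code; the names `parse_job_id` and `submit` are hypothetical, and it assumes sbatch prints its usual "Submitted batch job <id>" line on stdout:

```python
import re
import subprocess

# Matches the standard sbatch confirmation line anywhere in the captured text.
JOB_ID_RE = re.compile(r"^Submitted batch job (\d+)", re.MULTILINE)


def parse_job_id(stdout: str) -> str:
    """Return the numeric job ID from sbatch output (hypothetical helper)."""
    match = JOB_ID_RE.search(stdout)
    if match is None:
        raise ValueError(f"no SLURM job ID found in: {stdout!r}")
    return match.group(1)


def submit(script_path: str) -> str:
    """Run sbatch, keeping stdout and stderr separate (hypothetical helper)."""
    result = subprocess.run(
        ["sbatch", script_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # plugin messages land here and are not parsed
        text=True,
        check=True,
    )
    return parse_job_id(result.stdout)
```

Matching on the "Submitted batch job" pattern instead of taking the last word of the first line would also survive the case where both streams are merged. Another option might be `sbatch --parsable`, which prints only the job ID (plus the cluster name, if any) on stdout.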
Edit: our cryosparc version is 3.1
Best,
Erich