Removing Zombie Jobs

Hi all,

During the installation of a new node we did a couple of test runs and finally changed the registration from “cluster” to “internal managed node”. Now we have a couple of zombie jobs that cannot be killed, because killing them relies on a command (scancel) that is no longer available on the node (Slurm is no longer installed there). So the “kill job” button does not work, and we also cannot mark the jobs as complete (the button is greyed out).
Can we somehow get rid of these zombie jobs?

I think this is mostly a problem of our bad installation job :grimacing:

Best
Jan

You may try this for a hypothetical “zombie” job J123 in project P999 under the Linux account that runs the CryoSPARC instance:

  1. ensure the corresponding process on the compute node has been terminated
  2. cryosparcm cli "set_job_status('P999', 'J123', 'killed')"
  3. cryosparcm cli "set_job_status('P999', 'J123', 'completed')"
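
If several jobs are affected, the command from step 2 could be wrapped in a small loop. This is only a sketch: the function name mark_jobs_killed and the DRY_RUN switch are made up for illustration, P999, J123 and J124 are placeholders, and it assumes cryosparcm is on the PATH of the Linux account that runs the CryoSPARC instance. By default it only prints the commands so they can be checked first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mark several zombie jobs as killed in one go.
# The project and job IDs used below are placeholders; replace them with your own.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.

mark_jobs_killed() {
    local project="$1"
    shift
    local job call
    for job in "$@"; do
        # Same call as in step 2 above, with project and job filled in.
        call="set_job_status('${project}', '${job}', 'killed')"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "cryosparcm cli \"${call}\""
        else
            cryosparcm cli "${call}"
        fi
    done
}

# Dry run: prints one cryosparcm command per job.
mark_jobs_killed P999 J123 J124
```

Once the printed commands look right, setting DRY_RUN=0 in front of the call would execute them. As in step 1, make sure the corresponding processes on the compute node have been terminated first.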

Dear wtempel,

this is exactly what I needed.
Do I really need to kill AND complete the jobs?
After killing them they are gone from the statistics…

Best and thanks
Jan

You don’t have to, unless you are interested in the given jobs’ preliminary outputs.