How can I modify the status of jobs that have been deleted on the web interface but are still recorded as running in the database?

Hello, I have two jobs that no longer appear in the web interface. However, when I query the MongoDB database, both jobs are still present and their status is "running". I would like to either delete these two entries from the database or change their status, but the operation fails with the error below. The nodes on which these two jobs ran have since been taken offline from the cluster. Is there any way to resolve this?

[cryosparc@alogin05 CS-hh-ad]$ cryosparcm cli "delete_job(project_uid='P192', job_uid='J247', nofail=True, force=False)"
*** (http://alogin05:39002, code 400) Encountered ServerError from JSONRPC function "delete_job" with params {'project_uid': 'P192', 'job_uid': 'J247', 'nofail': True, 'force': False}:
ServerError: Command '['ssh', 'cryosparc@agpu63', 'kill', '76833']' returned non-zero exit status 255.
Traceback (most recent call last):
  File "/Share2/cryosparc/.software/cryosparc_master/cryosparc_command/commandcommon.py", line 196, in wrapper
    res = func(*args, **kwargs)
  File "/Share2/cryosparc/.software/cryosparc_master/cryosparc_command/commandcommon.py", line 265, in wrapper
    return func(*args, **kwargs)
  File "/Share2/cryosparc/.software/cryosparc_master/cryosparc_command/command_core/__init__.py", line 3051, in delete_job
    kill_job(project_uid, job_uid)
  File "/Share2/cryosparc/.software/cryosparc_master/cryosparc_command/commandcommon.py", line 187, in wrapper
    return func(*args, **kwargs)
  File "/Share2/cryosparc/.software/cryosparc_master/cryosparc_command/commandcommon.py", line 233, in wrapper
    return func(*args, **kwargs)
  File "/Share2/cryosparc/.software/cryosparc_master/cryosparc_command/command_core/__init__.py", line 3010, in kill_job
    kill_remote_pid(ssh_str, job_doc['PID_monitor'])
  File "/Share2/cryosparc/.software/cryosparc_master/cryosparc_command/command_core/__init__.py", line 2948, in kill_remote_pid
    subprocess.check_output(['ssh', ssh_str, 'kill', str(pid)], stderr=subprocess.STDOUT, timeout=JOB_LAUNCH_TIMEOUT_SECONDS).decode()
  File "/Share2/cryosparc/.software/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/Share2/cryosparc/.software/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ssh', 'cryosparc@agpu63', 'kill', '76833']' returned non-zero exit status 255.
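
A note on the error itself: exit status 255 is the value ssh returns when ssh itself fails (for example, when the host cannot be reached), as opposed to passing through the exit status of the remote kill command. That would be consistent with agpu63 having been taken offline. A minimal sketch of how a wrapper script could tell these cases apart is below. The helper names classify_remote_kill and try_remote_kill are hypothetical, written only for illustration, and are not part of CryoSPARC.

```python
import subprocess

# ssh uses 255 for its own errors (e.g. connection failure); otherwise it
# returns the exit status of the remote command.
SSH_ERROR_EXIT_STATUS = 255


def classify_remote_kill(returncode: int) -> str:
    """Interpret the exit status of `ssh <target> kill <pid>` (hypothetical helper)."""
    if returncode == 0:
        return "killed"
    if returncode == SSH_ERROR_EXIT_STATUS:
        # Most likely ssh could not connect, so the kill was never attempted.
        return "host_unreachable"
    # ssh connected, but the remote kill failed (e.g. no such process).
    return "kill_failed"


def try_remote_kill(ssh_target: str, pid: int, timeout: float = 10.0) -> str:
    """Run the remote kill without raising on a non-zero exit status."""
    try:
        proc = subprocess.run(
            ["ssh", ssh_target, "kill", str(pid)],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Treat a hung connection attempt like an unreachable host.
        return "host_unreachable"
    return classify_remote_kill(proc.returncode)
```

For the failure shown above, classify_remote_kill(255) gives "host_unreachable". One caveat: a remote command could in principle also exit with 255, so this is a heuristic rather than a guarantee.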

@zhenyuanliu Please can you let us know

  • the CryoSPARC version number
  • whether the jobs were deleted using the Delete Job UI action
  • how you initially noticed that the jobs were still in the running state

Hello. The CryoSPARC version is v4.6.2. I cannot say for certain whether the jobs were deleted through the UI, but on our cluster regular users should only be able to operate through the UI. Our cluster periodically collects user usage statistics, and we noticed these jobs during one such collection because they appeared to have been running for a very long time.
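
For reference, the kind of check that surfaced these jobs can be sketched roughly as follows. This is a simplified illustration and not our actual statistics script: the job documents are plain dictionaries rather than a live MongoDB query, and the field names status, started_at, project_uid, and uid are assumptions about the document layout.

```python
from datetime import datetime, timedelta, timezone


def find_stale_running_jobs(job_docs, now, max_age):
    """Return (project_uid, uid) pairs for jobs marked running longer than max_age.

    job_docs: iterable of dicts with assumed keys
              'status', 'started_at', 'project_uid', 'uid'.
    """
    stale = []
    for doc in job_docs:
        if doc.get("status") != "running":
            continue
        started = doc.get("started_at")
        # Skip documents without a start time instead of guessing one.
        if started is not None and now - started > max_age:
            stale.append((doc["project_uid"], doc["uid"]))
    return stale


# Example with made-up documents (the timestamps are illustrative only).
now = datetime(2025, 1, 31, tzinfo=timezone.utc)
docs = [
    {"project_uid": "P192", "uid": "J247", "status": "running",
     "started_at": datetime(2024, 12, 1, tzinfo=timezone.utc)},
    {"project_uid": "P192", "uid": "J300", "status": "running",
     "started_at": datetime(2025, 1, 30, tzinfo=timezone.utc)},
    {"project_uid": "P192", "uid": "J301", "status": "completed",
     "started_at": datetime(2024, 11, 1, tzinfo=timezone.utc)},
]
stale_jobs = find_stale_running_jobs(docs, now, max_age=timedelta(days=14))
```

With a 14-day threshold, only the first document is flagged, since the second started one day earlier and the third is not in the running state.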