Live worker job still running after pausing session

I currently have a Live worker job that still shows as “running”, even though the Live session is both “paused” and “completed”. It’s blocking a GPU, and it would be nice to be able to kill it directly from the GUI.

Perhaps there could be a warning when killing a process might mess up a project/session. Alternatively, the option could be made available only to admins (who might still benefit from a warning).

Could you please post the output of these commands:

csprojectid="P99" # substitute actual project ID
csjobid="J999" # substitute actual job ID of Live Worker job
cryosparcm cli "get_job('$csprojectid', '$csjobid', 'type', 'version', 'status', 'parents', 'created_at', 'started_at')"

Was this job created before the update to CryoSPARC v4.7.0?

{'_id': '6819d1c882b2a9d4423675ae', 'created_at': 'Tue, 06 May 2025 09:09:28 GMT', 'parents': ['J1'], 'project_uid': 'P115', 'started_at': 'Tue, 06 May 2025 09:09:31 GMT', 'status': 'failed', 'type': 'rtp_worker', 'uid': 'J11', 'version': 'v4.7.0'}

I had run ‘cryosparcm restart’ prior to this, after which the worker changed status to failed and released the GPU, so I’m not sure the output is useful. If I see it again I can run the commands right away. The project was created in v4.7.0.

We’ve had a related problem with Live Workers hanging, this time because a user’s storage quota filled up and caused the session to stall. This has left the workers in the “launched” state, though they are not processing anything and don’t seem to have an active process on the GPUs.

Output from the command:
{'_id': '68220435353792a5e90e30c1', 'created_at': 'Mon, 12 May 2025 14:22:45 GMT', 'parents': ['J1'], 'project_uid': 'P710', 'started_at': None, 'status': 'launched', 'type': 'rtp_worker', 'uid': 'J14', 'version': 'v4.7.0'}
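
For anyone wanting to double-check that a stuck worker really has no process on the GPUs, here is a minimal sketch using `nvidia-smi`. The helper names `list_gpu_procs` and `filter_cryosparc` are made up for illustration, and matching on the string "cryosparc" assumes the worker's Python interpreter lives under a path containing that name, which may not hold for every install.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list compute processes currently holding GPU memory.
list_gpu_procs() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found on this node" >&2
    return 1
  fi
  nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader
}

# Hypothetical helper: keep only rows whose process path mentions "cryosparc".
# ASSUMPTION: the worker's interpreter path contains that string.
filter_cryosparc() {
  grep -i 'cryosparc' || true   # no match is not an error here
}
```

Running `list_gpu_procs | filter_cryosparc` on each worker node should print nothing if no CryoSPARC process is holding a GPU.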

I’ve tried stopping and starting CryoSPARC with cryosparcm, which did not resolve the issue. Restarting the server also fails to clear the stuck jobs. This blocks us from using the nodes for anything else in CryoSPARC.

As a workaround, I followed the post “Killing or removing a project - #4 by wtempel” to set the worker jobs to failed, which allowed me to use CryoSPARC for other things again.
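
A rough sketch of that workaround for several jobs at once. It assumes the CLI function involved is `set_job_status(project_uid, job_uid, status)`, which is not confirmed anywhere in this thread, so please check the linked post before running anything. The helper names and the `APPLY` switch are hypothetical. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# ASSUMPTION: set_job_status(project_uid, job_uid, status) is the CLI call
# from the linked post; verify against that post before applying.
build_set_failed_call() {
  local project="$1" job="$2"
  printf "set_job_status('%s', '%s', 'failed')" "$project" "$job"
}

# Hypothetical wrapper: prints the commands unless APPLY=1 is set.
mark_jobs_failed() {
  local project="$1"; shift
  local job call
  for job in "$@"; do
    call="$(build_set_failed_call "$project" "$job")"
    if [ "${APPLY:-0}" = "1" ]; then
      cryosparcm cli "$call"
    else
      echo "cryosparcm cli \"$call\""   # dry run: show what would be executed
    fi
  done
}
```

For example, `mark_jobs_failed P710 J14` prints the command for the job above, and `APPLY=1 mark_jobs_failed P710 J14` would execute it. Printing first makes it easy to check the project and job IDs before touching the database.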