I’m having an issue right now where a worker for some reason is still “running”, though the Live session is both “paused” and “completed”. It’s blocking a GPU and it would be nice to be able to kill it directly from the GUI.
Perhaps there could be a warning when killing a process that might mess up a project/session. Alternatively, the option could be made available only to admins (who might still benefit from a warning).
Could you please post the output of the following commands:
csprojectid="P99" # substitute actual project ID
csjobid="J999" # substitute actual job ID of Live Worker job
cryosparcm cli "get_job('$csprojectid', '$csjobid', 'type', 'version', 'status', 'parents', 'created_at', 'started_at')"
Was this job created before the update to CryoSPARC v4.7.0?
{'_id': '6819d1c882b2a9d4423675ae', 'created_at': 'Tue, 06 May 2025 09:09:28 GMT', 'parents': ['J1'], 'project_uid': 'P115', 'started_at': 'Tue, 06 May 2025 09:09:31 GMT', 'status': 'failed', 'type': 'rtp_worker', 'uid': 'J11', 'version': 'v4.7.0'}
I had run ‘cryosparcm restart’ prior to this, after which the worker changed status to failed and released the GPU, so I’m not sure the output is useful. If I see it again I can run the commands right away. The project was created in v4.7.0.
We’ve had a related problem with Live Workers hanging, this time due to a user’s storage quota filling up and causing the session to stall. This has left the workers in the Launched state, though they are not processing anything and don’t seem to have an active process on the GPUs.
Output from the command:
{'_id': '68220435353792a5e90e30c1', 'created_at': 'Mon, 12 May 2025 14:22:45 GMT', 'parents': ['J1'], 'project_uid': 'P710', 'started_at': None, 'status': 'launched', 'type': 'rtp_worker', 'uid': 'J14', 'version': 'v4.7.0'}
I’ve tried stopping and starting CryoSPARC with cryosparcm, which did not resolve the issue. Restarting the server also fails to kill the processes. This blocks us from using the nodes for anything in CryoSPARC.
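In case it helps anyone who wants to capture the state of several stuck workers right away, here is a minimal sketch that wraps the `get_job` query posted earlier in this thread. The function names `build_get_job_query` and `query_job` are made up for illustration, and the project/job pairs are just the two reported above. If `cryosparcm` is not on the PATH, the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the get_job query string used earlier in this thread.
build_get_job_query() {
  local project="$1" job="$2"
  printf "get_job('%s', '%s', 'type', 'version', 'status', 'parents', 'created_at', 'started_at')" \
    "$project" "$job"
}

# Hypothetical helper: run the query if cryosparcm is on PATH,
# otherwise print the command instead (dry run).
query_job() {
  local query
  query="$(build_get_job_query "$1" "$2")"
  if command -v cryosparcm >/dev/null 2>&1; then
    cryosparcm cli "$query"
  else
    echo "cryosparcm cli \"$query\""
  fi
}

# Example: the two Live Worker jobs mentioned in this thread, as "project:job" pairs.
for pair in "P115:J11" "P710:J14"; do
  query_job "${pair%%:*}" "${pair##*:}"
done
```

Substitute your own project and job IDs in the loop. Running it before any restart should preserve the status the worker was actually stuck in.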