Unable to run jobs in a particular project

My project P24 in CryoSPARC had been running fine, but I suddenly started getting an error saying that the cs.lock file could not be found. Although the cs.lock file was still present in the directory, I couldn't run any jobs. Then I ran
cryosparcm cli "take_over_project('P24')"
and now all jobs fail with the following error:


I would really appreciate any ideas on how to fix this problem.

Welcome to the forum @sakshi.
Please post a longer excerpt from the Event Log for additional context, and relevant information you may find under Metadata|Log.

Please post log excerpts and command outputs as text to facilitate text searches of the forum.

What error messages did you see when you could not run any jobs?

Is this the exact command you used? Did it produce an error as is expected if a cs.lock file already exists?

What are the outputs of these commands

ps -eouser,comm | grep supervisord
ls -l $(cryosparcm cli "get_project_dir_abs('P24')")

I am new here and wasn't aware of that.

[CPU: 225.4 MB Avail: 20.88 GB]

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/class2D/run.py", line 54, in cryosparc_compute.jobs.class2D.run.run_class_2D
  File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/particles.py", line 63, in __init__
    self._reset(Data(d._data)) # copy all data
AttributeError: 'NoneType' object has no attribute '_data'

Project P24 Job J46

Master Plato Port 61002

========= monitor process now starting main process at 2023-09-26 23:21:28.223448
MAINPROCESS PID 18473
========= monitor process now waiting for main process
MAIN PID 18473
class2D.run cryosparc_compute.jobs.jobregister


Running job J46 of type class_2D
Running job on hostname %s Plato
Allocated Resources : {'fixed': {'SSD': False}, 'hostname': 'Plato', 'lane': 'default', 'lane_type': 'node', 'license': True, 'licenses_acquired': 1, 'slots': {'CPU': [2, 3], 'GPU': [1], 'RAM': [1, 2, 3]}, 'target': {'cache_path': '/mnt/Storage/scratch', 'cache_quota_mb': 2000000, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 25435111424, 'name': 'NVIDIA GeForce RTX 3090'}, {'id': 1, 'mem': 25438126080, 'name': 'NVIDIA GeForce RTX 3090'}], 'hostname': 'Plato', 'lane': 'default', 'monitor_port': None, 'name': 'Plato', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63], 'GPU': [0, 1], 'RAM': [0, 1, 2, 3]}, 'ssh_str': 'cryosparc@Plato', 'title': 'Worker node Plato', 'type': 'node', 'worker_bin_path': '/home/cryosparc/cryosparc/cryosparc_worker/bin/cryosparcw'}}
**** handle exception rc
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/class2D/run.py", line 54, in cryosparc_compute.jobs.class2D.run.run_class_2D
  File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/particles.py", line 63, in __init__
    self._reset(Data(d._data)) # copy all data
AttributeError: 'NoneType' object has no attribute '_data'
set status to failed
========= main process now complete at 2023-09-26 23:21:32.078083.
========= monitor process now complete at 2023-09-26 23:21:32.081138.

I now have a similar error with Project P22: I can't run any jobs, and the error is similar to the one from P24:
Unable to kill P22 J237: ServerError: validation error: lock file for P22 at /media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607/cs.lock absent or otherwise inaccessible.

Yes, I used exactly this command. It just said 'true'.

-rw-rw-r-- 1 cryosparc cryosparc 76 Sep 26 10:17 cs.lock
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 16:08 J34
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:05 J36
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:06 J37
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 16:07 J38
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:05 J39
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:05 J40
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:06 J41
drwxrwxr-x 4 cryosparc cryosparc 4096 Sep 25 18:07 J42
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:08 J43
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 26 10:12 J44
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 26 10:17 J45
-rw-rw-r-- 1 cryosparc cryosparc 688 Sep 26 10:17 job_manifest.json
-rw-rw-r-- 1 cryosparc cryosparc 2120 Sep 26 18:25 project.json

If you have not yet run take_over_project(), please can you log on as cryosparc and run these commands:

id
ls -l $(cryosparcm cli "get_project_dir_abs('P24')")
stat -f $(cryosparcm cli "get_project_dir_abs('P24')")

and post the outputs.

uid=1002(cryosparc) gid=1002(cryosparc) groups=1002(cryosparc)

-rw-rw-r-- 1 cryosparc cryosparc 76 Sep 26 10:17 cs.lock
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 16:08 J34
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:05 J36
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:06 J37
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 16:07 J38
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:05 J39
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:05 J40
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:06 J41
drwxrwxr-x 4 cryosparc cryosparc 4096 Sep 25 18:07 J42
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 25 18:08 J43
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 26 10:12 J44
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 26 10:17 J45
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 26 23:21 J46
drwxrwxr-x 3 cryosparc cryosparc 4096 Sep 27 09:22 J47
-rw-rw-r-- 1 cryosparc cryosparc 718 Sep 27 09:22 job_manifest.json
-rw-rw-r-- 1 cryosparc cryosparc 2120 Sep 26 18:25 project.json

File: "/media/lutz/df0ef1d9-9b1d-430d-a9d7-445b661427461/20230907_ATP_MgCl2/Processing/CS-xx"
ID: e3bfe859257f0c9c Namelen: 255 Type: ext2/ext3
Block size: 4096 Fundamental block size: 4096
Blocks: Total: 470754772 Free: 87042949 Available: 63111468
Inodes: Total: 119644160 Free: 117561118

Please accept my apologies @sakshi. I meant to ask for these commands with respect to the P22 project.
If you have not yet run take_over_project('P22'), please can you run these commands and post their output.

id
ls -l $(cryosparcm cli "get_project_dir_abs('P22')")
stat -f $(cryosparcm cli "get_project_dir_abs('P22')")

I am interested in the state of cs.lock before the takeover.
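
If it is simpler, the state of the lock file itself can also be checked directly; this is just a combination of the get_project_dir_abs call above with stat:

# prints the file's size and timestamps, or "No such file or directory" if it is absent
stat "$(cryosparcm cli "get_project_dir_abs('P22')")/cs.lock"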

No problem. I was also confused, since both P22 and P24 are currently not working.

uid=1002(cryosparc) gid=1002(cryosparc) groups=1002(cryosparc)

drwxrwxrwx 3 cryosparc cryosparc 4096 Sep 26 17:12 J236
-rwxrwxrwx 1 cryosparc cryosparc 3706 Sep 26 17:12 job_manifest.json
-rwxrwxrwx 1 cryosparc cryosparc 2387 Sep 26 18:25 project.json

File: "/media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607"
ID: e3bfe859257f0c9c Namelen: 255 Type: ext2/ext3
Block size: 4096 Fundamental block size: 4096
Blocks: Total: 470754772 Free: 87041875 Available: 63110394
Inodes: Total: 119644160 Free: 117561108

Thanks @sakshi.
It seems that the P22 directory does not hold the expected cs.lock file. Was

  • cs.lock manually deleted (not recommended)?
  • CS-reprocessing-10kk-krios-20230607/ previously detached from another CryoSPARC instance?

It is possible that the AttributeError: 'NoneType' object has no attribute '_data' is not directly related to the cs.lock issue. Maybe the inputs connected to the 2D classification job were no longer intact. Please can you post the expanded Inputs section for that job?
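
If it is more convenient than a screenshot, a cli query along the following lines may also dump the job's input connections as text; note that the 'input_slot_groups' field name is an assumption on my part and may differ between versions:

# select specific fields of the job document; 'input_slot_groups' is assumed here
cryosparcm cli "get_job('P24', 'J46', 'job_type', 'input_slot_groups')"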

It was not deleted manually, nor was it previously detached. I don't know how, but the error started as soon as I launched this job. Since then, I can neither delete this job nor start a new one.

@sakshi Please can you

  • confirm the job type that corresponds to the Inputs screenshot you posted
  • describe how you attempted to delete this job and how the failure of deletion manifests itself
  • describe what you observed when you attempted to start a new job
  • examine the command_core log for errors
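
For the last item, the command_core log can be viewed on the master node with the built-in log facility:

# stream the command_core service log (Ctrl-C to stop)
cryosparcm log command_core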

It's Extract From Micrographs.

On the right panel: Details > Actions > Kill Job > Yes.
When I do the above, a small pop-up appears on the left panel: 'Unable to kill P22 J237: ServerError: validation error: lock file for P22 at /media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607/cs.lock absent or otherwise inaccessible.'

'Unable to create job: ServerError: validation error: lock file for P22 at /media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607/cs.lock absent or otherwise inaccessible.'

There are errors on almost every line; the last few lines are below. Should I share the full log?
: get_username_by_id(user_id), 'accessed_at' : datetime.datetime.utcnow()}})
2023-09-28 14:05:51,699 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 186, in wrapper
2023-09-28 14:05:51,699 wrapper ERROR | return func(*args, **kwargs)
2023-09-28 14:05:51,699 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 246, in wrapper
2023-09-28 14:05:51,699 wrapper ERROR | assert os.path.isfile(
2023-09-28 14:05:51,699 wrapper ERROR | AssertionError: validation error: lock file for P22 at /media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607/cs.lock absent or otherwise inaccessible.
2023-09-28 14:05:55,901 wrapper ERROR | JSONRPC ERROR at kill_job
2023-09-28 14:05:55,901 wrapper ERROR | Traceback (most recent call last):
2023-09-28 14:05:55,901 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 195, in wrapper
2023-09-28 14:05:55,901 wrapper ERROR | res = func(*args, **kwargs)
2023-09-28 14:05:55,901 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 246, in wrapper
2023-09-28 14:05:55,901 wrapper ERROR | assert os.path.isfile(
2023-09-28 14:05:55,901 wrapper ERROR | AssertionError: validation error: lock file for P22 at /media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607/cs.lock absent or otherwise inaccessible.
2023-09-28 14:07:04,502 wrapper ERROR | JSONRPC ERROR at kill_job
2023-09-28 14:07:04,502 wrapper ERROR | Traceback (most recent call last):
2023-09-28 14:07:04,502 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 195, in wrapper
2023-09-28 14:07:04,502 wrapper ERROR | res = func(*args, **kwargs)
2023-09-28 14:07:04,502 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 246, in wrapper
2023-09-28 14:07:04,502 wrapper ERROR | assert os.path.isfile(
2023-09-28 14:07:04,502 wrapper ERROR | AssertionError: validation error: lock file for P22 at /media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607/cs.lock absent or otherwise inaccessible.
2023-09-28 14:07:37,816 wrapper ERROR | JSONRPC ERROR at create_new_job
2023-09-28 14:07:37,816 wrapper ERROR | Traceback (most recent call last):
2023-09-28 14:07:37,816 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 195, in wrapper
2023-09-28 14:07:37,816 wrapper ERROR | res = func(*args, **kwargs)
2023-09-28 14:07:37,816 wrapper ERROR | File "/home/cryosparc/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 246, in wrapper
2023-09-28 14:07:37,816 wrapper ERROR | assert os.path.isfile(
2023-09-28 14:07:37,816 wrapper ERROR | AssertionError: validation error: lock file for P22 at /media/lutz/8a07c7d2-8d4c-4e40-9567-0df55af93979/20230607_10KK_ERC3/Processing/P22_Cryosparc_reprocessing/CS-reprocessing-10kk-krios-20230607/cs.lock absent or otherwise inaccessible.

@sakshi Thank you for posting this information. The errors are expected if cs.lock is missing, which we confirmed. The open questions are:

  1. Why is cs.lock missing? Was this project started when the CryoSPARC instance was at an older version (<4)? If not, this question requires a local investigation.

  2. Why are jobs failing once cs.lock has been added? My question pertained to P24 J46, which, I believe, is a 2D classification job.
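
As a quick check related to question 1, the currently installed version is shown in the cryosparcm status summary (the cs.lock question above only applies to v4+ instances):

# print the installed CryoSPARC version from the status summary
cryosparcm status | grep -i version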

The project was started on the current latest version. I don't know the reason, but this is already the third time a cs.lock file has gone missing.
How can we investigate this and make sure it doesn't happen again?

Yes, it is a 2D classification job.


I have tried different jobs with different inputs and all of them fail with the same error message.

That depends on your local circumstances; it is tough to give good advice on this topic remotely. If the filesystem that holds the project directory has snapshots, you could find out when cs.lock gets deleted, and once you know when, you may be able to find out how.
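
If snapshots are not available, one low-tech alternative, assuming the inotify-tools package is installed on the machine that mounts the project directory, is to watch that directory for deletions and record the timestamp:

# log delete/move events (with timestamps) in the P22 project directory until interrupted
inotifywait -m --timefmt '%F %T' --format '%T %w %e %f' -e delete -e moved_from \
    "$(cryosparcm cli "get_project_dir_abs('P22')")"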

The screenshot suggests the presence of a job J28, but a corresponding job directory is missing from the P24 (correct?) directory listing:

This would explain the AttributeError: if the connected inputs refer to a job directory that no longer exists, the particle data cannot be loaded.