After updating to version 4.4.1, I ran into an annoying problem: my queued jobs stay in the queue and do not launch, even after their inputs are ready. I have looked through similar posts on the forum but did not find a fix. I am using an HPC system.
Here is an example from the “Extensive Validation.”
As you can see, J44 won’t launch even though J43 finished long ago. When I ran the command “cryosparcm joblog P4 J44”, the error message was “/home/zhuj6/cryoEM/Test/CS-t20s/J44/job.log: No such file or directory”. Here is the file list in the J44 directory.
events.bson gridfs_data job.json
No submission script was created under J44. I didn’t have this problem when running version 4.3.x. I have also reinstalled version 4.4.1 once since the update, but the problem remains. I would really appreciate some help here. Thanks!
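For anyone who wants to run the same check on their own job directories, here is a minimal Python sketch. It is my own helper, not part of CryoSPARC, and the function name `missing_launch_files` is made up for illustration. The two file names are taken from this thread: `job.log` is the file that `cryosparcm joblog` complained about, and `queue_sub_script.sh` is the script that gets passed to `sbatch` once a job is actually submitted. Other setups may well use different files.

```python
from pathlib import Path

# File names taken from this thread (assumption: your cluster lane uses the same ones).
EXPECTED_FILES = ("job.log", "queue_sub_script.sh")


def missing_launch_files(job_dir, expected=EXPECTED_FILES):
    """Return the expected file names that do not exist inside job_dir."""
    job_dir = Path(job_dir)
    return [name for name in expected if not (job_dir / name).exists()]
```

Calling `missing_launch_files("/home/zhuj6/cryoEM/Test/CS-t20s/J44")` on the directory listed above would report both files as missing, which matches a job that was queued but never handed to the cluster.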
Here are the last lines from the “command_core” log.
2023-12-26 09:49:41,508 dump_job_database INFO | Request to export P4 J43
2023-12-26 09:49:41,515 dump_job_database INFO | Exporting job to /home/zhuj6/cryoEM/Test/CS-t20s/J43
2023-12-26 09:49:41,516 dump_job_database INFO | Exporting all of job's images in the database to /home/zhuj6/cryoEM/Test/CS-t20s/J43/gridfs_data...
2023-12-26 09:49:41,517 dump_job_database INFO | Done. Exported 0 images in 0.00s
2023-12-26 09:49:41,517 dump_job_database INFO | Exporting all job's streamlog events...
2023-12-26 09:49:41,522 dump_job_database INFO | Done. Exported 1 files in 0.00s
2023-12-26 09:49:41,522 dump_job_database INFO | Exporting job metafile...
2023-12-26 09:49:41,527 dump_job_database INFO | Done. Exported in 0.01s
2023-12-26 09:49:41,527 dump_job_database INFO | Updating job manifest...
2023-12-26 09:49:41,532 dump_job_database INFO | Done. Updated in 0.00s
2023-12-26 09:49:41,532 dump_job_database INFO | Exported P4 J43 in 0.02s
2023-12-26 09:49:41,533 run INFO | Completed task in 0.024985074996948242 seconds
2023-12-26 09:49:52,085 run INFO | Received task dump_job_database with 2 args and 0 kwargs
2023-12-26 09:49:52,086 dump_job_database INFO | Request to export P4 J44
2023-12-26 09:49:52,091 dump_job_database INFO | Exporting job to /home/zhuj6/cryoEM/Test/CS-t20s/J44
2023-12-26 09:49:52,092 dump_job_database INFO | Exporting all of job's images in the database to /home/zhuj6/cryoEM/Test/CS-t20s/J44/gridfs_data...
2023-12-26 09:49:52,093 dump_job_database INFO | Done. Exported 0 images in 0.00s
2023-12-26 09:49:52,093 dump_job_database INFO | Exporting all job's streamlog events...
2023-12-26 09:49:52,098 dump_job_database INFO | Done. Exported 1 files in 0.00s
2023-12-26 09:49:52,098 dump_job_database INFO | Exporting job metafile...
2023-12-26 09:49:52,105 dump_job_database INFO | Done. Exported in 0.01s
2023-12-26 09:49:52,106 dump_job_database INFO | Updating job manifest...
2023-12-26 09:49:52,111 dump_job_database INFO | Done. Updated in 0.01s
2023-12-26 09:49:52,111 dump_job_database INFO | Exported P4 J44 in 0.03s
2023-12-26 09:49:52,112 run INFO | Completed task in 0.026586294174194336 seconds
2023-12-26 09:50:07,360 scheduler_run_core INFO | Running...
2023-12-26 09:50:07,360 scheduler_run_core INFO | Jobs Queued: [('P4', 'J43')]
2023-12-26 09:50:07,363 scheduler_run_core INFO | Licenses currently active : 4
2023-12-26 09:50:07,363 scheduler_run_core INFO | Now trying to schedule J43
2023-12-26 09:50:07,364 scheduler_run_core INFO | Scheduling directly onto master node cn1577
2023-12-26 09:50:07,364 scheduler_run_job_master_direct INFO | Scheduling directly onto master node cn1577
2023-12-26 09:50:08,344 scheduler_run_job_master_direct INFO | Not a commercial instance - heartbeat set to 12 hours.
2023-12-26 09:50:08,431 set_job_status INFO | Status changed for P4.J43 from queued to launched
2023-12-26 09:50:08,432 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J43 with body {'projectUid': 'P4', 'jobUid': 'J43'}
2023-12-26 09:50:08,437 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 09:50:08,471 run_job INFO | Running P4 J43
2023-12-26 09:50:08,472 run_job INFO | Running job on master node directly
2023-12-26 09:50:08,473 run_job INFO | Running job using: /vf/users/zhuj6/apps/cryosparc/cryosparc_master/bin/cryosparcm
2023-12-26 09:50:08,491 scheduler_run_core INFO | Finished
2023-12-26 09:50:18,701 set_job_status INFO | Status changed for P4.J43 from launched to started
2023-12-26 09:50:18,702 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J43 with body {'projectUid': 'P4', 'jobUid': 'J43'}
2023-12-26 09:50:18,709 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 09:50:27,537 set_job_status INFO | Status changed for P4.J43 from started to running
2023-12-26 09:50:27,539 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J43 with body {'projectUid': 'P4', 'jobUid': 'J43'}
2023-12-26 09:50:27,544 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 09:50:27,740 dump_project INFO | Exporting project P4
2023-12-26 09:50:27,747 dump_project INFO | Exported project P4 to /home/zhuj6/cryoEM/Test/CS-t20s/project.json in 0.01s
2023-12-26 09:51:02,314 dump_job_database INFO | Request to export P4 J43
2023-12-26 09:51:02,316 dump_job_database INFO | Exporting job to /home/zhuj6/cryoEM/Test/CS-t20s/J43
2023-12-26 09:51:02,317 dump_job_database INFO | Exporting all of job's images in the database to /home/zhuj6/cryoEM/Test/CS-t20s/J43/gridfs_data...
2023-12-26 09:51:02,341 dump_job_database INFO | Writing 7 database images to /home/zhuj6/cryoEM/Test/CS-t20s/J43/gridfs_data/gridfsdata_0
2023-12-26 09:51:02,341 dump_job_database INFO | Done. Exported 7 images in 0.02s
2023-12-26 09:51:02,341 dump_job_database INFO | Exporting all job's streamlog events...
2023-12-26 09:51:02,347 dump_job_database INFO | Done. Exported 1 files in 0.01s
2023-12-26 09:51:02,347 dump_job_database INFO | Exporting job metafile...
2023-12-26 09:51:02,348 dump_job_database INFO | Creating .csg file for imported_movies
2023-12-26 09:51:02,362 dump_job_database INFO | Done. Exported in 0.01s
2023-12-26 09:51:02,362 dump_job_database INFO | Updating job manifest...
2023-12-26 09:51:02,367 dump_job_database INFO | Done. Updated in 0.00s
2023-12-26 09:51:02,367 dump_job_database INFO | Exported P4 J43 in 0.05s
2023-12-26 09:51:02,384 set_job_status INFO | Status changed for P4.J43 from running to completed
2023-12-26 09:51:02,387 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J43 with body {'projectUid': 'P4', 'jobUid': 'J43'}
2023-12-26 09:51:02,392 app_stats_refresh INFO | code 200, text {"success":true}
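To see when the scheduler actually looked at the queue, and what it saw, the “Jobs Queued” lines can be pulled out of the log. Below is a rough Python sketch of that. The regular expression is my assumption based only on the line layout shown above, and `scheduler_passes` is a hypothetical helper name, not a CryoSPARC function.

```python
import ast
import re
from datetime import datetime

# Matches lines such as:
# 2023-12-26 09:50:07,360 scheduler_run_core INFO | Jobs Queued: [('P4', 'J43')]
QUEUED_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+scheduler_run_core\s+INFO\s+\|\s+"
    r"Jobs Queued: (.*)$"
)


def scheduler_passes(lines):
    """Return (timestamp, queued_jobs) for every scheduler pass found in lines."""
    passes = []
    for line in lines:
        match = QUEUED_RE.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            # The queue is logged as a Python list literal of (project, job) tuples.
            queued = ast.literal_eval(match.group(2))
            passes.append((when, queued))
    return passes
```

Run over the excerpt above, this finds a single pass at 09:50:07 with only `('P4', 'J43')` in the queue, and no further pass after J43 completed at 09:51:02.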
Here is a new development. J44 sat in the queue for a few hours without doing anything. Then, while I was working on another project and submitted a job there, J44 was submitted at the same time. Here is the continued log.
2023-12-26 12:14:46,761 run INFO | Received task dump_job_database with 2 args and 0 kwargs
2023-12-26 12:14:46,761 dump_job_database INFO | Request to export P13 J9
2023-12-26 12:14:46,770 dump_job_database INFO | Exporting job to /home/zhuj6/cryoEM/SRRP/CS-srrp/J9
2023-12-26 12:14:46,779 dump_job_database INFO | Exporting all of job's images in the database to /home/zhuj6/cryoEM/SRRP/CS-srrp/J9/gridfs_data...
2023-12-26 12:14:46,788 dump_job_database INFO | Done. Exported 0 images in 0.01s
2023-12-26 12:14:46,788 dump_job_database INFO | Exporting all job's streamlog events...
2023-12-26 12:14:46,795 dump_job_database INFO | Done. Exported 1 files in 0.01s
2023-12-26 12:14:46,795 dump_job_database INFO | Exporting job metafile...
2023-12-26 12:14:46,807 dump_job_database INFO | Done. Exported in 0.01s
2023-12-26 12:14:46,808 dump_job_database INFO | Updating job manifest...
2023-12-26 12:14:46,819 dump_job_database INFO | Done. Updated in 0.01s
2023-12-26 12:14:46,819 dump_job_database INFO | Exported P13 J9 in 0.06s
2023-12-26 12:14:46,820 run INFO | Completed task in 0.059456825256347656 seconds
2023-12-26 12:14:57,341 param_set_spec_value INFO | Setting parameter J9.diameter_max with value 150 of python type <class 'int'> and param type number
2023-12-26 12:14:57,344 run INFO | Received task dump_job_database with 2 args and 0 kwargs
2023-12-26 12:14:57,345 dump_job_database INFO | Request to export P13 J9
2023-12-26 12:14:57,348 dump_job_database INFO | Exporting job to /home/zhuj6/cryoEM/SRRP/CS-srrp/J9
2023-12-26 12:14:57,349 dump_job_database INFO | Exporting all of job's images in the database to /home/zhuj6/cryoEM/SRRP/CS-srrp/J9/gridfs_data...
2023-12-26 12:14:57,351 dump_job_database INFO | Done. Exported 0 images in 0.00s
2023-12-26 12:14:57,351 dump_job_database INFO | Exporting all job's streamlog events...
2023-12-26 12:14:57,357 dump_job_database INFO | Done. Exported 1 files in 0.01s
2023-12-26 12:14:57,357 dump_job_database INFO | Exporting job metafile...
2023-12-26 12:14:57,363 dump_job_database INFO | Done. Exported in 0.01s
2023-12-26 12:14:57,363 dump_job_database INFO | Updating job manifest...
2023-12-26 12:14:57,368 dump_job_database INFO | Done. Updated in 0.00s
2023-12-26 12:14:57,368 dump_job_database INFO | Exported P13 J9 in 0.02s
2023-12-26 12:14:57,369 run INFO | Completed task in 0.024834632873535156 seconds
2023-12-26 12:15:04,296 scheduler_run_core INFO | Running...
2023-12-26 12:15:04,296 scheduler_run_core INFO | Jobs Queued: [('P4', 'J44'), ('P13', 'J9')]
2023-12-26 12:15:04,299 scheduler_run_core INFO | Licenses currently active : 4
2023-12-26 12:15:04,299 scheduler_run_core INFO | Now trying to schedule J44
2023-12-26 12:15:04,299 scheduler_run_job INFO | Scheduling job to slurm
2023-12-26 12:15:05,261 scheduler_run_job INFO | Not a commercial instance - heartbeat set to 12 hours.
2023-12-26 12:15:05,346 scheduler_run_job INFO | Launchable! -- Launching.
2023-12-26 12:15:05,351 set_job_status INFO | Status changed for P4.J44 from queued to launched
2023-12-26 12:15:05,352 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J44 with body {'projectUid': 'P4', 'jobUid': 'J44'}
2023-12-26 12:15:05,359 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 12:15:05,399 run_job INFO | Running P4 J44
2023-12-26 12:15:05,415 run_job INFO | cmd: /usr/local/slurm/bin/sbatch /home/zhuj6/cryoEM/Test/CS-t20s/J44/queue_sub_script.sh
2023-12-26 12:15:06,683 scheduler_run_core INFO | Licenses currently active : 5
2023-12-26 12:15:06,684 scheduler_run_core INFO | Now trying to schedule J9
2023-12-26 12:15:06,684 scheduler_run_job INFO | Scheduling job to slurm
2023-12-26 12:15:07,663 scheduler_run_job INFO | Not a commercial instance - heartbeat set to 12 hours.
2023-12-26 12:15:07,747 scheduler_run_job INFO | Launchable! -- Launching.
2023-12-26 12:15:07,752 set_job_status INFO | Status changed for P13.J9 from queued to launched
2023-12-26 12:15:07,753 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P13, workspace_uid None, job_uid J9 with body {'projectUid': 'P13', 'jobUid': 'J9'}
2023-12-26 12:15:07,762 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 12:15:07,796 run_job INFO | Running P13 J9
2023-12-26 12:15:07,811 run_job INFO | cmd: /usr/local/slurm/bin/sbatch /home/zhuj6/cryoEM/SRRP/CS-srrp/J9/queue_sub_script.sh
2023-12-26 12:15:09,229 scheduler_run_core INFO | Finished
2023-12-26 12:15:16,910 run INFO | Received task dump_job_database with 2 args and 0 kwargs
2023-12-26 12:15:16,910 dump_job_database INFO | Request to export P13 J10
2023-12-26 12:15:16,916 dump_job_database INFO | Exporting job to /home/zhuj6/cryoEM/SRRP/CS-srrp/J10
2023-12-26 12:15:16,917 dump_job_database INFO | Exporting all of job's images in the database to /home/zhuj6/cryoEM/SRRP/CS-srrp/J10/gridfs_data...
2023-12-26 12:15:16,918 dump_job_database INFO | Done. Exported 0 images in 0.00s
2023-12-26 12:15:16,918 dump_job_database INFO | Exporting all job's streamlog events...
2023-12-26 12:15:16,922 dump_job_database INFO | Done. Exported 1 files in 0.00s
2023-12-26 12:15:16,923 dump_job_database INFO | Exporting job metafile...
2023-12-26 12:15:16,928 dump_job_database INFO | Done. Exported in 0.01s
2023-12-26 12:15:16,928 dump_job_database INFO | Updating job manifest...
2023-12-26 12:15:16,933 dump_job_database INFO | Done. Updated in 0.00s
2023-12-26 12:15:16,933 dump_job_database INFO | Exported P13 J10 in 0.02s
2023-12-26 12:15:16,934 run INFO | Completed task in 0.023966312408447266 seconds
2023-12-26 12:15:51,296 set_job_status INFO | Status changed for P4.J44 from launched to started
2023-12-26 12:15:51,297 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J44 with body {'projectUid': 'P4', 'jobUid': 'J44'}
2023-12-26 12:15:51,303 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 12:15:54,679 set_job_status INFO | Status changed for P4.J44 from started to running
2023-12-26 12:15:54,680 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J44 with body {'projectUid': 'P4', 'jobUid': 'J44'}
2023-12-26 12:15:54,685 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 12:15:57,647 set_job_status INFO | Status changed for P13.J9 from launched to started
2023-12-26 12:15:57,648 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P13, workspace_uid None, job_uid J9 with body {'projectUid': 'P13', 'jobUid': 'J9'}
2023-12-26 12:15:57,655 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 12:16:06,167 set_job_status INFO | Status changed for P13.J9 from started to running
2023-12-26 12:16:06,168 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P13, workspace_uid None, job_uid J9 with body {'projectUid': 'P13', 'jobUid': 'J9'}
2023-12-26 12:16:06,175 app_stats_refresh INFO | code 200, text {"success":true}
2023-12-26 12:22:00,671 dump_job_database INFO | Request to export P4 J44
2023-12-26 12:22:00,675 dump_job_database INFO | Exporting job to /home/zhuj6/cryoEM/Test/CS-t20s/J44
2023-12-26 12:22:00,676 dump_job_database INFO | Exporting all of job's images in the database to /home/zhuj6/cryoEM/Test/CS-t20s/J44/gridfs_data...
2023-12-26 12:22:00,772 dump_job_database INFO | Writing 53 database images to /home/zhuj6/cryoEM/Test/CS-t20s/J44/gridfs_data/gridfsdata_0
2023-12-26 12:22:00,772 dump_job_database INFO | Done. Exported 53 images in 0.10s
2023-12-26 12:22:00,772 dump_job_database INFO | Exporting all job's streamlog events...
2023-12-26 12:22:00,780 dump_job_database INFO | Done. Exported 1 files in 0.01s
2023-12-26 12:22:00,780 dump_job_database INFO | Exporting job metafile...
2023-12-26 12:22:00,782 dump_job_database INFO | Creating .csg file for micrographs
2023-12-26 12:22:00,800 dump_job_database INFO | Done. Exported in 0.02s
2023-12-26 12:22:00,801 dump_job_database INFO | Updating job manifest...
2023-12-26 12:22:00,806 dump_job_database INFO | Done. Updated in 0.01s
2023-12-26 12:22:00,807 dump_job_database INFO | Exported P4 J44 in 0.14s
2023-12-26 12:22:00,824 set_job_status INFO | Status changed for P4.J44 from running to completed
2023-12-26 12:22:00,828 app_stats_refresh INFO | Calling app stats refresh url http://cn1577:39022/api/actions/stats/refresh_job for project_uid P4, workspace_uid None, job_uid J44 with body {'projectUid': 'P4', 'jobUid': 'J44'}
2023-12-26 12:22:00,833 app_stats_refresh INFO | code 200, text {"success":true}
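To put a number on the delay, the “Status changed” lines can be turned into a small timeline. The following Python sketch is only an illustration: the regular expression is an assumption based on the log lines pasted above, and `status_changes` and `first_time` are hypothetical helper names.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2023-12-26 12:15:05,351 set_job_status INFO | Status changed for P4.J44 from queued to launched
STATUS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+set_job_status\s+INFO\s+\|\s+"
    r"Status changed for (\w+)\.(\w+) from (\w+) to (\w+)"
)


def status_changes(lines):
    """Return (timestamp, project, job, old_status, new_status) tuples in log order."""
    changes = []
    for line in lines:
        match = STATUS_RE.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            changes.append((when,) + match.groups()[1:])
    return changes


def first_time(changes, project, job, new_status):
    """Return the first time the given job entered new_status, or None if it never did."""
    for when, proj, uid, _old, new in changes:
        if (proj, uid, new) == (project, job, new_status):
            return when
    return None
```

With the two excerpts above, `first_time(changes, "P4", "J44", "launched") - first_time(changes, "P4", "J43", "completed")` comes out at a little over 2 hours 24 minutes, which is the time J44 waited after its input job had finished.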