Here are the logs you requested.
Job log:
> ========= sending heartbeat at 2023-07-24 16:22:25.361295
> ========= sending heartbeat at 2023-07-24 16:22:37.471547
> ========= sending heartbeat at 2023-07-24 16:22:49.460845
> ========= sending heartbeat at 2023-07-24 16:23:01.063566
> ========= sending heartbeat at 2023-07-24 16:23:15.300495
Event log:
> [CPU: 1.33 GB]
> -- 0.0: (3479 of 17355) processing J150/motioncorrected/010459585116581381098_FoilHole_29404620_Data_29377140_29377142_20230429_164609_EER_patch_aligned_doseweighted.mrc
> 5 particles extracted (5 rejected near edges)
> Writing to J208/extract/010459585116581381098_FoilHole_29404620_Data_29377140_29377142_20230429_164609_EER_patch_aligned_doseweighted_particles.mrc
> Total (asynchronous) processing time: 1.64s
>
> [CPU: 1.39 GB]
> -- 1.0: (3480 of 17355) processing J150/motioncorrected/008853555362268766809_FoilHole_29404620_Data_29377152_29377154_20230429_164557_EER_patch_aligned_doseweighted.mrc
> 2 particles extracted (0 rejected near edges)
> Writing to J208/extract/008853555362268766809_FoilHole_29404620_Data_29377152_29377154_20230429_164557_EER_patch_aligned_doseweighted_particles.mrc
> Total (asynchronous) processing time: 1.86s
>
> [CPU: 1.56 GB]
> -- 0.1: (3481 of 17355) processing J150/motioncorrected/010131049080804289702_FoilHole_29404621_Data_29376983_29376985_20230429_164657_EER_patch_aligned_doseweighted.mrc
>
> [CPU: 1.56 GB]
> -- 0.1: (3482 of 17355) processing J150/motioncorrected/011507918805205004349_FoilHole_29404621_Data_29377140_29377142_20230429_164703_EER_patch_aligned_doseweighted.mrc
> 1 particles extracted (4 rejected near edges)
> Writing to J208/extract/011507918805205004349_FoilHole_29404621_Data_29377140_29377142_20230429_164703_EER_patch_aligned_doseweighted_particles.mrc
> Total (asynchronous) processing time: 1.59s
>
> [CPU: 1.57 GB]
> -- 1.1: (3483 of 17355) processing J150/motioncorrected/010607545151755570022_FoilHole_29404621_Data_29377152_29377154_20230429_164651_EER_patch_aligned_doseweighted.mrc
>
> [CPU: 1.57 GB]
> -- 1.1: (3484 of 17355) processing J150/motioncorrected/016344257819808203700_FoilHole_29404622_Data_29376983_29376985_20230429_165133_EER_patch_aligned_doseweighted.mrc
> 1 particles extracted (0 rejected near edges)
> Writing to J208/extract/016344257819808203700_FoilHole_29404622_Data_29376983_29376985_20230429_165133_EER_patch_aligned_doseweighted_particles.mrc
> Total (asynchronous) processing time: 1.79s
>
> [CPU: 1.40 GB]
> -- 0.0: (3485 of 17355) processing J150/motioncorrected/003971386599620482875_FoilHole_29404622_Data_29377140_29377142_20230429_165139_EER_patch_aligned_doseweighted.mrc
> 1 particles extracted (1 rejected near edges)
> Writing to J208/extract/003971386599620482875_FoilHole_29404622_Data_29377140_29377142_20230429_165139_EER_patch_aligned_doseweighted_particles.mrc
> Total (asynchronous) processing time: 1.60s
>
> [CPU: 137.1 MB]
> ====== Job process terminated abnormally.
>
> [CPU: 1.57 GB]
> -- 1.0: (3486 of 17355) processing J150/motioncorrected/001529413231024034262_FoilHole_29404622_Data_29377152_29377154_20230429_165127_EER_patch_aligned_doseweighted.mrc
> 1 particles extracted (1 rejected near edges)
> Writing to J208/extract/001529413231024034262_FoilHole_29404622_Data_29377152_29377154_20230429_165127_EER_patch_aligned_doseweighted_particles.mrc
> Total (asynchronous) processing time: 1.63s
>
> [CPU: 1.56 GB]
> -- 0.1: (3487 of 17355) processing J150/motioncorrected/013805053250859322033_FoilHole_29404623_Data_29376983_29376985_20230429_165152_EER_patch_aligned_doseweighted.mrc
>
> [CPU: 1.56 GB]
> -- 0.1: (3488 of 17355) processing J150/motioncorrected/008385003645767961935_FoilHole_29404623_Data_29377140_29377142_20230429_165158_EER_patch_aligned_doseweighted.mrc
Command_core log:
> 2023-07-24 16:23:37,841 COMMAND.MAIN start INFO | === STARTED ===
> 2023-07-24 16:23:37,842 COMMAND.BG_WORKER background_worker INFO | === STARTED ===
> 2023-07-24 16:23:37,843 COMMAND.CORE run INFO | === STARTED TASKS WORKER ===
> * Serving Flask app "command_core" (lazy loading)
> * Environment: production
> WARNING: This is a development server. Do not use it in a production deployment.
> Use a production WSGI server instead.
> * Debug mode: off
> Traceback (most recent call last):
> File "<string>", line 1, in <module>
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/cryosparc_command/command_core/__init__.py", line 213, in start
> app.run(host="0.0.0.0", port=port, threaded=True, passthrough_errors=False)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/flask/app.py", line 990, in run
> run_simple(host, port, self, **options)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 1052, in run_simple
> inner()
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 996, in inner
> srv = make_server(
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 847, in make_server
> return ThreadedWSGIServer(
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 740, in __init__
> HTTPServer.__init__(self, server_address, handler)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socketserver.py", line 452, in __init__
> self.server_bind()
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/server.py", line 138, in server_bind
> socketserver.TCPServer.server_bind(self)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socketserver.py", line 466, in server_bind
> self.socket.bind(self.server_address)
> OSError: [Errno 98] Address already in use
> 2023-07-24 16:23:40,197 COMMAND.JOBS set_job_status INFO | Status changed for P6.J208 from running to failed
> 2023-07-24 16:23:50,415 COMMAND.MAIN start INFO | === STARTED ===
> 2023-07-24 16:23:50,416 COMMAND.BG_WORKER background_worker INFO | === STARTED ===
> 2023-07-24 16:23:50,416 COMMAND.CORE run INFO | === STARTED TASKS WORKER ===
> * Serving Flask app "command_core" (lazy loading)
> * Environment: production
> WARNING: This is a development server. Do not use it in a production deployment.
> Use a production WSGI server instead.
> * Debug mode: off
> Traceback (most recent call last):
> File "<string>", line 1, in <module>
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/cryosparc_command/command_core/__init__.py", line 213, in start
> app.run(host="0.0.0.0", port=port, threaded=True, passthrough_errors=False)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/flask/app.py", line 990, in run
> run_simple(host, port, self, **options)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 1052, in run_simple
> inner()
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 996, in inner
> srv = make_server(
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 847, in make_server
> return ThreadedWSGIServer(
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 740, in __init__
> HTTPServer.__init__(self, server_address, handler)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socketserver.py", line 452, in __init__
> self.server_bind()
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/server.py", line 138, in server_bind
> socketserver.TCPServer.server_bind(self)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socketserver.py", line 466, in server_bind
> self.socket.bind(self.server_address)
> OSError: [Errno 98] Address already in use
> 2023-07-24 16:23:57,873 COMMAND.MAIN start INFO | === STARTED ===
> 2023-07-24 16:23:57,874 COMMAND.BG_WORKER background_worker INFO | === STARTED ===
> 2023-07-24 16:23:57,874 COMMAND.CORE run INFO | === STARTED TASKS WORKER ===
> * Serving Flask app "command_core" (lazy loading)
> * Environment: production
> WARNING: This is a development server. Do not use it in a production deployment.
> Use a production WSGI server instead.
> * Debug mode: off
> Traceback (most recent call last):
> File "<string>", line 1, in <module>
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/cryosparc_command/command_core/__init__.py", line 213, in start
> app.run(host="0.0.0.0", port=port, threaded=True, passthrough_errors=False)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/flask/app.py", line 990, in run
> run_simple(host, port, self, **options)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 1052, in run_simple
> inner()
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 996, in inner
> srv = make_server(
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 847, in make_server
> return ThreadedWSGIServer(
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/werkzeug/serving.py", line 740, in __init__
> HTTPServer.__init__(self, server_address, handler)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socketserver.py", line 452, in __init__
> self.server_bind()
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/server.py", line 138, in server_bind
> socketserver.TCPServer.server_bind(self)
> File "/projappl/project_2006450/usrappl/kyrybisi/cryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socketserver.py", line 466, in server_bind
> self.socket.bind(self.server_address)
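For what it's worth, my reading of the traceback is that something was still bound to the command_core port when the service tried to restart. Below is a minimal sketch of the check I ran to see whether the port was actually free; the port number 39002 is an assumption (default CRYOSPARC_BASE_PORT of 39000 plus 2), so substitute whatever your config uses:

```python
import socket

# Assumed port: default CRYOSPARC_BASE_PORT (39000) + 2 for command_core.
# Substitute the port from your own instance's configuration.
PORT = 39002

def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind the port ourselves; failure means something is holding it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError as err:  # e.g. [Errno 98] Address already in use
            print(f"port {port}: {err}")
            return False

print("free" if port_is_free(PORT) else "in use")
```

(When the bind fails, something like `ss -ltnp` on the master node shows which process is still holding the port.)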
As for the other potential causes you mentioned:
- Storage capacity and RAM shouldn't be an issue: this is running on a cluster, and I have been granted as much of both as I could need (see the sketch after this list for how I double-checked free space on the project volume)
- The job fails on a different micrograph each run, so a single corrupt input file seems unlikely
- As previously stated, the same job on the other part of the same dataset completed without problems, and I am using the same worker, so that should rule out a worker failure?
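For completeness, this is the quick check I used to confirm free space; the project path below is a placeholder for my actual directory:

```python
import shutil

# Placeholder path -- substitute the real cryoSPARC project directory.
PROJECT_DIR = "/path/to/cryosparc/projects/P6"

# shutil.disk_usage reports bytes for the filesystem containing the path.
total, used, free = shutil.disk_usage(PROJECT_DIR)
gib = 1024 ** 3
print(f"total {total / gib:.1f} GiB | used {used / gib:.1f} GiB | free {free / gib:.1f} GiB")
```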
Thank you for taking the time to help me solve this!!