Failed project/live session seems to freeze the instance

Our CryoSPARC instance is unresponsive. It is possible to navigate the projects and jobs, but creating new jobs or editing existing ones either does not work or takes a very long time. The problem is somewhat similar to this thread: https://discuss.cryosparc.com/t/webgui-partly-unresponsive-freezing-cryosparc/9110/3

In the logs, it seems to be related to a project which only had a failed live session. The project, P193, does not show up in the “All projects” (shoebox) listing, but a live session S1 shows up in the live session listing (lightning bolt). The project folder is still there and has two folders S1 and S3.

Upon restart, the contents of command_rtp.log show that migration of live sessions proceeds until it reaches this problematic project. See below for selected parts of the log. Background worker errors ending in “socket.timeout: timed out” keep repeating.

I think that purging this failed live session (or the entire project) from the database might fix the problem. How can I do that?

The command_core log indicates some other projects that might be problematic, but I suspect that is due to jobs failing when the server was shut down (the supervisor process was killed). See below for selected parts of that log.

Regards,
Daniel

From command_rtp.log:

2023-06-21 10:00:20,908 RTP.MAIN             start                INFO     |  === STARTED === 
2023-06-21 10:00:20,909 RTP.BG_WORKER        background_worker    INFO     |  === STARTED === 
 * Serving Flask app "command_rtp" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
2023-06-21 10:01:21,871 RTP.MAIN             migrate_old_sessions_run INFO     | Finished migrating P21 S3 in 0.00s
2023-06-21 10:01:21,871 RTP.MAIN             migrate_old_sessions_run INFO     | Finished migrating P22 S3 in 0.00s
2023-06-21 10:01:21,871 RTP.MAIN             migrate_old_sessions_run INFO     | Finished migrating P23 S3 in 0.00s

*snip*

2023-06-21 10:01:28,325 RTP.MAIN             migrate_old_sessions_run INFO     | Finished migrating P192 S3 in 0.02s
2023-06-21 10:01:28,327 RTP.MAIN             create_live_session_job INFO     | Creating Live Session job for P193 S1
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    | RTP Child Monitor Failed
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    | Traceback (most recent call last):
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_rtp/__init__.py", line 113, in background_worker
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     rtp_child_job_monitor()
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 191, in wrapper
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     return func(*args, **kwargs)
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_rtp/__init__.py", line 2800, in rtp_child_job_monitor
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     new_status = cli.get_job_status(session['project_uid'], job['uid'])
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 104, in func
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     with make_json_request(self, "/api", data=data) as request:
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/contextlib.py", line 113, in __enter__
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     return next(self.gen)
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 165, in make_request
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     with urlopen(request, timeout=client._timeout) as response:
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 222, in urlopen
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     return opener.open(url, data, timeout)
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 525, in open
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     response = self._open(req, data)
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 542, in _open
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     result = self._call_chain(self.handle_open, protocol, protocol +
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 502, in _call_chain
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     result = func(*args)
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1383, in http_open
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     return self.do_open(http.client.HTTPConnection, req)
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1358, in do_open
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     r = h.getresponse()
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 1348, in getresponse
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     response.begin()
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 316, in begin
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     version, status, reason = self._read_status()
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 277, in _read_status
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socket.py", line 669, in readinto
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    |     return self._sock.recv_into(b)
2023-06-21 10:05:26,024 RTP.BG_WORKER        background_worker    ERROR    | socket.timeout: timed out
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    | POST-RESPONSE-THREAD ERROR at dump_all_live_sessions_run
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    | Traceback (most recent call last):
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 78, in run
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     self.target(*self.args)
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_rtp/__init__.py", line 398, in dump_all_live_sessions_run
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     all_projects = cli.list_projects()
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 104, in func
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     with make_json_request(self, "/api", data=data) as request:
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/contextlib.py", line 113, in __enter__
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     return next(self.gen)
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 165, in make_request
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     with urlopen(request, timeout=client._timeout) as response:
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 222, in urlopen
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     return opener.open(url, data, timeout)
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 525, in open
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     response = self._open(req, data)
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 542, in _open
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     result = self._call_chain(self.handle_open, protocol, protocol +
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 502, in _call_chain
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     result = func(*args)
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1383, in http_open
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     return self.do_open(http.client.HTTPConnection, req)
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1358, in do_open
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     r = h.getresponse()
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 1348, in getresponse
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     response.begin()
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 316, in begin
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     version, status, reason = self._read_status()
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 277, in _read_status
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socket.py", line 669, in readinto
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    |     return self._sock.recv_into(b)
2023-06-21 10:06:21,961 COMMAND.COMMON       run                  ERROR    | socket.timeout: timed out
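To see at a glance which errors keep repeating in a log like the one above, the traceback frames can be filtered out and only the headline and final exception lines counted. This is a minimal sketch that assumes the line layout shown in the excerpt (timestamp, logger, function, level, then `|` and the message); the function name `summarize_errors` is made up for illustration and is not part of CryoSPARC.

```python
import re
from collections import Counter

# Matches lines such as:
# 2023-06-21 10:05:26,024 RTP.BG_WORKER   background_worker   ERROR   | socket.timeout: timed out
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(?P<logger>\S+)\s+(?P<func>\S+)\s+(?P<level>[A-Z]+)\s+\|\s?(?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count ERROR messages per logger, skipping traceback frames.

    Traceback frames are indented after the '|' separator, and the
    'Traceback (most recent call last):' header carries no information,
    so both are ignored. What remains are headline messages and the
    final exception line of each traceback.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m or m["level"] != "ERROR":
            continue
        msg = m["msg"]
        if msg.startswith(" ") or msg.startswith("Traceback"):
            continue
        counts[(m["logger"], msg.strip())] += 1
    return counts


# Example usage (the path is an assumption about the local install):
# with open("cryosparc_master/run/command_rtp.log") as fh:
#     for (logger, msg), n in summarize_errors(fh).most_common(10):
#         print(n, logger, msg)
```

A count dominated by `socket.timeout: timed out` would point at the service being called not answering in time, rather than at one specific project.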

From command_core.log:

2023-06-21 10:00:08,073 COMMAND.MAIN         start                INFO     |  === STARTED === 
2023-06-21 10:00:08,073 COMMAND.BG_WORKER    background_worker    INFO     |  === STARTED === 
2023-06-21 10:00:08,073 COMMAND.CORE         run                  INFO     | === STARTED TASKS WORKER ===
 * Serving Flask app "command_core" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
2023-06-21 10:00:08,681 COMMAND.MAIN         startup              INFO     | Starting CryoSPARC v4.2.1+230427
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   platform_node : donatello
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   platform_release : 3.10.0-1160.el7.x86_64
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   platform_version : #1 SMP Mon Oct 19 16:18:59 UTC 2020
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   platform_architecture : x86_64
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   physical_cores : 24
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   max_cpu_freq : 3600.0
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   total_memory : 503.35GB
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   available_memory : 480.98GB
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   used_memory : 20.71GB
2023-06-21 10:00:08,682 COMMAND.MAIN         startup              INFO     |   version : v4.2.1+230427
2023-06-21 10:00:10,191 COMMAND.STARTUP      startup              INFO     | CryoSPARC instance ID: 7e154680-a3bc-4ea8-a8ef-c89b08459db3
2023-06-21 10:00:10,191 COMMAND.SCHEDULER    get_gpu_info         INFO     | UPDATING WORKER GPU INFO
2023-06-21 10:00:10,191 COMMAND.JOBS         update_all_job_sizes INFO     | UPDATING ALL JOB SIZES IN 10s
2023-06-21 10:00:10,191 COMMAND.DATA         export_all_projects  INFO     | EXPORTING ALL PROJECTS IN 60s...
2023-06-21 10:00:14,089 COMMAND.HEARTBEAT    check_heartbeats     WARNING  | Marking P195 J85 as failed
2023-06-21 10:00:14,090 COMMAND.JOBS         set_job_status       INFO     | Status changed for P195.J85 from running to failed
2023-06-21 10:00:14,159 COMMAND.JOBS         app_stats_refresh    WARNING  | Failed to call stats refresh endpoint for P195 J85: HTTPConnectionPool(host='donatello', port=29440): Max retries exceeded with url: /api/actions/stats/refresh_job (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcd7d5c3340>: Failed to establish a new connection: [Errno 111] Connection refused'))
2023-06-21 10:00:14,194 COMMAND.HEARTBEAT    check_heartbeats     WARNING  | Marking P200 J37 as failed
2023-06-21 10:00:14,195 COMMAND.JOBS         set_job_status       INFO     | Status changed for P200.J37 from waiting to failed
2023-06-21 10:00:14,198 COMMAND.JOBS         app_stats_refresh    WARNING  | Failed to call stats refresh endpoint for P200 J37: HTTPConnectionPool(host='donatello', port=29440): Max retries exceeded with url: /api/actions/stats/refresh_job (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcd7d5fc2e0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2023-06-21 10:00:14,200 COMMAND.HEARTBEAT    check_heartbeats     WARNING  | Marking P199 J105 as failed
2023-06-21 10:00:14,201 COMMAND.JOBS         set_job_status       INFO     | Status changed for P199.J105 from waiting to failed
2023-06-21 10:00:14,204 COMMAND.JOBS         app_stats_refresh    WARNING  | Failed to call stats refresh endpoint for P199 J105: HTTPConnectionPool(host='donatello', port=29440): Max retries exceeded with url: /api/actions/stats/refresh_job (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcd7d388160>: Failed to establish a new connection: [Errno 111] Connection refused'))
2023-06-21 10:00:14,206 COMMAND.HEARTBEAT    check_heartbeats     WARNING  | Marking P199 J110 as failed
2023-06-21 10:00:14,207 COMMAND.JOBS         set_job_status       INFO     | Status changed for P199.J110 from waiting to failed
2023-06-21 10:00:14,210 COMMAND.JOBS         app_stats_refresh    WARNING  | Failed to call stats refresh endpoint for P199 J110: HTTPConnectionPool(host='donatello', port=29440): Max retries exceeded with url: /api/actions/stats/refresh_job (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcd7d2b37f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2023-06-21 10:00:14,211 COMMAND.HEARTBEAT    check_heartbeats     WARNING  | Marking P194 J151 as failed
2023-06-21 10:00:14,212 COMMAND.JOBS         set_job_status       INFO     | Status changed for P194.J151 from waiting to failed
2023-06-21 10:00:14,215 COMMAND.JOBS         app_stats_refresh    WARNING  | Failed to call stats refresh endpoint for P194 J151: HTTPConnectionPool(host='donatello', port=29440): Max retries exceeded with url: /api/actions/stats/refresh_job (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcd7d391130>: Failed to establish a new connection: [Errno 111] Connection refused'))

I managed to delete the P193 S1 live session using the GUI, but that does not seem to have helped. It may have been a red herring. The GUI is still very unresponsive, and trying to edit a job yields socket timeout errors. Perhaps a network problem?
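One quick way to narrow down the network question is to time a plain TCP connection to the port that the logs report as refusing connections (host `donatello`, port 29440 in the command_core.log excerpt). The sketch below only tests whether something is listening and how long the connect takes; it says nothing about whether the service behind the port answers requests promptly. The helper name `probe_port` is hypothetical.

```python
import socket
import time


def probe_port(host, port, timeout=2.0):
    """Try a plain TCP connect and report the outcome.

    Returns (True, seconds_taken) on success, or (False, error_text)
    if the connection is refused, times out, or the host cannot be
    resolved. All of those failures are subclasses of OSError.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError as exc:
        return False, repr(exc)


# Example usage, with host and port taken from the log lines above:
# print(probe_port("donatello", 29440))
```

A refused connection suggests the process behind that port is not running, whereas a timeout on connect would point more towards a network or firewall issue.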

Some more errors after restarting again. This time there is a cursor error, which seems to be related to the database.

From command_core.log:

2023-06-21 11:57:53,531 COMMAND.MAIN         start                INFO     |  === STARTED === 
2023-06-21 11:57:53,532 COMMAND.BG_WORKER    background_worker    INFO     |  === STARTED === 
2023-06-21 11:57:53,532 COMMAND.CORE         run                  INFO     | === STARTED TASKS WORKER ===
 * Serving Flask app "command_core" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     | Starting CryoSPARC v4.2.1+230427
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   platform_node : donatello
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   platform_release : 3.10.0-1160.el7.x86_64
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   platform_version : #1 SMP Mon Oct 19 16:18:59 UTC 2020
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   platform_architecture : x86_64
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   physical_cores : 24
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   max_cpu_freq : 3600.0
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   total_memory : 503.35GB
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   available_memory : 480.96GB
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   used_memory : 20.74GB
2023-06-21 11:57:54,993 COMMAND.MAIN         startup              INFO     |   version : v4.2.1+230427
2023-06-21 11:57:56,421 COMMAND.STARTUP      startup              INFO     | CryoSPARC instance ID: 7e154680-a3bc-4ea8-a8ef-c89b08459db3
2023-06-21 11:57:56,421 COMMAND.SCHEDULER    get_gpu_info         INFO     | UPDATING WORKER GPU INFO
2023-06-21 11:57:56,421 COMMAND.JOBS         update_all_job_sizes INFO     | UPDATING ALL JOB SIZES IN 10s
2023-06-21 11:57:56,421 COMMAND.DATA         export_all_projects  INFO     | EXPORTING ALL PROJECTS IN 60s...
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    | POST-RESPONSE-THREAD ERROR at export_all_projects_run
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    | Traceback (most recent call last):
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 78, in run
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     self.target(*self.args)
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_core/__init__.py", line 3660, in export_all_projects_run
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     export_projects = list(mongo.db['projects'].find({ 'deleted': False, 'archived': False, 'detached': False }, {'uid':1}))
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1280, in next
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     if len(self.__data) or self._refresh():
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1217, in _refresh
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     self.__send_message(g)
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1078, in __send_message
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     response = client._run_operation(
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1515, in _run_operation
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     return self._retryable_read(
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1617, in _retryable_read
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     return func(session, server, sock_info, secondary_ok)
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1511, in _cmd
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     return server.run_operation(
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/server.py", line 133, in run_operation
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     _check_command_response(first, sock_info.max_wire_version)
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |   File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/helpers.py", line 178, in _check_command_response
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    |     raise CursorNotFound(errmsg, code, response, max_wire_version)
2023-06-21 12:52:11,417 COMMAND.COMMON       run                  ERROR    | pymongo.errors.CursorNotFound: cursor id 120887245035 not found, full error: {'ok': 0.0, 'errmsg': 'cursor id 120887245035 not found', 'code': 43, 'codeName': 'CursorNotFound', 'operationTime': Timestamp(1687343107, 1), '$clusterTime': {'clusterTime': Timestamp(1687343107, 1), 'signature': {'hash': b"|\x82;%\x9b\xc3\xf09T\xdf$\x12p\xde\x1c\\\xb1'\x18k", 'keyId': 7193633757633445889}}}
**custom thread exception hook caught something
**** handle exception rc
Traceback (most recent call last):
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_compute/jobs/runcommon.py", line 2061, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 78, in run
    self.target(*self.args)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_core/__init__.py", line 3660, in export_all_projects_run
    export_projects = list(mongo.db['projects'].find({ 'deleted': False, 'archived': False, 'detached': False }, {'uid':1}))
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1280, in next
    if len(self.__data) or self._refresh():
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1217, in _refresh
    self.__send_message(g)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1078, in __send_message
    response = client._run_operation(
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1515, in _run_operation
    return self._retryable_read(
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1617, in _retryable_read
    return func(session, server, sock_info, secondary_ok)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1511, in _cmd
    return server.run_operation(
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/server.py", line 133, in run_operation
    _check_command_response(first, sock_info.max_wire_version)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/helpers.py", line 178, in _check_command_response
    raise CursorNotFound(errmsg, code, response, max_wire_version)
pymongo.errors.CursorNotFound: cursor id 120887245035 not found, full error: {'ok': 0.0, 'errmsg': 'cursor id 120887245035 not found', 'code': 43, 'codeName': 'CursorNotFound', 'operationTime': Timestamp(1687343107, 1), '$clusterTime': {'clusterTime': Timestamp(1687343107, 1), 'signature': {'hash': b"|\x82;%\x9b\xc3\xf09T\xdf$\x12p\xde\x1c\\\xb1'\x18k", 'keyId': 7193633757633445889}}}

Traceback (most recent call last):
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_compute/jobs/runcommon.py", line 2061, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 78, in run
    self.target(*self.args)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_core/__init__.py", line 3660, in export_all_projects_run
    export_projects = list(mongo.db['projects'].find({ 'deleted': False, 'archived': False, 'detached': False }, {'uid':1}))
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1280, in next
    if len(self.__data) or self._refresh():
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1217, in _refresh
    self.__send_message(g)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/cursor.py", line 1078, in __send_message
    response = client._run_operation(
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1515, in _run_operation
    return self._retryable_read(
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1617, in _retryable_read
    return func(session, server, sock_info, secondary_ok)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1511, in _cmd
    return server.run_operation(
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/server.py", line 133, in run_operation
    _check_command_response(first, sock_info.max_wire_version)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/helpers.py", line 178, in _check_command_response
    raise CursorNotFound(errmsg, code, response, max_wire_version)
pymongo.errors.CursorNotFound: cursor id 120887245035 not found, full error: {'ok': 0.0, 'errmsg': 'cursor id 120887245035 not found', 'code': 43, 'codeName': 'CursorNotFound', 'operationTime': Timestamp(1687343107, 1), '$clusterTime': {'clusterTime': Timestamp(1687343107, 1), 'signature': {'hash': b"|\x82;%\x9b\xc3\xf09T\xdf$\x12p\xde\x1c\\\xb1'\x18k", 'keyId': 7193633757633445889}}}

Could you please inspect

/home/cryosparcuser/cryosparc/cryosparc_master/run/database.log

for potential underlying database errors?
Does

  1. the storage volume holding the $CRYOSPARC_DB_PATH directory have free space?
  2. the computer that runs cryosparc_master processes have sufficient available RAM?

The volumes have sufficient free space for the DB, and the machine has plenty of RAM (500 GB).

I checked database.log and there are no obvious errors.

The earliest problem I can find is from yesterday evening (2023-06-20) at 18:33.

In the command_core.log:

2023-06-20 18:33:56,026 COMMAND.JOBS         app_stats_refresh    WARNING  | Failed to call stats refresh endpoint for P203 J89: HTTPConnectionPool(host='donatello', port=29440): Read timed out. (read timeout=2)
2023-06-20 18:38:05,142 COMMAND.JOBS         set_job_status       INFO     | Status changed for P187.J60 from running to failed
2023-06-20 18:38:05,145 COMMAND.JOBS         app_stats_refresh    WARNING  | Failed to call stats refresh endpoint for P187 J60: HTTPConnectionPool(host='donatello', port=29440): Max retries exceeded with url: /api/actions/stats/refresh_job (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc62a62cd30>: Failed to establish a new connection: [Errno 111] Connection refused'))
2023-06-20 18:46:17,823 COMMAND.BG_WORKER    background_worker    INFO     | License does not have telemetry enabled; will re-check license in 1 hour.
2023-06-20 19:46:17,920 COMMAND.BG_WORKER    background_worker    INFO     | License does not have telemetry enabled; will re-check license in 1 hour.
2023-06-20 20:46:17,953 COMMAND.BG_WORKER    background_worker    INFO     | License does not have telemetry enabled; will re-check license in 1 hour.
2023-06-20 21:46:17,998 COMMAND.BG_WORKER    background_worker    INFO     | License does not have telemetry enabled; will re-check license in 1 hour.
2023-06-20 22:46:18,000 COMMAND.BG_WORKER    background_worker    INFO     | License does not have telemetry enabled; will re-check license in 1 hour.
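
The "Connection refused" and "Read timed out" warnings above refer to port 29440 on host donatello. A quick way to tell a refused or unreachable port from a listening one is a plain TCP probe. This is a minimal sketch using only the Python standard library; the function name `port_accepts_connections` is made up for illustration. It only checks that something accepts the connection, not that the service behind it is healthy.

```python
import socket

def port_accepts_connections(host, port, timeout_s=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False

# Example usage, with host and port taken from the log lines above:
#   port_accepts_connections("donatello", 29440)
```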

In supervisord.log:

2023-06-20 18:35:57,108 INFO exited: app (exit status 1; not expected)

In database.log:

2023-06-20T18:33:35.890+0200 I NETWORK  [conn6990] end connection 192.168.177.19:57116 (285 connections now open)
2023-06-20T18:33:35.890+0200 I NETWORK  [conn6991] end connection 192.168.177.19:57118 (284 connections now open)
2023-06-20T18:33:35.893+0200 I NETWORK  [conn6992] end connection 192.168.177.19:57120 (283 connections now open)
2023-06-20T18:33:56.207+0200 I NETWORK  [conn6988] end connection 192.168.177.19:57088 (282 connections now open)
2023-06-20T18:33:56.207+0200 I NETWORK  [conn6989] end connection 192.168.177.19:57090 (281 connections now open)
2023-06-20T18:33:56.760+0200 I NETWORK  [conn6987] end connection 192.168.177.19:57084 (280 connections now open)
2023-06-20T18:33:56.760+0200 I NETWORK  [conn6986] end connection 192.168.177.19:57082 (279 connections now open)
2023-06-20T18:35:57.092+0200 I NETWORK  [conn8] end connection 127.0.0.1:33514 (278 connections now open)
2023-06-20T18:35:57.092+0200 I NETWORK  [conn9] end connection 127.0.0.1:33516 (277 connections now open)
2023-06-20T18:35:57.092+0200 I NETWORK  [conn10] end connection 127.0.0.1:33518 (276 connections now open)
2023-06-20T18:35:57.092+0200 I NETWORK  [conn30] end connection 127.0.0.1:33706 (273 connections now open)
2023-06-20T18:35:57.092+0200 I NETWORK  [conn28] end connection 127.0.0.1:33702 (274 connections now open)
[ and so on ]

In command_vis.log there were lots of tracebacks sometime after 5:35 pm:

2023-06-20 17:35:13,349 VIS.MAIN             recreate_mesh        INFO     | Volume path: /path/to/project/J116/J116_class_00_final_volume.mrc
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_vis/__init__.py", line 49, in <module>
    from .viscommon import check_project_file, check_project_path, check_project_dir, cli, extern_raw, rtp, vis_main_log
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_command/command_vis/viscommon.py", line 20, in <module>
    rc.connect(os.environ['CRYOSPARC_MASTER_HOSTNAME'], int(os.environ['CRYOSPARC_COMMAND_CORE_PORT']), usedb=mongo.db, usegridfs=gridfs)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_compute/jobs/runcommon.py", line 126, in connect
    cli = client.CommandClient(master_hostname, int(master_command_core_port), service="command_core")
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_compute/client.py", line 36, in __init__
    super().__init__(service, host, port, url, timeout, headers, cls=NumpyEncoder)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 91, in __init__
    self._reload()  # attempt connection immediately to gather methods
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 118, in _reload
    system = self._get_callable("system.describe")()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 104, in func
    with make_json_request(self, "/api", data=data) as request:
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 165, in make_request
    with urlopen(request, timeout=client._timeout) as response:
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1383, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1358, in do_open
    r = h.getresponse()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 316, in begin
    version, status, reason = self._read_status()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 277, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out
[ same traceback repeated ]

Nothing in command_rtp.log from that time.

Assuming your CryoSPARC version is 4.2.1, could you please apply patch 230621 and check whether it addresses the problem?

Yes, this is v. 4.2.1+230427.

I tried to download the patch, but running the cryosparcm patch --check command produced errors in the shell similar to those in the logs. There seems to be some network issue here at the university.

> cryosparcm patch --check
Traceback (most recent call last):
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/patch.py", line 34, in <module>
    CLI = client.CommandClient(
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_compute/client.py", line 36, in __init__
    super().__init__(service, host, port, url, timeout, headers, cls=NumpyEncoder)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 91, in __init__
    self._reload()  # attempt connection immediately to gather methods
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 118, in _reload
    system = self._get_callable("system.describe")()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 104, in func
    with make_json_request(self, "/api", data=data) as request:
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py", line 165, in make_request
    with urlopen(request, timeout=client._timeout) as response:
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1383, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/urllib/request.py", line 1358, in do_open
    r = h.getresponse()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 316, in begin
    version, status, reason = self._read_status()
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/http/client.py", line 277, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/cryosparcuser/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

We found the problem! One of the file-servers was offline and caused the time-outs.
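
For anyone hitting similar time-outs: an offline file server can make any call that touches its mount block indefinitely, so it helps to probe each storage path with a time limit. Below is a minimal sketch, assuming a Python 3 environment on the master node; the function name `path_responds` is made up for illustration and is not part of CryoSPARC. It runs `os.stat` in a daemon thread and gives up waiting after the timeout. The stuck thread itself cannot be cancelled, so this is suited to a short diagnostic script rather than a long-running service.

```python
import os
import threading

def path_responds(path, timeout_s=5.0):
    """Return True if os.stat(path) succeeds within timeout_s seconds.

    Returns False if the path is missing or if the call does not finish
    in time (for example, a hung network mount).
    """
    result = {}

    def probe():
        try:
            os.stat(path)
            result["ok"] = True
        except OSError:
            result["ok"] = False

    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)  # stop waiting after timeout_s; the thread may still be blocked
    return result.get("ok", False)

# Example usage (the mount points are hypothetical):
#   for p in ["/mnt/fileserver1", "/mnt/fileserver2"]:
#       print(p, path_responds(p))
```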
