Hello,
I have been using cryoSPARC v3.2 on an HPC cluster, where I have to start the master instance each time I reconnect. This worked fine until I was disconnected from the node running the master instance. cryoSPARC crashed, and I have been unable to restart it since. The startup process seems to get stuck after "command_core: started". Here is the output of cryosparcm log command_core:
COMMAND CORE STARTED === 2021-11-13 01:48:07.015774 ==========================
*** BG WORKER START
 * Serving Flask app "command_core" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
HTTPSConnectionPool(host='get.cryosparc.com', port=443): Max retries exceeded with url: /heartbeat/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x148d72a64c10>: Failed to establish a new connection: [Errno -2] Name or service not known')))
Error connecting to cryoSPARC license server during instance heartbeat.
HTTPSConnectionPool(host='get.cryosparc.com', port=443): Max retries exceeded with url: /heartbeat/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x148d72a64190>: Failed to establish a new connection: [Errno -2] Name or service not known')))
Error connecting to cryoSPARC license server during instance heartbeat.
HTTPSConnectionPool(host='get.cryosparc.com', port=443): Max retries exceeded with url: /heartbeat/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x148d72a64f90>: Failed to establish a new connection: [Errno -2] Name or service not known')))
Error connecting to cryoSPARC license server during instance heartbeat.
HTTPSConnectionPool(host='get.cryosparc.com', port=443): Max retries exceeded with url: /heartbeat/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x148d72a64fd0>: Failed to establish a new connection: [Errno -2] Name or service not known')))
Error connecting to cryoSPARC license server during instance heartbeat.
HTTPSConnectionPool(host='get.cryosparc.com', port=443): Max retries exceeded with url: /heartbeat/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x148d72a641d0>: Failed to establish a new connection: [Errno -2] Name or service not known')))
Error connecting to cryoSPARC license server during instance heartbeat.
HTTPSConnectionPool(host='get.cryosparc.com', port=443): Max retries exceeded with url: /heartbeat/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x148d72a64a90>: Failed to establish a new connection: [Errno -2] Name or service not known')))
Error connecting to cryoSPARC license server during instance heartbeat.
COMMAND CORE STARTED === 2021-11-13 01:54:09.638931 ==========================
*** BG WORKER START
 * Serving Flask app "command_core" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
[JSONRPC ERROR 2021-11-13 01:54:14.647990 at get_config_var ]
**custom thread exception hook caught something
**** handle exception rc
Traceback (most recent call last):
  File "/path/to/cryosparc/cryosparc_master/cryosparc_compute/jobs/runcommon.py", line 1790, in run_with_except_hook
    run_old(*args, **kw)
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/path/to/cryosparc/cryosparc_master/cryosparc_command/command_core/__init__.py", line 208, in background_worker
    last_audit_date = get_config_var('audit', fail_notset=False, default={})
  File "/path/to/cryosparc/cryosparc_master/cryosparc_command/command_core/__init__.py", line 140, in wrapper
    raise e
  File "/path/to/cryosparc/cryosparc_master/cryosparc_command/command_core/__init__.py", line 131, in wrapper
    res = func(*args, **kwargs)
  File "/path/to/cryosparc/cryosparc_master/cryosparc_command/command_core/__init__.py", line 550, in get_config_var
    res = mongo.db[colname].find_one({'name' : name})
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/collection.py", line 1319, in find_one
    for result in cursor.limit(-1):
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/cursor.py", line 1207, in next
    if len(self.__data) or self._refresh():
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/cursor.py", line 1124, in _refresh
    self.__send_message(q)
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/cursor.py", line 1001, in __send_message
    address=self.__address)
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1372, in _run_operation_with_response
    exhaust=exhaust)
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1471, in _retryable_read
    return func(session, server, sock_info, slave_ok)
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1366, in _cmd
    unpack_res)
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/server.py", line 137, in run_operation_with_response
    first, sock_info.max_wire_version)
  File "/path/to/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/pymongo/helpers.py", line 140, in _check_command_response
    raise NotMasterError(errmsg, response)
pymongo.errors.NotMasterError: node is not in primary or recovering state, full error: {'ok': 0.0, 'errmsg': 'node is not in primary or recovering state', 'code': 13436, 'codeName': 'NotMasterOrSecondary'}
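The log seems to contain two separate failures: repeated license heartbeat errors in the first start, and a MongoDB "not in primary" error in the second. To keep them apart, here is a rough sketch that counts each kind in a saved copy of the log. The function name `summarize_log` and the category labels are my own and not part of cryoSPARC; the patterns are copied from the messages above.

```python
import re
from collections import Counter

# Patterns copied from messages that appear in the command_core log above.
# The category names on the left are made up for this sketch.
PATTERNS = {
    "license_heartbeat": re.compile(r"Error connecting to cryoSPARC license server"),
    "proxy_error": re.compile(r"ProxyError"),
    "dns_failure": re.compile(r"Name or service not known"),
    "mongo_not_primary": re.compile(r"node is not in primary or recovering state"),
}


def summarize_log(text):
    """Count how many log lines match each known failure pattern."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1  # one hit per line, even if it matches twice
    return dict(counts)
```

To use it, the output of cryosparcm log command_core can be saved to a file (the file name is up to you) and its contents passed to `summarize_log`.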
So far I have checked for orphaned cryoSPARC processes and killed those. I am able to ping get.cryosparc.com, so it seems the node can reach the license server. I have also re-installed cryoSPARC, but the same issue persists. I am not sure what else to try.
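One caveat on the ping test: ping uses ICMP and does not go through an HTTP(S) proxy, whereas the heartbeat errors in the log are ProxyError failures ending in "Name or service not known". That may mean the proxy host itself could not be resolved from that node. Below is a minimal sketch for checking this, assuming the proxy is configured through the usual `http_proxy` / `https_proxy` environment variables (an assumption; the cluster may configure it differently). The helper names are hypothetical.

```python
import os
import socket
from urllib.parse import urlsplit

# Conventional proxy variable names; assumed, not confirmed for this cluster.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def proxy_settings(env=None):
    """Return the proxy variables that are set and non-empty."""
    env = os.environ if env is None else env
    return {name: env[name] for name in PROXY_VARS if env.get(name)}


def proxy_host(url):
    """Extract the host name from a proxy URL, with or without a scheme."""
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to locate the host
    return urlsplit(url).hostname


def resolves(host):
    """Return True if the host name can be resolved from this node."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    settings = proxy_settings()
    if not settings:
        print("No proxy variables set in this environment")
    for name, value in settings.items():
        host = proxy_host(value)
        print(f"{name}={value} -> host {host!r} resolves: {resolves(host)}")
```

Running this on the same node, and in the same shell environment, that starts the master instance would show whether a proxy is set there and whether its host name resolves.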
Thanks,
Udit