Hi Suhail,
We have done some further investigating. When starting cryoSPARC (cryosparcm start does a normal startup with no errors or warnings), the GUI is responsive for about 1 minute. During this time it is possible to start jobs etc. After this minute, it is no longer possible to start new jobs, but one can still browse old projects, check resources etc. If a job builder is open when this happens, it is no longer possible to browse the file system (e.g. when browsing for a file to import).
If you start a job before this ~1 min interval is up, it will continue running on the respective node. The results remain visible in the web GUI the entire time, but the job dies during the final step (Exporting job and creating csg-files). After cryoSPARC stops being responsive, there is also no further output in the command_core log.
If I do a fresh start of cryoSPARC, follow the command_core log, do nothing in the web GUI and wait for a minute, I get the following:
[cryosparc@be-cryosparc ~]$ cryosparcm start
Starting cryoSPARC System master process..
CryoSPARC is not already running.
database: started
command_core: started
command_core connection succeeded
command_vis: started
command_rtp: started
command_rtp connection succeeded
webapp: started
app: started
liveapp: started
-----------------------------------------------------
CryoSPARC master started.
From this machine, access cryoSPARC at
http://localhost:39000
and access cryoSPARC Live at
http://localhost:39006
please note the legacy cryoSPARC Live application is running at
http://localhost:39007
From other machines on the network, access cryoSPARC at
http://be-cryosparc:39000
and access cryoSPARC Live at
http://be-cryosparc:39006
Startup can take several minutes. Point your browser to the address
and refresh until you see the cryoSPARC web interface.
And for the command_core log:
cryosparcm log command_core
COMMAND CORE STARTED === 2021-02-09 14:18:17.186763 ==========================
*** BG WORKER START
* Serving Flask app "command_core" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
COMMAND CORE STARTED === 2021-02-09 15:32:23.458067 ==========================
*** BG WORKER START
* Serving Flask app "command_core" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
Also, after this one minute I cannot use cryosparcm cli, e.g. to remove a node. The command below does not complete and prints an error message when it is killed with Ctrl+C:
[cryosparc@be-cryosparc cryosparc_master]$ cryosparcm cli "remove_scheduler_target_node('gpu08')"
^C*** client.py: command (http://be-cryosparc:39002/api) did not reply within timeout of 300 seconds, attempt 1 of 3
^C*** client.py: command (http://be-cryosparc:39002/api) did not reply within timeout of 300 seconds, attempt 2 of 3
^C*** client.py: command (http://be-cryosparc:39002/api) did not reply within timeout of 300 seconds, attempt 3 of 3
Traceback (most recent call last):
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/cryosparc/cryosparc_master/cryosparc_compute/client.py", line 85, in <module>
    cli = CommandClient(host, int(port))
  File "/opt/cryosparc/cryosparc_master/cryosparc_compute/client.py", line 35, in __init__
    self._reload()
  File "/opt/cryosparc/cryosparc_master/cryosparc_compute/client.py", line 63, in _reload
    system = self._get_callable('system.describe')()
  File "/opt/cryosparc/cryosparc_master/cryosparc_compute/client.py", line 51, in func
    r = requests.post(self.url, data = json.dumps(data, cls=NumpyEncoder), headers = header, timeout=self.timeout)
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/requests/api.py", line 119, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/http/client.py", line 1354, in getresponse
    response.begin()
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/http/client.py", line 306, in begin
    version, status, reason = self._read_status()
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/http/client.py", line 267, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/opt/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
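The traceback ends in recv_into, which suggests the connection to port 39002 is accepted but no reply ever comes back. To tell that apart from "nothing is listening" without waiting through three 300-second timeouts, a short probe along these lines might help. This is only a sketch using the Python standard library: probe_api is a hypothetical helper, not part of cryoSPARC, and the JSON body is an assumed JSON-RPC style call to system.describe (the method named in the traceback), so the real client may send something different.

```python
import json
import socket
import time
import urllib.error
import urllib.request


def probe_api(url, timeout=5.0):
    """POST a small JSON body to `url` and classify the outcome.

    Returns (status, seconds), where status is one of:
      "replied" - the server sent back an HTTP response (any status code)
      "timeout" - no reply (or no connection) within `timeout` seconds
      "failed"  - the connection was refused, reset, or otherwise broke
    """
    # Assumption: a JSON-RPC style payload. The real cryoSPARC client may
    # use a different body; any HTTP reply at all still counts as "replied".
    body = json.dumps(
        {"jsonrpc": "2.0", "method": "system.describe", "params": {}, "id": 1}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # Ignore any proxy settings from the environment: this is a direct probe.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    started = time.monotonic()
    try:
        with opener.open(request, timeout=timeout) as response:
            response.read()
        status = "replied"
    except urllib.error.HTTPError:
        # A 4xx/5xx answer still proves the server is responding.
        status = "replied"
    except urllib.error.URLError as exc:
        # Errors while connecting or sending are wrapped in URLError.
        timed_out = isinstance(exc.reason, (socket.timeout, TimeoutError))
        status = "timeout" if timed_out else "failed"
    except (socket.timeout, TimeoutError):
        # Timeouts while waiting for the response are raised unwrapped.
        status = "timeout"
    except OSError:
        status = "failed"
    return status, time.monotonic() - started
```

Calling probe_api("http://be-cryosparc:39002/api") right after cryosparcm start and again after a minute or two would show whether the result changes from "replied" to "timeout" at the point where the GUI stops accepting new jobs.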
We also tried a clean install on the same system, which seems to work fine. If we then import the database dumped from the previous installation, the same problem occurs.
Thanks for any help. We can provide more logs etc. if needed!
Lukas