I have a problem with an installation on a cluster, which I upgraded to 3.3.1 at the beginning of January. For the last few days, whenever I run “cryosparcm restart”, the process stalls after a while. The symptoms are similar to this issue: https://discuss.cryosparc.com/t/webapp-not-starting-after-v3-0-1-update/5721
Here are the symptoms and my tests:
The start up sequence stalls after the message “command_rtp: started”.
If I cancel using Ctrl-C and run “cryosparcm status”, it shows that the database, command_core, command_rtp and command_vis are running, but not app, app_dev, liveapp, liveapp_dev, webapp and webapp_dev.
Also, the “cryosparcm status” command stalls before showing the information about variables etc.
I reinstalled cryosparc_master and cryosparc_worker. The symptoms are still there. I cannot patch the installation either (the command “cryosparcm patch” also stalls).
I am able to start the app, liveapp and webapp by running “cryosparcm start app” etc., but not the *_dev apps; they show “app_dev: ERROR (no such file)” etc. See the log output from these apps below.
We had problems with the DNS server recently and got an error about not being able to connect to get.cryosparc.com. That works now, but could this be some lingering network issue?
One indication of a network problem is that when I run with “export DEBUG=true” in cryosparc_master/config.sh, I get these error messages:
Attempt 1/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb8b218ea50>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 2/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb8b21267d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 3/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb8b215ee50>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed to GET http://a001:29445
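For reference, “Connection refused” (Errno 111) usually means the name a001 resolved and the host answered, but nothing was listening on that port, which is a different failure from a DNS lookup error. A minimal Python sketch for telling those cases apart, using only the standard library; the `probe` helper is a hypothetical name for illustration and is not part of cryoSPARC:

```python
import socket


def probe(host, port, timeout=3.0):
    """Describe whether host:port accepts a TCP connection."""
    try:
        # Name resolution happens here; a DNS problem raises gaierror.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns-failure: {exc}"

    family, socktype, proto, _, sockaddr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        return "open"
    except ConnectionRefusedError:
        # Host reachable, but no process is listening on the port.
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Host and ports taken from the error messages in this thread.
    for port in (29442, 29445):
        print(port, probe("a001", port))
```

A result of "refused" for 29445 alongside "open" for 29442 would suggest that the service expected on 29445 never started, rather than a name-resolution problem.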
cryosparcm log app
(node:20794) Warning: Accessing non-existent property 'count' of module exports inside circular dependency
(Use `node --trace-warnings ...` to show where the warning was created)
(node:20794) Warning: Accessing non-existent property 'findOne' of module exports inside circular dependency
(node:20794) Warning: Accessing non-existent property 'remove' of module exports inside circular dependency
(node:20794) Warning: Accessing non-existent property 'updateOne' of module exports inside circular dependency
cryosparcm log liveapp
Ready to serve GridFS files
cryoSPARC v2 Application Server Started
cryosparcm log webapp
cryoSPARC
(node:20610) DeprecationWarning: current Server Discovery and Monitoring engine is deprecated, and will be removed in a future version. To use the new Server Discover and Monitoring engine, pass option { useUnifiedTopology: true } to the MongoClient constructor.
Ready to serve GridFS
==== [projects] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [workspace] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [jobs] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [projects] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [workspace] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [jobs] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [projects] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [workspace] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [projects] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [workspace] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [jobs] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [projects] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [workspace] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [jobs] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [projects] project query user 616eacf9742a11a07af29c82 Leonardo false
set_user_viewed_project
["616eacf9742a11a07af29c82","P27"]
==== [projects] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [workspace] project query user 616eacf9742a11a07af29c82 Leonardo false
==== [projects] project query user 6043f5bb590be8eaa57e4932 larsson true
==== [workspace] project query user 6043f5bb590be8eaa57e4932 larsson true
==== [jobs] project query user 6043f5bb590be8eaa57e4932 larsson true
==== [projects] project query user 6043f5bb590be8eaa57e4932 larsson true
==== [projects] project query user 6043f5bb590be8eaa57e4932 larsson true
I can also add that I get additional error messages for port 29445. These go on “forever”. Here is a full transcript:
cryosparcuser@a001:~/cryosparc/cryosparc_master cryosparcm restart
CryoSPARC is not already running.
If you would like to restart, use cryosparcm restart
Starting cryoSPARC System master process..
CryoSPARC is not already running.
database: started
Database configuration is OK.
command_core: started
Attempt 1/3 to GET http://a001:29442 failed with exception: HTTPConnectionPool(host='a001', port=29442): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcc8c3edbd0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 2/3 to GET http://a001:29442 failed with exception: HTTPConnectionPool(host='a001', port=29442): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcc8c388890>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 3/3 to GET http://a001:29442 failed with exception: HTTPConnectionPool(host='a001', port=29442): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcc8c3bff90>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed to GET http://a001:29442
command_core connection succeeded
command_core startup successful
command_vis: started
command_rtp: started
Attempt 1/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb9e6059a90>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 2/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb9e5ff3850>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 3/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb9e602ae90>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed to GET http://a001:29445
Attempt 1/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff152dc5b90>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 2/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff152d5f910>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 3/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff152f12f10>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed to GET http://a001:29445
Attempt 1/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9cc57eeb10>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 2/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9cc5788890>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 3/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9cc57bfe90>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed to GET http://a001:29445
Attempt 1/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff54ea189d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 2/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff54e9b2750>: Failed to establish a new connection: [Errno 111] Connection refused'))
Retrying...
Attempt 3/3 to GET http://a001:29445 failed with exception: HTTPConnectionPool(host='a001', port=29445): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff54e9e9dd0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed to GET http://a001:29445
<snip>
@daniel.s.d.larsson With cryoSPARC “running” (to the extent it can run right now on your machine), does any process show up when you run this command on the cryoSPARC master: netstat -ap | grep 29445 ?
Not that I can see (I don’t have root on this machine).
cryosparcuser@a001:~ netstat -ap | grep 29445
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
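Since netstat cannot name other users' processes without root, one root-free cross-check on Linux is to read the kernel's socket table directly and see whether any socket is in the LISTEN state on the port. A minimal sketch, assuming a Linux host where /proc/net/tcp and /proc/net/tcp6 are readable; `listening_ports` is a hypothetical helper name, not a cryoSPARC tool:

```python
from pathlib import Path

LISTEN_STATE = "0A"  # hex code for TCP LISTEN in /proc/net/tcp


def listening_ports(proc_net_tcp_text):
    """Return the set of local ports in LISTEN state from /proc/net/tcp text."""
    ports = set()
    # The first line is a column header; each later line describes one socket.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == LISTEN_STATE:
            # local_address looks like "0100007F:7305"; the port is hex.
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports


if __name__ == "__main__":
    found = set()
    for table in ("/proc/net/tcp", "/proc/net/tcp6"):
        path = Path(table)
        if path.exists():
            found |= listening_ports(path.read_text())
    for port in (29442, 29445):  # ports from the transcript above
        print(port, "listening" if port in found else "not listening")
```

This only reports whether something is listening; it does not identify which process owns the socket.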