Cannot install CryoSPARC (brand new install) - SEEMS to be something with the database

Billy · February 3, 2023, 1:51pm

I am having one devil of a time trying to get CryoSPARC installed as a “Single Workstation master/worker combined”. This is a brand new install of CryoSPARC on this system – which is running CentOS Linux release 7.9.2009. I started last week with CryoSPARC 4.1.1, hit the same problem, and have since downloaded 4.1.2.

Running the install.sh in the master,

[cryosparc_user@hostname cryosparc_master]$ ./install.sh --standalone --license $LICENSE_ID --worker_path /home/cryosparc_user/cryosparc/cryosparc_worker --cudapath /usr/local/cuda --ssdpath /scr/cryosparc_cache --initial_email "myemail@emory.edu" --initial_password "password" --initial_username "firstuser" --initial_firstname "firstname" --initial_lastname "lastname"

everything SEEMS to go along per usual – no error messages or warnings. I get to the question of whether to add bin directory to my ~/.bashrc, and after answering that, it attempts to start CryoSparc and this is where it fails, with problems involving the database.

 Starting cryoSPARC...

Starting cryoSPARC System master process..
CryoSPARC is not already running.
configuring database
    creating cryosparc_admin
    cryosparc_admin created
    creating cryosparc_user
    cryosparc_user created
    configuration complete
database: started
Warning: Could not get database status (attempt 1/3)
Warning: Could not get database status (attempt 2/3)
Warning: Could not get database status (attempt 3/3)
checkdb error - could not get replica set status; please reconfigure the database with `cryosparcm configuredb`
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/cryosparc_user/cryosparc/cryosparc_master/cryosparc_compute/database_management.py", line 268, in check_mongo
    admin_db = try_get_pymongo_admin_db(mongo_client)
  File "/home/cryosparc_user/cryosparc/cryosparc_master/cryosparc_compute/database_management.py", line 249, in try_get_pymongo_admin_db
    admin_db.command(({'serverStatus': 1}))
  File "/home/cryosparc_user/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/database.py", line 827, in command
    with self.__client._socket_for_reads(read_preference, session) as (sock_info, secondary_ok):
  File "/home/cryosparc_user/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/cryosparc_user/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1478, in _socket_for_reads
    server = self._select_server(read_preference, session)
  File "/home/cryosparc_user/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1436, in _select_server
    server = topology.select_server(server_selector)
  File "/home/cryosparc_user/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/topology.py", line 250, in select_server
    return random.choice(self.select_servers(selector, server_selection_timeout, address))
  File "/home/cryosparc_user/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/topology.py", line 211, in select_servers
    server_descriptions = self._select_servers_loop(selector, server_timeout, address)
  File "/home/cryosparc_user/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.8/site-packages/pymongo/topology.py", line 226, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: hostname.redacted.emory.edu:39001: [Errno 111] Connection refused, Timeout: 20.0s, Topology Description: <TopologyDescription id: 63dd0a5b34233fdbf2175abd, topology_type: Unknown, servers: [<ServerDescription ('hostname.redacted.emory.edu', 39001) server_type: Unknown, rtt: None, error=AutoReconnect('hostname.redacted.emory.edu:39001: [Errno 111] Connection refused')>]>
[2023-02-03T08:22:41-0500] Error checking database. Most recent database log lines:
2023-02-03T08:21:29.172-0500 I REPL     [replexec-0] Starting replication reporter thread
2023-02-03T08:21:29.173-0500 I REPL     [rsSync] transition to SECONDARY from RECOVERING
2023-02-03T08:21:29.173-0500 I REPL     [rsSync] conducting a dry run election to see if we could be elected. current term: 1
2023-02-03T08:21:29.173-0500 I REPL     [replexec-0] dry election run succeeded, running for election in term 2
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] election succeeded, assuming primary role in term 2
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] transition to PRIMARY from SECONDARY
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] Resetting sync source to empty, which was :27017
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] Entering primary catch-up mode.
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] Exited primary catch-up mode.
2023-02-03T08:21:31.177-0500 I REPL     [rsSync] transition to primary complete; database writes are now permitted
[cryosparc_user@hostname cryosparc_master]$

If I then stop CryoSPARC and run the ‘cryosparcm configuredb’ command, as the error message suggests, it seemingly completes, but when I then try to start CryoSPARC, it bombs out with that same “checkdb error - could not get replica set status…” message. PLUS, at this point, I’m not sure what all didn’t get completed during the install process… So why is the install process bombing out with database problems on a completely new install?

[cryosparc_user@hostname cryosparc_master]$ cryosparcm stop
CryoSPARC is running.
Stopping cryoSPARC
database: stopped
Shut down
[cryosparc_user@hostname cryosparc_master]$ cryosparcm configuredb
configuring database
    configuration complete
[cryosparc_user@hostname cryosparc_master]$

If I look at the database log, it’s not much help (at least to me):

[cryosparc_user@hostname cryosparc_master]$ cryosparcm log database
/cryosparc_database/diagnostic.data'
2023-02-03T08:21:29.150-0500 I REPL     [initandlisten] Rollback ID is 1
2023-02-03T08:21:29.151-0500 I STORAGE  [initandlisten] createCollection: local.replset.oplogTruncateAfterPoint with generated UUID: 29348cb2-abf7-43e1-8918-3f21422cce42
2023-02-03T08:21:29.168-0500 I REPL     [initandlisten] No oplog entries to apply for recovery. appliedThrough and checkpointTimestamp are both null.
2023-02-03T08:21:29.168-0500 I CONTROL  [LogicalSessionCacheRefresh] Sessions collection is not set up; waiting until next sessions refresh interval: Replication has not yet been configured
2023-02-03T08:21:29.168-0500 I NETWORK  [initandlisten] listening via socket bound to 0.0.0.0
2023-02-03T08:21:29.168-0500 I NETWORK  [initandlisten] listening via socket bound to /tmp/mongodb-39001.sock
2023-02-03T08:21:29.168-0500 I NETWORK  [initandlisten] waiting for connections on port 39001
2023-02-03T08:21:29.169-0500 I CONTROL  [LogicalSessionCacheReap] Sessions collection is not set up; waiting until next sessions reap interval: config.system.sessions does not exist
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0]
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] ** WARNING: This replica set node is running without journaling enabled but the
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] **          writeConcernMajorityJournalDefault option to the replica set config
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] **          is set to true. The writeConcernMajorityJournalDefault
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] **          option to the replica set config must be set to false
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] **          or w:majority write concerns will never complete.
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] **          In addition, this node's memory consumption may increase until all
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] **          available free RAM is exhausted.
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0]
2023-02-03T08:21:29.170-0500 I REPL     [replexec-0] New replica set config in use: { _id: "meteor", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "localhost:39001", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: -1, catchUpTakeoverDelayMillis: 30000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('63dd0a55bac92bcf2a44bc85') } }
2023-02-03T08:21:29.171-0500 I REPL     [replexec-0] This node is localhost:39001 in the config
2023-02-03T08:21:29.171-0500 I REPL     [replexec-0] transition to STARTUP2 from STARTUP
2023-02-03T08:21:29.171-0500 I REPL     [replexec-0] Starting replication storage threads
2023-02-03T08:21:29.172-0500 I REPL     [replexec-0] transition to RECOVERING from STARTUP2
2023-02-03T08:21:29.172-0500 I REPL     [replexec-0] Starting replication fetcher thread
2023-02-03T08:21:29.172-0500 I REPL     [replexec-0] Starting replication applier thread
2023-02-03T08:21:29.172-0500 I REPL     [replexec-0] Starting replication reporter thread
2023-02-03T08:21:29.173-0500 I REPL     [rsSync] transition to SECONDARY from RECOVERING
2023-02-03T08:21:29.173-0500 I REPL     [rsSync] conducting a dry run election to see if we could be elected. current term: 1
2023-02-03T08:21:29.173-0500 I REPL     [replexec-0] dry election run succeeded, running for election in term 2
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] election succeeded, assuming primary role in term 2
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] transition to PRIMARY from SECONDARY
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] Resetting sync source to empty, which was :27017
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] Entering primary catch-up mode.
2023-02-03T08:21:29.185-0500 I REPL     [replexec-1] Exited primary catch-up mode.
2023-02-03T08:21:31.177-0500 I REPL     [rsSync] transition to primary complete; database writes are now permitted
2023-02-03T08:26:29.168-0500 I STORAGE  [LogicalSessionCacheRefresh] createCollection: config.system.sessions with generated UUID: 44d1962a-1095-440d-a890-53f5cdc39f0b
2023-02-03T08:26:29.192-0500 I INDEX    [LogicalSessionCacheRefresh] build index on: config.system.sessions properties: { v: 2, key: { lastUse: 1 }, name: "lsidTTLIndex", ns: "config.system.sessions", expireAfterSeconds: 1800 }
2023-02-03T08:26:29.193-0500 I INDEX    [LogicalSessionCacheRefresh]     building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2023-02-03T08:26:29.194-0500 I INDEX    [LogicalSessionCacheRefresh] build index done.  scanned 0 total records. 0 secs
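
Two different host names show up in the output above: the pymongo traceback is trying to reach hostname.redacted.emory.edu:39001, while the replica set config in the database log lists its member as localhost:39001. Whether that difference matters here is not clear from the logs alone, but it may be worth comparing the two. A minimal sketch for pulling both out of saved output; the function names and regular expressions are made up for illustration and only match the text formats seen in this post.

```python
import re

# Matches pymongo's "ServerDescription ('host', port)" text in the error message.
SERVER_DESC = re.compile(r"ServerDescription \('([^']+)', (\d+)\)")
# Matches 'host: "name:port"' entries in the replica set config log line.
MEMBER_HOST = re.compile(r'host: "([^":]+):(\d+)"')


def client_targets(error_text):
    """Return the (host, port) pairs the client was trying to reach."""
    return {(host, int(port)) for host, port in SERVER_DESC.findall(error_text)}


def replica_members(log_text):
    """Return the (host, port) pairs listed as replica set members."""
    return {(host, int(port)) for host, port in MEMBER_HOST.findall(log_text)}
```

Feeding the traceback text to `client_targets` and the output of `cryosparcm log database` to `replica_members` gives two sets that can be compared directly.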

Billy · February 9, 2023, 3:47pm

Anyone???

In trying to troubleshoot a little further, based on this error, about port 39001 and “connection refused”:

pymongo.errors.ServerSelectionTimeoutError: hostname.redacted.emory.edu:39001: [Errno 111] Connection refused, Timeout: 20.0s, Topology Description: <TopologyDescription id: 63e51047ffd3bd03761c2963, topology_type: Unknown, servers: [<ServerDescription ('hostname.redacted.emory.edu', 39001) server_type: Unknown, rtt: None, error=AutoReconnect('hostname.redacted.emory.edu:39001: [Errno 111] Connection refused')>]>

If I point a web browser at hostname.redacted.emory.edu:39001 I get a message “It looks like you are trying to access MongoDB over HTTP on the native driver port.” So SOMETHING is listening there, and I’m not sure why I’m getting this “connection refused” error during the install process (or after the install process fails and I manually attempt to start CryoSPARC). I have also tried temporarily disabling the firewall, but that doesn’t help.
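
One way to narrow this down might be to test the TCP connection from the master node itself, separately for every address a name resolves to, since a browser on another machine and the CryoSPARC process on the master may not end up at the same address. A minimal sketch using only the Python standard library; `probe` is a made-up helper name, and 39001 is simply the database port from the error above.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try a TCP connection to every address that `host` resolves to.

    Returns a list of (address, error) tuples; error is None on success.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return [(host, f"name resolution failed: {exc}")]

    results = []
    seen = set()
    for family, socktype, proto, _canonname, sockaddr in infos:
        address = sockaddr[0]
        if address in seen:
            continue  # skip duplicate entries for the same address
        seen.add(address)
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results.append((address, None))
        except OSError as exc:
            # Covers "Connection refused" as well as timeouts.
            results.append((address, str(exc)))
    return results
```

Running `probe("localhost", 39001)` and `probe("hostname.redacted.emory.edu", 39001)` on the master while the database is up would show which resolved addresses accept the connection and which refuse it.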

When I look at CryoSPARC status, it is giving me an error about not being able to get the license status. But I don’t think it’s a problem with the license, I’m guessing that it all stems from this issue of the database…

[cryosparc_user@hostname cryosparc]$ cryosparcm status
----------------------------------------------------------------------------
CryoSPARC System master node installed at
/home/cryosparc_user/cryosparc/cryosparc_master
Current cryoSPARC version: v4.1.2
----------------------------------------------------------------------------

CryoSPARC process status:

app                              STOPPED   Not started
app_api                          STOPPED   Not started
app_api_dev                      STOPPED   Not started
app_legacy                       STOPPED   Not started
app_legacy_dev                   STOPPED   Not started
command_core                     STOPPED   Not started
command_rtp                      STOPPED   Not started
command_vis                      STOPPED   Not started
database                         RUNNING   pid 114152, uptime 0:02:41

----------------------------------------------------------------------------
*** CommandClient: (http://hostname.redacted.emory.edu:39002/api) URL Error [Errno 111] Connection refused
An error ocurred while checking license status
Could not get license verification status. Are all CryoSPARC processes RUNNING?

Any suggestions are MUCHLY appreciated, as I’m stuck as to where to go from here…

wtempel · February 9, 2023, 7:16pm

What is the output when you run
curl localhost:39001
on the master node?

Billy · February 10, 2023, 12:39pm

Same as what I get when I point a browser at it:

[cryosparc_user@hostname cryosparc]$ curl localhost:39001
It looks like you are trying to access MongoDB over HTTP on the native driver port.

And if I stop CryoSPARC (which isn’t REALLY quite running, even though it says it is when I tell it to stop):

[cryosparc_user@hostname cryosparc]$ cryosparcm stop
CryoSPARC is running.
Stopping cryoSPARC
database: stopped
Shut down
[cryosparc_user@hostname cryosparc]$ curl localhost:39001
curl: (7) Failed connect to localhost:39001; Connection refused