CryoSPARC instance information
- Type: cluster
- Software version
- $ uname -a && free -g
Linux cryosparc-prod 3.10.0-1160.119.1.el7.tuxcare.els13.x86_64 #1 SMP Fri Nov 22 06:29:45 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
              total        used        free      shared  buff/cache   available
Mem:             46           1          41           0           4          45
Swap:             1           0           1
### Issue
- Running ‘cryosparcm start’ fails with the error below. No other user processes were running, and we ran the ‘cryosparcm start’ and ‘cryosparcm stop’ commands from the account that CryoSPARC was installed with:
[lab_name@cryosparc-prod bin]$ ./cryosparcm start
Starting CryoSPARC System master process...
CryoSPARC is not already running.
configuring database...
Warning: Could not get database status (attempt 1/3)
Warning: Could not get database status (attempt 2/3)
Warning: Could not get database status (attempt 3/3)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/path/to/www/cryosparc/cryosparc2_master/cryosparc_compute/database_management.py", line 47, in configure_mongo
initialize_replica_set()
File "/path/to/www/cryosparc/cryosparc2_master/cryosparc_compute/database_management.py", line 84, in initialize_replica_set
admin_db = try_get_pymongo_db(mongo_client)
File "/path/to/www/cryosparc/cryosparc2_master/cryosparc_compute/database_management.py", line 251, in try_get_pymongo_db
admin_db.command(({'serverStatus': 1}))
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/_csot.py", line 108, in csot_wrapper
return func(self, *args, **kwargs)
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/database.py", line 893, in command
with self.__client._conn_for_reads(read_preference, session, operation=command_name) as (
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1375, in _conn_for_reads
server = self._select_server(read_preference, session, operation)
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1322, in _select_server
server = topology.select_server(
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 368, in select_server
server = self._select_server(
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 346, in _select_server
servers = self.select_servers(
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 253, in select_servers
server_descriptions = self._select_servers_loop(
File "/path/to/www/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 303, in _select_servers_loop
raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: localhost:8622: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 20.0s, Topology Description: <TopologyDescription id: 678a62c896279ba04a16fbf1, topology_type: Single, servers: [<ServerDescription ('localhost', 8622) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:8622: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>
[2025-01-17T09:02:55-0500] Error configuring database. Most recent database log lines:
2025-01-17T09:01:41.802-0500 I - [initandlisten] Detected data files in /path/to/db/cryosparc_db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2025-01-17T09:01:41.805-0500 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=23548M,cache_overflow=(file_max=0M),session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),compatibility=(release="3.0",require_max="3.0"),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2025-01-17T09:01:42.601-0500 E STORAGE [initandlisten] WiredTiger error (11) [1737122502:601419][120298:0x7f9294e84a40], wiredtiger_open: __posix_file_lock, 410: /path/to/db/cryosparc_db/WiredTiger.lock: handle-lock: fcntl: Resource temporarily unavailable Raw: [1737122502:601419][120298:0x7f9294e84a40], wiredtiger_open: __posix_file_lock, 410: /path/to/db/cryosparc_db/WiredTiger.lock: handle-lock: fcntl: Resource temporarily unavailable
2025-01-17T09:01:42.601-0500 E STORAGE [initandlisten] WiredTiger error (16) [1737122502:601478][120298:0x7f9294e84a40], wiredtiger_open: __conn_single, 1720: WiredTiger database is already being managed by another process: Device or resource busy Raw: [1737122502:601478][120298:0x7f9294e84a40], wiredtiger_open: __conn_single, 1720: WiredTiger database is already being managed by another process: Device or resource busy
2025-01-17T09:01:42.601-0500 E - [initandlisten] Assertion: 28595:16: Device or resource busy src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 488
2025-01-17T09:01:42.602-0500 I STORAGE [initandlisten] exception in initAndListen: Location28595: 16: Device or resource busy, terminating
2025-01-17T09:01:42.603-0500 I NETWORK [initandlisten] shutdown: going to close listening sockets...
2025-01-17T09:01:42.603-0500 I NETWORK [initandlisten] removing socket file: /tmp/mongodb-8622.sock
2025-01-17T09:01:42.603-0500 I CONTROL [initandlisten] now exiting
2025-01-17T09:01:42.603-0500 I CONTROL [initandlisten] shutting down with code:100
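The database log above shows WiredTiger failing to take its fcntl lock on WiredTiger.lock because another process is reported as managing the database. Below is a minimal sketch of how that could be checked independently of mongod. It assumes Python 3 on Linux and write access to the lock file. `lock_is_held` is a hypothetical helper name, not part of CryoSPARC or MongoDB. The probe briefly takes the lock itself when it is free, so it is best run while nothing is trying to start the database.

```python
import errno
import fcntl


def lock_is_held(lock_path):
    """Return True if another process holds a POSIX (fcntl) lock on lock_path."""
    # An exclusive fcntl lock needs a writable file descriptor, hence "r+b".
    with open(lock_path, "r+b") as fh:
        try:
            # Non-blocking attempt to lock the whole file exclusively.
            fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # EAGAIN / EACCES mean a conflicting lock exists elsewhere.
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                return True
            raise
        # The lock was free; release the probe lock immediately.
        fcntl.lockf(fh, fcntl.LOCK_UN)
        return False
```

Calling `lock_is_held("/path/to/db/cryosparc_db/WiredTiger.lock")` would return True while some process still holds the lock. If that happens while `ps` lists no mongod on this host, the holder may be on another node, which is only possible if the database directory is on shared storage (an assumption we have not confirmed here).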
- We are including the following commands and their output, since they were requested in similar (but not identical) threads:
grep -v LICENSE_ID /n/www/cryosparc-lab_name.cluster.school.edu/cryosparc2_master/config.sh
ps -eo user:12,pid,ppid,start,command | grep -e cryosparc_ -e mongo
ls -l /tmp/mongo*.sock /tmp/cryosparc*.sock /path/to/lab_name/db/cryosparc_db/WiredTiger.lock
export CRYOSPARC_MASTER_HOSTNAME="cryosparc-prod.cluster.school.edu"
export CRYOSPARC_DB_PATH="/path/to/lab_name/db/cryosparc_db"
export CRYOSPARC_BASE_PORT=8621
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false
export CRYOSPARC_CLICK_WRAP=true
export CRYOSPARC_FORCE_HOSTNAME=true
export CRYOSPARC_SSD_PATH=/n/scratch/users/l/lab_name
lab_name 3171 3168 09:26:42 grep -e cryosparc_ -e mongo
-rw-rw-r-- 1 lab_name lab_name 21 Mar 30 2021 /path/to/lab_name/db/cryosparc_db/WiredTiger.lock
srwx------ 1 lab2_name lab2_name 0 Jan 10 13:23 /tmp/cryosparc-supervisor-50c2a9c444f39d0cdea59524ee190f8c.sock
srwx------ 1 lab3_name lab3_name 0 Jan 9 17:16 /tmp/cryosparc-supervisor-cd41cca6e6fba8d21b1c763548378f8e.sock
srwx------ 1 lab3_name lab3_name 0 Jan 9 17:16 /tmp/mongodb-8602.sock
srwx------ 1 lab2_name lab2_name 0 Jan 10 13:24 /tmp/mongodb-8672.sock
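The `ps` output above lists no mongod process for our account, and there is no /tmp/mongodb-8622.sock, which matches the "Connection refused" on localhost:8622 in the traceback (our CRYOSPARC_BASE_PORT is 8621, and the traceback shows the database on 8622). Below is a small sketch for confirming whether anything is listening on that port, assuming Python 3. `port_is_open` is a hypothetical helper name, not a CryoSPARC function.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For our instance the call would be `port_is_open("localhost", 8622)`. A False result only says that nothing accepts connections on that port from this host. It says nothing about who holds the WiredTiger lock.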
2025-01-09 17:16:20,162 INFO supervisord started with pid 8856
2025-01-09 17:16:34,023 INFO spawned: 'database' with pid 10453
2025-01-09 17:16:35,087 INFO success: database entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-01-09 17:16:42,337 INFO spawned: 'command_core' with pid 11357
2025-01-09 17:16:48,300 INFO success: command_core entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2025-01-09 17:17:00,906 INFO spawned: 'command_vis' with pid 11865
2025-01-09 17:17:01,915 INFO success: command_vis entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-01-09 17:17:02,712 INFO spawned: 'command_rtp' with pid 12050
2025-01-09 17:17:03,714 INFO success: command_rtp entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-01-09 17:17:22,553 INFO spawned: 'app' with pid 12606
2025-01-09 17:17:23,567 INFO success: app entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-01-09 17:17:25,657 INFO spawned: 'app_api' with pid 12708
2025-01-09 17:17:26,654 INFO success: app_api entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-01-16 11:26:27,593 INFO RPC interface 'supervisor' initialized
2025-01-16 11:26:27,593 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2025-01-16 11:26:27,595 INFO daemonizing the supervisord process
2025-01-16 11:26:27,609 INFO supervisord started with pid 2419
2025-01-16 11:27:47,700 WARN received SIGTERM indicating exit request
2025-01-16 13:48:01,744 INFO RPC interface 'supervisor' initialized
2025-01-16 13:48:01,744 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2025-01-16 13:48:01,745 INFO daemonizing the supervisord process
2025-01-16 13:48:01,760 INFO supervisord started with pid 74475
2025-01-16 13:52:20,716 INFO RPC interface 'supervisor' initialized
2025-01-16 13:52:20,716 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2025-01-16 13:52:20,717 INFO daemonizing the supervisord process
2025-01-16 13:52:20,738 INFO supervisord started with pid 76000
2025-01-16 14:16:19,096 INFO RPC interface 'supervisor' initialized
2025-01-16 14:16:19,096 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2025-01-16 14:16:19,098 INFO daemonizing the supervisord process
2025-01-16 14:16:19,115 INFO supervisord started with pid 89894
2025-01-16 14:19:19,321 WARN received SIGTERM indicating exit request
2025-01-16 14:19:58,809 INFO RPC interface 'supervisor' initialized
2025-01-16 14:19:58,809 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2025-01-16 14:19:58,811 INFO daemonizing the supervisord process
2025-01-16 14:19:58,821 INFO supervisord started with pid 91193
2025-01-17 09:01:39,859 INFO RPC interface 'supervisor' initialized
2025-01-17 09:01:39,859 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2025-01-17 09:01:39,860 INFO daemonizing the supervisord process
2025-01-17 09:01:39,876 INFO supervisord started with pid 120269
2025-01-17 09:14:06,738 WARN received SIGTERM indicating exit request