Hello,
Our CryoSPARC instance crashed because the storage folder filled up. We have had this issue before, and it resolved itself once the storage allocation was increased. This time, however, I am stuck on “Warning: Could not get database status (attempt 3/3)”.
FYI: there is another group using CryoSPARC on this cluster (GRUBER); our project is pnavarr1.
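For reference, this is roughly how I am checking the quota situation now (the path below is our database location; the quota tooling may of course differ on other clusters):

df -h /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database
du -sh /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database
# generic per-user quota check, where applicable on the filesystem:
quota -s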
CryoSPARC instance information
Type : Cluster
(base) [agregor@curnagl cryosparc_master]$ cryosparcm status
CryoSPARC System master node installed at
/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master
Current cryoSPARC version: v4.6.2
CryoSPARC process status:
app STOPPED Not started
app_api STOPPED Not started
app_api_dev STOPPED Not started
command_core STOPPED Not started
command_rtp STOPPED Not started
command_vis STOPPED Not started
database STOPPED Not started
License is valid
global config variables:
export CRYOSPARC_LICENSE_ID="XXXXXXXXXXX"
export CRYOSPARC_MASTER_HOSTNAME="curnagl"
export CRYOSPARC_DB_PATH="/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database"
export CRYOSPARC_BASE_PORT=45031
export CRYOSPARC_DB_CONNECTION_TIMEOUT_MS=20000
export CRYOSPARC_INSECURE=false
export CRYOSPARC_DB_ENABLE_AUTH=true
export CRYOSPARC_CLUSTER_JOB_MONITOR_INTERVAL=10
export CRYOSPARC_CLUSTER_JOB_MONITOR_MAX_RETRIES=1000000
export CRYOSPARC_PROJECT_DIR_PREFIX='CS-'
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_CLICK_WRAP=true
(base) [agregor@curnagl cryosparc_master]$ uname -a && free -g
Linux curnagl 5.14.0-427.37.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 13 12:41:50 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
total used free shared buff/cache available
Mem: 503 153 45 9 316 349
Swap: 0 0 0
CryoSPARC worker environment
(base) [agregor@curnagl ~] eval $(/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/bin/cryosparcw env)
env | grep PATH
/sbin/ldconfig -p | grep -i cuda
uname -a
free -g
nvidia-smi
(the commands below apply only to CryoSPARC versions older than v4.4)
which nvcc
nvcc --version
python -c "import pycuda.driver; print(pycuda.driver.get_version())"
STACK_20240704_MODULEPATH=/dcsrsoft/spack//20240704/spack/opt/modules/Core:/dcsrsoft/spack//20240704/spack/opt/modules/gcc/11.4.0
STACK_20241118_MODULEPATH=/dcsrsoft/spack//20241118/spack/opt/modules/Core:/dcsrsoft/spack//20241118/spack/opt/modules/gcc/12.3.0:/dcsrsoft/spack//20241118/spack/opt/modules/gcc/9.5.0
CRYOSPARC_PATH=/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/bin
__LMOD_REF_COUNT_MODULEPATH=/dcsrsoft/spack/20241118/spack/opt/modules/Core:1;/dcsrsoft/spack/20241118/spack/opt/modules/gcc/12.3.0:1;/dcsrsoft/spack/20241118/spack/opt/modules/gcc/9.5.0:1
MANPATH=/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j/share/man::
__LMOD_REF_COUNT_CMAKE_PREFIX_PATH=/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j:1
__LMOD_REF_COUNT_PATH=/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/cryptsetup-2.3.5-mge72p7wl35jtj3ejpgryy6xa6ujtmmt/sbin:1;/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j/bin:1;/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/bin:1;/work/FAC/FBM/DMF/pnavarr1/default/tools/miniconda/bin:1;/work/FAC/FBM/DMF/pnavarr1/default/tools/miniconda/condabin:1;/users/agregor/.local/bin:1;/users/agregor/bin:1;/usr/lpp/mmfs/bin:1;/usr/local/bin:1;/usr/bin:1;/usr/local/sbin:1;/usr/sbin:1;/dcsrsoft/bin:1
__LMOD_REF_COUNT_GOPATH=/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j:1
CMAKE_PREFIX_PATH=/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j
PYTHONPATH=/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker
NUMBA_CUDA_INCLUDE_PATH=/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/include
STACK_20240303_MODULEPATH=/dcsrsoft/spack//20240303/spack/opt/modules/Core:/dcsrsoft/spack//20240303/spack/opt/modules/gcc/11.4.0
__LMOD_REF_COUNT_MANPATH=/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j/share/man:1;:1
LD_LIBRARY_PATH=/work/FAC/FBM/DMF/pnavarr1/default/tools/cuda/usr/local/cuda-12.6/lib64:
PATH=/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/bin:/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/deps/anaconda/condabin:/work/FAC/FBM/DMF/pnavarr1/default/tools/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/CTFfind5/cisTEM:/work/FAC/FBM/DMF/pnavarr1/default/tools/cryocare/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/cuda/usr/local/cuda-12.6/bin:/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/cryptsetup-2.3.5-mge72p7wl35jtj3ejpgryy6xa6ujtmmt/sbin:/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j/bin:/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/miniconda/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/miniconda/condabin:/users/agregor/.local/bin:/users/agregor/bin:/usr/lpp/mmfs/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/dcsrsoft/bin
MODULEPATH=/dcsrsoft/spack/20241118/spack/opt/modules/Core:/dcsrsoft/spack/20241118/spack/opt/modules/gcc/12.3.0:/dcsrsoft/spack/20241118/spack/opt/modules/gcc/9.5.0
GOPATH=/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j
libicudata.so.67 (libc6,x86-64) => /lib64/libicudata.so.67
libcuda_wrapper.so.0 (libc6,x86-64) => /lib64/libcuda_wrapper.so.0
libcuda_wrapper.so (libc6,x86-64) => /lib64/libcuda_wrapper.so
Linux curnagl 5.14.0-427.37.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 13 12:41:50 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
total used free shared buff/cache available
Mem: 503 153 45 9 316 349
Swap: 0 0 0
-bash: nvidia-smi: command not found
/usr/bin/which: no nvcc in (/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/bin:/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_worker/deps/anaconda/condabin:/work/FAC/FBM/DMF/pnavarr1/default/tools/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/CTFfind5/cisTEM:/work/FAC/FBM/DMF/pnavarr1/default/tools/cryocare/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/cuda/usr/local/cuda-12.6/bin:/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/cryptsetup-2.3.5-mge72p7wl35jtj3ejpgryy6xa6ujtmmt/sbin:/dcsrsoft/spack/20241118/spack/opt/spack/linux-rhel9-zen2/gcc-12.3.0/singularityce-4.1.0-mt3k5udjdeyhxtvkci4sgwuialkaln2j/bin:/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/miniconda/bin:/work/FAC/FBM/DMF/pnavarr1/default/tools/miniconda/condabin:/users/agregor/.local/bin:/users/agregor/bin:/usr/lpp/mmfs/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/dcsrsoft/bin)
-bash: nvcc: command not found
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'pycuda'
In /run/database.log I see:
2025-03-14T10:34:08.469+0100 I CONTROL [initandlisten] MongoDB starting : pid=1476778 port=45032 dbpath=/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database 64-bit host=curnagl
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] db version v3.6.23
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] git version: d352e6a4764659e0d0350ce77279de3c1f243e5c
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] allocator: tcmalloc
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] modules: none
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] build environment:
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] distarch: x86_64
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] target_arch: x86_64
2025-03-14T10:34:08.482+0100 I CONTROL [initandlisten] options: { net: { port: 45032 }, replication: { oplogSizeMB: 64, replSet: "meteor" }, storage: { dbPath: "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database" } }
2025-03-14T10:34:08.498+0100 W - [initandlisten] Detected unclean shutdown - /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database/mongod.lock is not empty.
2025-03-14T10:34:08.498+0100 E STORAGE [initandlisten] Failed to set up listener: SocketException: Address already in use
2025-03-14T10:34:08.499+0100 I CONTROL [initandlisten] now exiting
2025-03-14T10:34:08.499+0100 I CONTROL [initandlisten] shutting down with code:48
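To find out what is still holding the database port after this unclean shutdown, I assume a check along these lines is the right approach (using ss, or lsof where available on the login node):

ss -tlnp | grep 45032            # anything still listening on the database port?
# or, if lsof is installed:
lsof -i :45032
# leftover cryosparc/mongod processes under my user:
ps ax -U $(whoami) | grep -e mongod -e supervisord | grep -v grep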
I moved “mongod.lock” to another folder. Now, after running cryosparcm start, I get:
Starting CryoSPARC System master process...
CryoSPARC is not already running.
configuring database...
Warning: Could not get database status (attempt 1/3)
Warning: Could not get database status (attempt 2/3)
Warning: Could not get database status (attempt 3/3)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/cryosparc_compute/database_management.py", line 47, in configure_mongo
initialize_replica_set()
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/cryosparc_compute/database_management.py", line 84, in initialize_replica_set
admin_db = try_get_pymongo_db(mongo_client)
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/cryosparc_compute/database_management.py", line 251, in try_get_pymongo_db
admin_db.command(({'serverStatus': 1}))
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/_csot.py", line 108, in csot_wrapper
return func(self, *args, **kwargs)
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/database.py", line 893, in command
with self.__client._conn_for_reads(read_preference, session, operation=command_name) as (
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1375, in _conn_for_reads
server = self._select_server(read_preference, session, operation)
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1322, in _select_server
server = topology.select_server(
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 368, in select_server
server = self._select_server(
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 346, in _select_server
servers = self.select_servers(
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 253, in select_servers
server_descriptions = self._select_servers_loop(
File "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/pymongo/topology.py", line 303, in _select_servers_loop
raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: localhost:45032: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 20.0s, Topology Description: <TopologyDescription id: 67d40426802ff195f5b95076, topology_type: Single, servers: [<ServerDescription ('localhost', 45032) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:45032: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>
[2025-03-14T11:26:53+01:00] Error configuring database. Most recent database log lines:
2025-03-14T11:25:40.320+0100 I CONTROL [initandlisten] git version: d352e6a4764659e0d0350ce77279de3c1f243e5c
2025-03-14T11:25:40.320+0100 I CONTROL [initandlisten] allocator: tcmalloc
2025-03-14T11:25:40.320+0100 I CONTROL [initandlisten] modules: none
2025-03-14T11:25:40.320+0100 I CONTROL [initandlisten] build environment:
2025-03-14T11:25:40.320+0100 I CONTROL [initandlisten] distarch: x86_64
2025-03-14T11:25:40.320+0100 I CONTROL [initandlisten] target_arch: x86_64
2025-03-14T11:25:40.320+0100 I CONTROL [initandlisten] options: { net: { port: 45032 }, replication: { oplogSizeMB: 64, replSet: "meteor" }, storage: { dbPath: "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database" } }
2025-03-14T11:25:40.341+0100 E STORAGE [initandlisten] Failed to set up listener: SocketException: Address already in use
2025-03-14T11:25:40.341+0100 I CONTROL [initandlisten] now exiting
2025-03-14T11:25:40.341+0100 I CONTROL [initandlisten] shutting down with code:48
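Since curl to localhost:45032 is refused (output further down) but mongod still reports “Address already in use”, I wonder whether a stale MongoDB unix socket could be the cause. Assuming the default socket location of /tmp/mongodb-&lt;port&gt;.sock, I would check:

ls -l /tmp/mongodb-45032.sock
# if the socket file exists but no running mongod owns it, remove it and retry:
rm /tmp/mongodb-45032.sock
cryosparcm start

Is that a reasonable thing to try, or could it interfere with the other group’s instance? (They run on port 45002, so I believe not.)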
From /run/database.log, at the time the storage issue first appeared:
2025-03-13T11:59:46.377+0100 I NETWORK [conn1734] received client metadata from 10.203.101.85:41108 conn1734: { driver: { name: "PyMongo", version: "4.8.0" }, os: { type: "Linux", name: "Linux", architecture: "x86_64", version: "5.14.0-427.37.1.el9_4.x86_64" }, platform: "CPython 3.10.14.final.0" }
2025-03-13T11:59:46.377+0100 I NETWORK [conn1735] received client metadata from 10.203.101.85:41122 conn1735: { driver: { name: "PyMongo", version: "4.8.0" }, os: { type: "Linux", name: "Linux", architecture: "x86_64", version: "5.14.0-427.37.1.el9_4.x86_64" }, platform: "CPython 3.10.14.final.0" }
2025-03-13T11:59:46.382+0100 I ACCESS [conn1735] Successfully authenticated as principal cryosparc_user on admin from client 10.203.101.85:41122
2025-03-13T11:59:46.382+0100 I ACCESS [conn1734] Successfully authenticated as principal cryosparc_user on admin from client 10.203.101.85:41108
2025-03-13T12:00:00.334+0100 I STORAGE [WT RecordStoreThread: local.oplog.rs] WiredTiger record store oplog truncation finished in: 3ms
2025-03-13T12:02:27.724+0100 I STORAGE [WT RecordStoreThread: local.oplog.rs] WiredTiger record store oplog truncation finished in: 2ms
2025-03-13T12:05:13.765+0100 I STORAGE [WT RecordStoreThread: local.oplog.rs] WiredTiger record store oplog truncation finished in: 1ms
2025-03-13T12:08:34.152+0100 I STORAGE [WT RecordStoreThread: local.oplog.rs] WiredTiger record store oplog truncation finished in: 2ms
2025-03-13T12:10:11.816+0100 E STORAGE [WTCheckpointThread] WiredTiger error (122) [1741864211:685065][2569531:0x7fcb59dbd640], file:collection-115-2422639577907128585.wt, WT_SESSION.checkpoint: __posix_file_write, 579: /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database/collection-115-2422639577907128585.wt: handle-write: pwrite: failed to write 278528 bytes at offset 4567437312: Disk quota exceeded Raw: [1741864211:685065][2569531:0x7fcb59dbd640], file:collection-115-2422639577907128585.wt, WT_SESSION.checkpoint: __posix_file_write, 579: /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database/collection-115-2422639577907128585.wt: handle-write: pwrite: failed to write 278528 bytes at offset 4567437312: Disk quota exceeded
2025-03-13T12:10:11.848+0100 E STORAGE [WTCheckpointThread] WiredTiger error (22) [1741864211:848416][2569531:0x7fcb59dbd640], file:index-116-2422639577907128585.wt, WT_SESSION.checkpoint: __wt_block_checkpoint_resolve, 859: index-116-2422639577907128585.wt: the checkpoint failed, the system must restart: Invalid argument Raw: [1741864211:848416][2569531:0x7fcb59dbd640], file:index-116-2422639577907128585.wt, WT_SESSION.checkpoint: __wt_block_checkpoint_resolve, 859: index-116-2422639577907128585.wt: the checkpoint failed, the system must restart: Invalid argument
2025-03-13T12:10:11.848+0100 E STORAGE [WTCheckpointThread] WiredTiger error (-31804) [1741864211:848455][2569531:0x7fcb59dbd640], file:index-116-2422639577907128585.wt, WT_SESSION.checkpoint: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic Raw: [1741864211:848455][2569531:0x7fcb59dbd640], file:index-116-2422639577907128585.wt, WT_SESSION.checkpoint: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic
2025-03-13T12:10:11.848+0100 F - [WTCheckpointThread] Fatal Assertion 50853 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 420
2025-03-13T12:10:11.848+0100 F - [WTCheckpointThread] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:11.875+0100 F - [WTJournalFlusher] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:11.875+0100 F - [WTJournalFlusher] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:11.948+0100 F - [conn73] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:11.948+0100 F - [conn73] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:11.989+0100 F - [conn1729] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:11.989+0100 F - [conn1729] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:11.998+0100 F - [conn7] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:11.998+0100 F - [conn7] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:12.000+0100 F - [ftdc] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:12.000+0100 F - [ftdc] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:12.054+0100 F - [conn1708] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:12.054+0100 F - [conn1708] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:12.166+0100 F - [conn1683] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:12.166+0100 F - [conn1683] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:12.166+0100 F - [conn1686] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 74
2025-03-13T12:10:12.166+0100 F - [conn1686] \n\n***aborting after fassert() failure\n\n
2025-03-13T12:10:12.292+0100 F - [WTCheckpointThread] Got signal: 6 (Aborted).
0x55a4f763ef21 0x55a4f763e139 0x55a4f763e61d 0x7fcb5fe126f0 0x7fcb5fe5f94c 0x7fcb5fe12646 0x7fcb5fdfc7f3 0x55a4f5d22dec 0x55a4f5dfdd76 0x55a4f5e6fad1 0x55a4f5cbfa94 0x55a4f5cbfeb4 0x55a4f5f42695 0x55a4f5e31eb2 0x55a4f5e82b0e 0x55a4f5e83953 0x55a4f5e68f8a 0x55a4f5de0193 0x55a4f75287c0 0x55a4f774fc10 0x7fcb5fe5dc02 0x7fcb5fee2c40
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"55A4F5399000","o":"22A5F21","s":"_ZN5mongo15printStackTraceERSo"},{"b":"55A4F5399000","o":"22A5139"},{"b":"55A4F5399000","o":"22A561D"},{"b":"7FCB5FDD4000","o":"3E6F0"},{"b":"7FCB5FDD4000","o":"8B94C"},{"b":"7FCB5FDD4000","o":"3E646","s":"raise"},{"b":"7FCB5FDD4000","o":"287F3","s":"abort"},{"b":"55A4F5399000","o":"989DEC","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"55A4F5399000","o":"A64D76"},{"b":"55A4F5399000","o":"AD6AD1"},{"b":"55A4F5399000","o":"926A94","s":"__wt_err_func"},{"b":"55A4F5399000","o":"926EB4","s":"__wt_panic"},{"b":"55A4F5399000","o":"BA9695","s":"__wt_block_checkpoint_resolve"},{"b":"55A4F5399000","o":"A98EB2","s":"__wt_meta_track_off"},{"b":"55A4F5399000","o":"AE9B0E"},{"b":"55A4F5399000","o":"AEA953","s":"__wt_txn_checkpoint"},{"b":"55A4F5399000","o":"ACFF8A"},{"b":"55A4F5399000","o":"A47193","s":"_ZN5mongo18WiredTigerKVEngine26WiredTigerCheckpointThread3runEv"},{"b":"55A4F5399000","o":"218F7C0","s":"_ZN5mongo13BackgroundJob7jobBodyEv"},{"b":"55A4F5399000","o":"23B6C10"},{"b":"7FCB5FDD4000","o":"89C02"},{"b":"7FCB5FDD4000","o":"10EC40"}],"processInfo":{ "mongodbVersion" : "3.6.23", "gitVersion" : "d352e6a4764659e0d0350ce77279de3c1f243e5c", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "5.14.0-427.37.1.el9_4.x86_64", "version" : "#1 SMP PREEMPT_DYNAMIC Fri Sep 13 12:41:50 EDT 2024", "machine" : "x86_64" }, "somap" : [ { "b" : "55A4F5399000", "elfType" : 3, "buildId" : "B0818C001F2B63D4533D208F68F08AE2A599CA9E" }, { "b" : "7FFE363FC000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "B78F3F86198BFC7FBE33898DEDE69799CBE8530D" }, { "b" : "7FCB60101000", "path" : "/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/libpython3.10.so", "elfType" : 3 }, { "b" : "7FCB600E4000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "8E61C4327C4D5757C08D7FA962EFD71683EBBFDC" }, { "b" : "7FCB600DF000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "1A11E03063E9160803AB6A87CECE2AD25346F20F" }, { "b" : "7FCB600DA000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "CE96631C1B7EA31412DA4D6E3C735BBCEE781C9D" }, { "b" : "7FCB5FFFF000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "BCE2C9260F603AB2C12D8EE28632B13D43C8AE61" }, { "b" : "7FCB5FFE2000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "EF4C928F1372AD155FEA761F0E840ECD264FB153" }, { "b" : "7FCB5FFDD000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "9A19B839A71005671BC715C96CB4FB040B4649E9" }, { "b" : "7FCB5FDD4000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8C3B90B6DFAC32E7E7DA24C75B450EF3BE7D48DA" }, { "b" : "7FCB604A0000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "A42D94EABFE701AC16B767E5971B6D08FCB01DF8" }, { "b" : "7FCB5FDCF000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "C231C69EC0248CD17D9B3A1E2883A0386DDA53FB" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x55a4f763ef21]
mongod(+0x22A5139) [0x55a4f763e139]
mongod(+0x22A561D) [0x55a4f763e61d]
libc.so.6(+0x3E6F0) [0x7fcb5fe126f0]
libc.so.6(+0x8B94C) [0x7fcb5fe5f94c]
libc.so.6(raise+0x16) [0x7fcb5fe12646]
libc.so.6(abort+0xD3) [0x7fcb5fdfc7f3]
mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x55a4f5d22dec]
mongod(+0xA64D76) [0x55a4f5dfdd76]
mongod(+0xAD6AD1) [0x55a4f5e6fad1]
mongod(__wt_err_func+0x90) [0x55a4f5cbfa94]
mongod(__wt_panic+0x3F) [0x55a4f5cbfeb4]
mongod(__wt_block_checkpoint_resolve+0x145) [0x55a4f5f42695]
mongod(__wt_meta_track_off+0x312) [0x55a4f5e31eb2]
mongod(+0xAE9B0E) [0x55a4f5e82b0e]
mongod(__wt_txn_checkpoint+0x1C3) [0x55a4f5e83953]
mongod(+0xACFF8A) [0x55a4f5e68f8a]
mongod(_ZN5mongo18WiredTigerKVEngine26WiredTigerCheckpointThread3runEv+0x243) [0x55a4f5de0193]
mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x160) [0x55a4f75287c0]
mongod(+0x23B6C10) [0x55a4f774fc10]
libc.so.6(+0x89C02) [0x7fcb5fe5dc02]
libc.so.6(+0x10EC40) [0x7fcb5fee2c40]
----- END BACKTRACE -----
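Given the unclean WiredTiger shutdown above, I am considering a MongoDB repair along these lines, but would appreciate confirmation before touching the database (this is only my own sketch: cryosparcm env should give me the bundled mongod, and --repair is a standard mongod option):

cryosparcm stop
eval $(/work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/bin/cryosparcm env)
mongod --dbpath /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/database --repair

I would of course back up the database directory first and make sure no mongod or supervisord from our install is still running.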
The output of “ps ax -U $(whoami) | grep mongod” (note: cryoem_gruber is another group’s instance on the cluster, not my CryoSPARC install; we are pnavarr1):
(base) [agregor@curnagl cryosparc_master] ps ax -U $(whoami) | grep mongod
478581 ? Sl 2:20 mongod --auth --dbpath /work/FAC/FBM/DMF/sgruber1/cryoem_gruber/cryosparc/database --port 45002 --oplogSize 64 --replSet meteor --wiredTigerCacheSizeGB 4 --bind_ip_all
1871659 pts/307 S+ 0:00 grep --color=auto mongod
Other debugging outputs:
(base) [agregor@curnagl cryosparc_master] whoami
agregor
(base) [agregor@curnagl cryosparc_master] stat bin/cryosparcm
File: bin/cryosparcm
Size: 76852 Blocks: 160 IO Block: 4194304 regular file
Device: 31h/49d Inode: 212937732 Links: 1
Access: (0755/-rwxr-xr-x) Uid: (225181/ agregor) Gid: (183921/pi_pnavarr1_101419-pr-g)
Access: 2025-03-14 10:29:46.130245254 +0100
Modify: 2024-11-18 16:19:01.000000000 +0100
Change: 2025-02-04 15:41:32.642020992 +0100
Birth: -
(base) [agregor@curnagl cryosparc_master] hostname
curnagl
(base) [agregor@curnagl cryosparc_master] grep curnagl config.sh
export CRYOSPARC_MASTER_HOSTNAME="curnagl"
(base) [agregor@curnagl cryosparc_master] ps xww | grep -e cryosparc -e mongo
660108 ? Sl 17:41 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/gunicorn cryosparc_command.command_vis:app -n command_vis -b 0.0.0.0:45034 -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/gunicorn.conf.py
1851839 ? Ss 0:00 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/supervisord.conf
1938537 pts/307 S+ 0:00 grep --color=auto -e cryosparc -e mongo
2569007 ? Ss 5:00 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/supervisord.conf
2569835 ? S 2:23 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/gunicorn -n command_core -b 0.0.0.0:45033 cryosparc_command.command_core:start() -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/gunicorn.conf.py
2569985 ? Sl 71:03 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/gunicorn -n command_core -b 0.0.0.0:45033 cryosparc_command.command_core:start() -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/gunicorn.conf.py
2570302 ? S 2:21 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/gunicorn cryosparc_command.command_vis:app -n command_vis -b 0.0.0.0:45034 -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/gunicorn.conf.py
2570344 ? S 2:10 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/gunicorn cryosparc_command.command_rtp:start() -n command_rtp -b 0.0.0.0:45036 -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/gunicorn.conf.py
2570355 ? Sl 68:05 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/gunicorn cryosparc_command.command_rtp:start() -n command_rtp -b 0.0.0.0:45036 -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/gunicorn.conf.py
2570880 ? Sl 87:21 /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/cryosparc_app/nodejs/bin/node ./bundle/main.js
3886350 ? Ss 0:08 python /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /work/FAC/FBM/DMF/pnavarr1/default/CryoSPARC/cryosparc_master/supervisord.conf
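To me these look like leftover processes from the crashed instance (several supervisord and gunicorn workers, but no mongod). Before the next cryosparcm start, is it safe to simply kill them? Roughly (pids taken from the ps output above, after double-checking each belongs to our install and not to the other group’s):

# supervisord instances first
kill 1851839 2569007 3886350
# then the remaining gunicorn / node processes listed above
kill 660108 2569835 2569985 2570302 2570344 2570355 2570880
# confirm nothing from our install is left
ps xww | grep cryosparc_master | grep -v grep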
Also:
(base) [agregor@curnagl cryosparc_master]$ curl localhost:45032
curl: (7) Failed to connect to localhost port 45032: Connection refused
And:
(base) [agregor@curnagl cryosparc_master]$ cryosparcm log webapp
Invalid service: webapp
Usage:
cryosparcm log SERVICE
Where SERVICE is one of:
app
app_api
command_core
command_rtp
command_vis
database
supervisord
I appreciate any and all help. Thank you very much.
Best,
Aurélien.