Jobs Fail with Exit Code 35

When I ran another job, it created a job.log file this time, containing the following traceback:



================= CRYOSPARCW =======  2025-06-24 21:20:54.659810  =========
Project P1 Job J3
Master polaris.alcf.anl.gov Port 18002
===========================================================================
MAIN PROCESS PID 1211599
========= now starting main process at 2025-06-24 21:20:54.660961
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "cryosparc_master/cryosparc_compute/run.py", line 201, in cryosparc_master.cryosparc_compute.run.run
  File "cryosparc_master/cryosparc_compute/run.py", line 255, in cryosparc_master.cryosparc_compute.run.run
  File "cryosparc_master/cryosparc_compute/run.py", line 50, in cryosparc_master.cryosparc_compute.run.main
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 141, in connect
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 141, in connect
    db = usedb if usedb is not None else database_management.get_pymongo_client('meteor')['meteor']
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/cryosparc_compute/database_management.py", line 221, in get_pymongo_client
    db = usedb if usedb is not None else database_management.get_pymongo_client('meteor')['meteor']
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/cryosparc_compute/database_management.py", line 221, in get_pymongo_client
    assert client[database_name].list_collection_names() is not None
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/database.py", line 1154, in list_collection_names
    assert client[database_name].list_collection_names() is not None
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/database.py", line 1154, in list_collection_names
    return [result["name"] for result in self.list_collections(session=session, **kwargs)]
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/database.py", line 1105, in list_collections
    return [result["name"] for result in self.list_collections(session=session, **kwargs)]
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/database.py", line 1105, in list_collections
    return self.__client._retryable_read(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1540, in _retryable_read
    return self.__client._retryable_read(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1540, in _retryable_read
    return self._retry_internal(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/_csot.py", line 108, in csot_wrapper
    return self._retry_internal(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/_csot.py", line 108, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1507, in _retry_internal
    return func(self, *args, **kwargs)
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1507, in _retry_internal
    ).run()
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 2353, in run
    ).run()
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 2353, in run
    return self._read() if self._is_read else self._write()
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 2483, in _read
    return self._read() if self._is_read else self._write()
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 2483, in _read
    self._server = self._get_server()
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 2439, in _get_server
    self._server = self._get_server()
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 2439, in _get_server
    return self._client._select_server(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1322, in _select_server
    server = topology.select_server(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 368, in select_server
    return self._client._select_server(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1322, in _select_server
    server = topology.select_server(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 368, in select_server
    server = self._select_server(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 346, in _select_server
    server = self._select_server(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 346, in _select_server
    servers = self.select_servers(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 253, in select_servers
    servers = self.select_servers(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 253, in select_servers
    server_descriptions = self._select_servers_loop(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 303, in _select_servers_loop
    server_descriptions = self._select_servers_loop(
  File "/lus/eagle/projects/FoundEpidem/aravi/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/pymongo/topology.py", line 303, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: polaris.alcf.anl.gov:18001: [Errno 101] Network is unreachable (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30.0s, Topology Description: <TopologyDescription id: 685b16c35c399377def0d5cb, topology_type: Single, servers: [<ServerDescription ('polaris.alcf.anl.gov', 18001) server_type: Unknown, rtt: None, error=AutoReconnect('polaris.alcf.anl.gov:18001: [Errno 101] Network is unreachable (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: polaris.alcf.anl.gov:18001: [Errno 101] Network is unreachable (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30.0s, Topology Description: <TopologyDescription id: 685b16c32ff1f2ea97413170, topology_type: Single, servers: [<ServerDescription ('polaris.alcf.anl.gov', 18001) server_type: Unknown, rtt: None, error=AutoReconnect('polaris.alcf.anl.gov:18001: [Errno 101] Network is unreachable (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>

Entering the command curl polaris.alcf.anl.gov:18001 into the terminal shows the message "It looks like you are trying to access MongoDB over HTTP on the native driver port", as expected. I'm not sure what could cause this connection issue, since passwordless ssh is set up across all nodes and the hostnames are configured correctly.
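
One way to narrow this down might be to repeat the connection test from the node where the job actually runs, since a curl that succeeds on one node does not show that another node can reach the port. Working ssh between nodes also does not guarantee that port 18001 is reachable from all of them. The following is only a minimal sketch using the Python standard library. The function name can_connect is hypothetical, and the host and port are copied from the traceback above.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Try to open a plain TCP connection to host:port.

    Returns (True, None) on success, or (False, "<ErrorType>: <message>")
    if name resolution, routing, or the connection itself fails.
    """
    try:
        # create_connection resolves the hostname and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers DNS failures, timeouts, refused connections and
        # "[Errno 101] Network is unreachable".
        return False, f"{type(exc).__name__}: {exc}"


# Example call (host and port taken from the job.log above):
#   ok, err = can_connect("polaris.alcf.anl.gov", 18001)
#   print("reachable" if ok else f"not reachable: {err}")
```

If this reports the same "Network is unreachable" error on the node running the job but succeeds on the node where curl was run, that would suggest the problem is with how the master hostname resolves or routes from that node, rather than with MongoDB itself.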