Similar to this other thread, we see the error below:
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | Failed to get GPU info on ourcomaind.edu
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | Traceback (most recent call last):
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | File "/home/me/cryosparc_master/cryosparc_command/command_core/__init__.py", line 1516, in get_gpu_info_run
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | value = subprocess.check_output(full_command, stderr=subprocess.STDOUT, shell=shell, timeout=JOB_LAUNCH_TIMEOUT_SECONDS).decode()
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | File "/home/me/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/subprocess.py", line 421, in check_output
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | File "/home/me/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/subprocess.py", line 526, in run
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | raise CalledProcessError(retcode, process.args,
2024-09-26 13:17:58,266 get_gpu_info_run ERROR | subprocess.CalledProcessError: Command '['ssh', 'me@ourcomaind.edu', 'bash -c "eval $(/opt/cryosparc_worker/bin/cryosparcw env); python /opt/cryosparc_worker/cryosparc_compute/get_gpu_info.py"']' returned non-zero exit status 255.
2024-09-26 13:18:06,489 update_all_job_sizes_run INFO | Finished updating all job sizes (0 jobs updated, 0 projects updated)
2024-09-26 13:18:11,024 wrapper ERROR | JSONRPC ERROR at set_user_viewed_workspace
2024-09-26 13:18:11,024 wrapper ERROR | Traceback (most recent call last):
2024-09-26 13:18:11,024 wrapper ERROR | File "/home/me/cryosparc_master/cryosparc_command/commandcommon.py", line 196, in wrapper
2024-09-26 13:18:11,024 wrapper ERROR | res = func(*args, **kwargs)
2024-09-26 13:18:11,024 wrapper ERROR | File "/home/me/cryosparc_master/cryosparc_command/command_core/__init__.py", line 1230, in set_user_viewed_workspace
2024-09-26 13:18:11,024 wrapper ERROR | update_workspace(project_uid, workspace_uid, {'last_accessed' : {'name' : get_username_by_id(user_id), 'accessed_at' : datetime.datetime.utcnow()}}, operation='$set', export=False)
2024-09-26 13:18:11,024 wrapper ERROR | File "/home/me/cryosparc_master/cryosparc_command/commandcommon.py", line 187, in wrapper
2024-09-26 13:18:11,024 wrapper ERROR | return func(*args, **kwargs)
2024-09-26 13:18:11,024 wrapper ERROR | File "/home/me/cryosparc_master/cryosparc_command/commandcommon.py", line 250, in wrapper
2024-09-26 13:18:11,024 wrapper ERROR | assert os.path.isfile(
2024-09-26 13:18:11,024 wrapper ERROR | AssertionError: validation error: lock file for P1 at /data/cryosparc/CS-andy-aug/cs.lock absent or otherwise inaccessible.
2024-09-26 13:18:11,714 wrapper ERROR | JSONRPC ERROR at set_user_viewed_job
2024-09-26 13:18:11,714 wrapper ERROR | Traceback (most recent call last):
2024-09-26 13:18:11,714 wrapper ERROR | File "/home/me/cryosparc_master/cryosparc_command/commandcommon.py", line 196, in wrapper
2024-09-26 13:18:11,714 wrapper ERROR | res = func(*args, **kwargs)
2024-09-26 13:18:11,714 wrapper ERROR | File "/home/me/cryosparc_master/cryosparc_command/command_core/__init__.py", line 1287, in set_user_viewed_job
2024-09-26 13:18:11,714 wrapper ERROR | update_job(project_uid, job_uid, {'last_accessed' : {'name' : get_username_by_id(user_id), 'accessed_at' : datetime.datetime.utcnow()}})
2024-09-26 13:18:11,714 wrapper ERROR | File "/home/me/cryosparc_master/cryosparc_command/commandcommon.py", line 187, in wrapper
2024-09-26 13:18:11,714 wrapper ERROR | return func(*args, **kwargs)
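For anyone trying to narrow this down, here is a minimal diagnostic sketch (not CryoSPARC code) covering the two failures in the log above: the `ssh` call that returned exit status 255 and the `cs.lock` assertion. The function names `classify_ssh_exit`, `check_ssh` and `check_lock_file` are made up for illustration. It assumes OpenSSH, which exits with 255 when ssh itself fails (connection, host key or authentication problems), and that it is run as the same user that runs the CryoSPARC master.

```python
import os
import subprocess


def classify_ssh_exit(returncode: int) -> str:
    """Interpret the exit status of an `ssh host command` call."""
    if returncode == 0:
        return "ok"
    if returncode == 255:
        # OpenSSH uses 255 for its own errors; a remote command that itself
        # exits with 255 would look the same, so treat this as a strong hint only.
        return "ssh-level failure"
    return "remote command failed"


def check_ssh(ssh_str: str, timeout: int = 30) -> str:
    """Run a trivial remote command non-interactively and classify the result."""
    proc = subprocess.run(
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        ["ssh", "-o", "BatchMode=yes", ssh_str, "true"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        print(proc.stderr.strip())  # ssh's own explanation, if any
    return classify_ssh_exit(proc.returncode)


def check_lock_file(project_dir: str) -> str:
    """Report whether cs.lock in a project directory exists and is readable."""
    lock = os.path.join(project_dir, "cs.lock")
    if not os.path.isfile(lock):
        return "absent"
    if not os.access(lock, os.R_OK):
        return "unreadable"
    return "ok"
```

With the values from the log, the calls would be `check_ssh("me@ourcomaind.edu")` and `check_lock_file("/data/cryosparc/CS-andy-aug")`. An "ssh-level failure" result would suggest the problem is passwordless SSH or hostname resolution rather than `get_gpu_info.py` on the worker.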
Here are the command outputs requested in the other thread:
cryosparcm cli "get_scheduler_targets()"
[{'cache_path': '/opt/cryosparc_cache', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 25425608704, 'name': 'NVIDIA RTX A5000'}, {'id': 1, 'mem': 25425608704, 'name': 'NVIDIA RTX A5000'}, {'id': 2, 'mem': 25417023488, 'name': 'NVIDIA RTX A5000'}, {'id': 3, 'mem': 25425608704, 'name': 'NVIDIA RTX A5000'}], 'hostname': 'sn4622115977', 'lane': 'default', 'monitor_port': None, 'name': 'sn4622115977', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'me@sn4622115977', 'title': 'Worker node sn4622115977', 'type': 'node', 'worker_bin_path': '/opt/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 25425608704, 'name': 'NVIDIA RTX A5000'}, {'id': 1, 'mem': 25425608704, 'name': 'NVIDIA RTX A5000'}, {'id': 2, 'mem': 25417023488, 'name': 'NVIDIA RTX A5000'}, {'id': 3, 'mem': 25425608704, 'name': 'NVIDIA RTX A5000'}], 'hostname': 'ourdomain.edu', 'lane': 'default', 'monitor_port': None, 'name': 'ourdomain.edu', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'me@ourdomain.edu', 'title': 'Worker node ourdomain.edu', 'type': 'node', 'worker_bin_path': '/opt/cryosparc_worker/bin/cryosparcw'}]
(base) me@sn4622115977:~$ cryosparcm status | grep HOSTNAME
export CRYOSPARC_MASTER_HOSTNAME="sn4622115977"
(base) me@sn4622115977:~$ cryosparcm status
----------------------------------------------------------------------------
CryoSPARC System master node installed at
/home/me/cryosparc_master
Current cryoSPARC version: v4.5.3
----------------------------------------------------------------------------
CryoSPARC process status:
app RUNNING pid 5513, uptime 2:21:42
app_api RUNNING pid 5533, uptime 2:21:41
app_api_dev STOPPED Not started
command_core RUNNING pid 5425, uptime 2:21:55
command_rtp RUNNING pid 5489, uptime 2:21:48
command_vis RUNNING pid 5460, uptime 2:21:49
database RUNNING pid 5318, uptime 2:21:59
----------------------------------------------------------------------------
License is valid
----------------------------------------------------------------------------
global config variables:
export CRYOSPARC_LICENSE_ID="bae9edd6-54dd-11ef-93a3-7b0d1eadc7e2"
export CRYOSPARC_MASTER_HOSTNAME="sn4622115977"
export CRYOSPARC_DB_PATH="/home/me/cryosparc_database"
export CRYOSPARC_BASE_PORT=39000
export CRYOSPARC_DB_CONNECTION_TIMEOUT_MS=20000
export CRYOSPARC_INSECURE=false
export CRYOSPARC_DB_ENABLE_AUTH=true
export CRYOSPARC_CLUSTER_JOB_MONITOR_INTERVAL=10
export CRYOSPARC_CLUSTER_JOB_MONITOR_MAX_RETRIES=1000000
export CRYOSPARC_PROJECT_DIR_PREFIX='CS-'
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_CLICK_WRAP=true
export CRYOSPARC_SSD_CACHE_LIFETIME_DAYS=10
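Since the `get_scheduler_targets()` output above is hard to read as one line, here is a small sketch that reduces it to the fields relevant to the SSH failure. It assumes the output is a plain Python literal (as it appears above) saved to a string; `summarize_targets` is a made-up helper name, not part of CryoSPARC.

```python
import ast


def summarize_targets(raw: str) -> list:
    """Return (name, ssh_str, worker_bin_path) for each scheduler target."""
    # literal_eval safely parses Python literals such as None and True,
    # which json.loads would reject.
    targets = ast.literal_eval(raw)
    return [(t["name"], t["ssh_str"], t["worker_bin_path"]) for t in targets]
```

Passing in the output above should give one tuple per registered target, which makes it easy to compare each `ssh_str` with the host named in the failing `ssh` command.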