Ssh_askpass error after update

Hi,
I changed my OS (CentOS to Rocky 9) and then updated CryoSPARC to 4.6.2 without errors.
The CryoSPARC installation is not on the same file system as the OS, so once the update finished I restarted CryoSPARC and it started OK.

When I try to run a job I get this error:

License is valid.
Launching job on lane default target mamba …
Running job on remote worker node hostname mamba
Failed to launch! 255
ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory
Permission denied, please try again.
ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory
Permission denied, please try again.
ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory
userxx@mamba: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password

I tried rerunning "cryosparcw connect", which completed without error, but I get the same failure when running a job.
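
In case it is useful for diagnosis: as far as I understand, the ssh_askpass lines mean ssh could not authenticate to the worker with a key and, having no terminal to prompt on, fell back to an askpass helper that is not installed. A quick way to test key-based ssh from the master, using the user and host shown in the error above:

ssh -o BatchMode=yes userxx@mamba true

If that fails, the key can be (re)installed, for example:

ssh-keygen -t ed25519     # only if the cryosparc user has no key yet
ssh-copy-id userxx@mamba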

Best,

@Elad Please can you post the outputs of these commands:

cryosparcm cli "get_scheduler_targets()"
cryosparcm status | grep HOSTNAME
hostname -f
host $(hostname -f)

cryosparcm cli "get_scheduler_targets()"
[{'cache_path': '/data1/', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'mamba', 'lane': 'default', 'monitor_port': None, 'name': 'mamba', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]}, 'ssh_str': 'uid@mamba', 'title': 'Worker node mamba', 'type': 'node', 'worker_bin_path': '/data2/software/cryosparc/cryosparc_worker/bin/cryosparcw'},
 {'cache_path': '/scr/', 'cache_quota_mb': 800000, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'viper.xxx.vanderbilt.edu', 'lane': 'viper', 'monitor_port': None, 'name': 'viper.xxx.vanderbilt.edu', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]}, 'ssh_str': 'uid@viper.xxx.vanderbilt.edu', 'title': 'Worker node viper.xxx.vanderbilt.edu', 'type': 'node', 'worker_bin_path': '/scr/software/cryosparc_worker/bin/cryosparcw'},
 {'cache_path': '/data1/', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'mamba.xxx.vanderbilt.edu', 'lane': 'default', 'monitor_port': None, 'name': 'mamba.xxx.vanderbilt.edu', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]}, 'ssh_str': 'uid@mamba.xxx.vanderbilt.edu', 'title': 'Worker node mamba.xxx.vanderbilt.edu', 'type': 'node', 'worker_bin_path': '/data2/software/cryosparc/cryosparc_worker/bin/cryosparcw'}]

cryosparcm status | grep HOSTNAME
export CRYOSPARC_MASTER_HOSTNAME="mamba.xxx.vanderbilt.edu"

hostname -f
mamba.xxx.vanderbilt.edu

host $(hostname -f)
mamba.xxx.vanderbilt.edu has address 10.141.xx.244
mamba.xxx.vanderbilt.edu has address 10.94.xx.39
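
Side note: to make that long target list easier to compare, it can be filtered down to the identifying fields, for example:

cryosparcm cli "get_scheduler_targets()" | grep -oE "'(hostname|lane|ssh_str)': '[^']*'"

which prints just the hostname, lane, and ssh_str of each target and makes duplicate entries easy to spot.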

Thanks @Elad. If I am counting correctly, the scheduler target list comprises 3 entries, with apparent duplicate entries for the mamba server. May I ask:

  1. Is the viper server still in use and physically separate from mamba?
  2. Has the OS of viper also been upgraded?
  3. Do you know for what reason and by which mechanism the mamba host is associated with two IP addresses? What is the output of the command (on the CryoSPARC master host):
    ip a | grep '10.'
  1. Yes, viper is in use and physically separate.
  2. Yes.
  3. I think it is due to a VLAN.

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
2: eno1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
4: ens11f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
5: ens10f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.94.26.39/23 brd 10.94.27.255 scope global dynamic noprefixroute ens10f0
6: ens11f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
7: ens10f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
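
Since one registered target uses the short name (ssh_str 'uid@mamba') and another the FQDN, it may also be worth checking that both names resolve consistently on the master, e.g.:

getent hosts mamba
getent hosts mamba.xxx.vanderbilt.edu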

OK, it looks like I fixed it.
As you said, there were two mamba worker targets registered on the master (master and worker are the same physical node). Running

cryosparcm cli "remove_scheduler_target_node(hostname='mamba')"

fixed the issue.
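
To confirm the stale entry is gone, the target list can be printed again and should now show only a single mamba target:

cryosparcm cli "get_scheduler_targets()"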

Thanks,
