Your account expired error

haomingz · April 10, 2025, 5:01pm

Hello CS team,

I got an error message (see below) when launching a job in CS 4.6.2.

License is valid.

Launching job on lane default target localhost …

Running job on remote worker node hostname localhost

Failed to launch! 255

Your account has expired; please contact your system administrator
Connection closed by ::1 port 22

I have contacted our IT department and was told this has nothing to do with our IT setup but is a CS issue. I would greatly appreciate it if you could help troubleshoot. Thank you!

wtempel · April 10, 2025, 5:58pm

Please can you post the outputs of these commands:

ls -l $(which cryosparcm)
cryosparcm cli "get_scheduler_targets()"
cryosparcm status | grep HOST
hostname -f
host $(hostname -f)
host localhost
cat /etc/hosts

haomingz · April 10, 2025, 6:12pm

Thanks for your response. The outputs are as follows:

[cryosparc@RDLR0027 ~]$ ls -l $(which cryosparcm)
-rwxr-xr-x. 1 cryosparc cryosparc 76852 Nov 18 10:19 /app/apps/rhel8/cryosparc/cryosparc_master/bin/cryosparcm

[cryosparc@RDLR0027 ~]$ cryosparcm cli "get_scheduler_targets()"
[{'cache_path': '/data2/cryosparc_cache', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 50943623168, 'name': 'Quadro RTX 8000'}, {'id': 1, 'mem': 50946506752, 'name': 'Quadro RTX 8000'}, {'id': 2, 'mem': 50946506752, 'name': 'Quadro RTX 8000'}], 'hostname': 'localhost', 'lane': 'default', 'monitor_port': None, 'name': 'localhost', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]}, 'ssh_str': 'cryosparc@localhost', 'title': 'Worker node localhost', 'type': 'node', 'worker_bin_path': '/app/apps/rhel8/cryosparc/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 50943623168, 'name': 'Quadro RTX 8000'}, {'id': 1, 'mem': 50946768896, 'name': 'Quadro RTX 8000'}, {'id': 2, 'mem': 50946768896, 'name': 'Quadro RTX 8000'}], 'hostname': 'RDLR0027.ddns.med.umich.edu', 'lane': 'default', 'monitor_port': None, 'name': 'RDLR0027.ddns.med.umich.edu', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 
75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95], 'GPU': [0, 1, 2], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]}, 'ssh_str': 'root@RDLR0027.ddns.med.umich.edu', 'title': 'Worker node RDLR0027.ddns.med.umich.edu', 'type': 'node', 'worker_bin_path': '/home/cryosparc/cryosparc_worker/bin/cryosparcw'}]

[cryosparc@RDLR0027 ~]$ cryosparcm status | grep HOST
export CRYOSPARC_MASTER_HOSTNAME="RDLR0027.ddns.med.umich.edu"
[cryosparc@RDLR0027 ~]$ hostname -f
RDLR0027.ddns.med.umich.edu
[cryosparc@RDLR0027 ~]$ host $(hostname -f)
RDLR0027.ddns.med.umich.edu has address 172.17.176.141
[cryosparc@RDLR0027 ~]$ host localhost
localhost.ddns.med.umich.edu has address 10.60.122.197
[cryosparc@RDLR0027 ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
[cryosparc@RDLR0027 ~]$
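
The get_scheduler_targets() dump above is hard to scan by eye. As a minimal sketch, assuming the output keeps the quoted 'key': 'value' layout shown here, the SSH-relevant fields can be pulled out with grep. The function name summarize_targets is hypothetical and not part of CryoSPARC.

```shell
#!/usr/bin/env bash
# Sketch: condense the scheduler target dump to the fields that matter for SSH.
# summarize_targets is a hypothetical helper, not a CryoSPARC command.

summarize_targets() {
  # Read the target list from stdin and print every 'hostname' and
  # 'ssh_str' entry on its own line, in the order they appear.
  grep -oE "'(hostname|ssh_str)': '[^']*'"
}

# Usage on a live instance (as the user that owns the CryoSPARC installation):
#   cryosparcm cli "get_scheduler_targets()" | summarize_targets
```

For the output in this post, that should reduce the dump to two hostname/ssh_str pairs: cryosparc@localhost for the localhost target and root@RDLR0027.ddns.med.umich.edu for the second target.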

haomingz · April 11, 2025, 4:18pm

Hi wtempel, I wonder if you have had a chance to look at the outputs I sent yesterday. We can’t run any jobs right now and would greatly appreciate your help troubleshooting. Thanks.

wtempel · April 11, 2025, 5:54pm

@haomingz May I ask:

Two workers, RDLR0027.ddns.med.umich.edu and localhost, are registered on this CryoSPARC instance. Do these refer to the same physical computer?
What is the output of the command ip a?
Is your network configured such that
- the server will retain the RDLR0027.ddns.med.umich.edu host name after each reboot
- any attempt from this or another computer to connect to RDLR0027.ddns.med.umich.edu will point to this computer, now and in the future?

haomingz · April 11, 2025, 6:34pm

Yes, they refer to the same physical computer. But I don’t know why there are two workers.

The output of ip a is:

[haom@RDLR0027 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ac:1f:6b:a1:25:56 brd ff:ff:ff:ff:ff:ff
    altname enp1s0f0
    inet 172.17.176.141/26 brd 172.17.176.191 scope global dynamic noprefixroute eno1
       valid_lft 41089sec preferred_lft 41089sec
    inet6 fe80::ae1f:6bff:fea1:2556/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether ac:1f:6b:a1:25:57 brd ff:ff:ff:ff:ff:ff
    altname enp1s0f1
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:29:c5:b2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
[haom@RDLR0027 ~]$

Yes. The server will retain the RDLR0027.ddns.med.umich.edu host name after each reboot. Any attempt from any remote computer to connect to that name will point to this computer, now and in the future.

Thanks for your response,

wtempel · April 11, 2025, 7:49pm

@haomingz It is possible that the server was originally set up using a localhost worker, but this arrangement is no longer compatible with your network configuration. Someone may have later run the cryosparcw connect command under the root account. Running cryosparcm or cryosparcw commands as root should be avoided for important reasons (details).
In this case, you may want to (all commands as user cryosparc):

ensure master and worker versions match:

cat /app/apps/rhel8/cryosparc/cryosparc_master/version
cat /app/apps/rhel8/cryosparc/cryosparc_worker/version

determine the base port number of your CryoSPARC instance:

grep CRYOSPARC_BASE_PORT /app/apps/rhel8/cryosparc/cryosparc_master/config.sh

remove the current default scheduler lane (guide)

cryosparcm cli "remove_scheduler_lane('default')"

reconnect the worker, replacing 99999 with the base port determined earlier (guide)

/app/apps/rhel8/cryosparc/cryosparc_worker/bin/cryosparcw connect --worker RDLR0027.ddns.med.umich.edu --master RDLR0027.ddns.med.umich.edu --ssdpath /data2/cryosparc_cache --port 99999
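
The checks around these steps can be sketched as small shell helpers. This is only an illustration under stated assumptions: the function names versions_match, read_base_port and build_connect_cmd are hypothetical, and read_base_port assumes config.sh contains a line of the form export CRYOSPARC_BASE_PORT=<number>. The sketch prints the connect command rather than running it, so it can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch of the pre-flight checks around the reconnect steps.
# versions_match, read_base_port and build_connect_cmd are hypothetical helpers.

versions_match() {
  # Succeed only if the two version files have identical contents.
  [ "$(cat "$1")" = "$(cat "$2")" ]
}

read_base_port() {
  # Print the value of the last CRYOSPARC_BASE_PORT assignment in a config.sh
  # (assumes a line of the form: export CRYOSPARC_BASE_PORT=<number>).
  grep -E '^[[:space:]]*(export[[:space:]]+)?CRYOSPARC_BASE_PORT=' "$1" \
    | tail -n 1 | cut -d= -f2 | tr -d "\"' "
}

build_connect_cmd() {
  # $1: path to cryosparcw, $2: fully qualified host name,
  # $3: SSD cache path, $4: base port.
  # Master and worker are the same host, as on this single workstation.
  printf '%s connect --worker %s --master %s --ssdpath %s --port %s\n' \
    "$1" "$2" "$2" "$3" "$4"
}

# Example (paths as in this thread; prints the command instead of running it):
#   root=/app/apps/rhel8/cryosparc
#   versions_match "$root/cryosparc_master/version" "$root/cryosparc_worker/version" \
#     && build_connect_cmd "$root/cryosparc_worker/bin/cryosparcw" "$(hostname -f)" \
#          /data2/cryosparc_cache "$(read_base_port "$root/cryosparc_master/config.sh")"
```

Reading the port from config.sh instead of typing it avoids leaving the 99999 placeholder in the final command.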

Does this help?

haomingz · April 14, 2025, 1:54pm

It worked! Thank you very much. I greatly appreciate your help.