Issue restarting cryoSPARC after updating to v.4.3.1

Hello,

I just updated our cryoSPARC to v. 4.3.1. When I try to restart (i.e. cryosparcm restart), I get the following error:

I looked into the forum a bit to see if others were having issues, and was wondering if my issue was related to this thread. Therefore, I ran the suggested command to look for orphaned processes and this was the output:

78222 ?        Ss     0:00 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
 78871 ?        Ss     0:00 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
 81211 ?        Ss     0:00 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
 86162 ?        Ss     0:00 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
 87487 pts/2    S+     0:00 grep --color=auto -e cryosparc -e mongo
118382 ?        Ss    14:54 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
220020 ?        Ss    28:40 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
220157 ?        Sl   2542:22 mongod --auth --dbpath /home/cryosparc/cryosparc_database --port 39001 --oplogSize 64 --replSet meteor --nojournal --wiredTigerCacheSizeGB 4 --bind_ip_all
220292 ?        Sl   326:03 python -c import cryosparc_command.command_core as serv; serv.start(port=39002)
220347 ?        Sl   184:48 python -c import cryosparc_command.command_vis as serv; serv.start(port=39003)
220388 ?        Sl   581:22 python -c import cryosparc_command.command_rtp as serv; serv.start(port=39005)
220477 ?        Sl    31:43 /home/cryosparc/cryosparc_master/cryosparc_app/custom-server/nodejs/bin/node dist/server/index.js
220497 ?        Sl   221:11 /home/cryosparc/cryosparc_master/cryosparc_app/api/nodejs/bin/node ./bundle/main.js

Can anyone please advise how to move forward so that cryoSPARC can be restarted properly and we can begin using it again? Thanks in advance for your help!

Best,
Kyle

Welcome to the forum @KyleBarrie.

I suspect that runaway CryoSPARC processes from before the update disrupted the update and subsequent CryoSPARC startup(s).
To confirm this suspicion, please can you provide additional details:

  1. How many CryoSPARC instances are supposed to run on this computer?
  2. Please can you post the output of the command
    ps -eopid,ppid,start,cmd | grep -e cryosparc -e mongo
    

If the suspicion is correct, I propose the following (a combined command sketch follows the list):

  1. Terminate supervisord processes that belong to your CryoSPARC instance. Please confirm that the process identifiers in fact belong to your, not someone else’s, CryoSPARC instance before running this command:
    kill 78222 78871 81211 86162 118382 220020
    
  2. Wait 10 seconds, then confirm, using the ps command above, that all of the python and mongod processes have been terminated.
  3. Try cryosparcm start
  4. If
    cat /path/to/cryosparc_worker/version
    
    shows a version older than 4.3.1 for any of your cryosparc_worker installations, manually update the affected cryosparc_worker installation(s).
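
Put together, and once you have confirmed that the process IDs belong to your instance, the sequence would look roughly like this (a sketch only; substitute your actual cryosparc_worker path):

    kill 78222 78871 81211 86162 118382 220020
    sleep 10
    # only the grep line itself should remain in the output
    ps -eopid,ppid,start,cmd | grep -e cryosparc -e mongo
    cryosparcm start
    # then check the worker version; if it is older than 4.3.1, update the worker manually
    cat /path/to/cryosparc_worker/version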

Thank you very much for the prompt response and for welcoming me to the forum.

To answer your questions:

  1. Only one instance should be running on this computer.
  2. Here is the output:
 78222      1 18:05:24 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
 78871      1 18:06:42 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
 81211      1 18:10:35 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
 97833      1 18:55:33 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
118382      1   Aug 08 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
119312 119211 09:39:06 su cryosparc
119415 119323 09:39:20 grep --color=auto -e cryosparc -e mongo
220020      1   Jul 17 python /home/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /home/cryosparc/cryosparc_master/supervisord.conf
220157 220020   Jul 17 mongod --auth --dbpath /home/cryosparc/cryosparc_database --port 39001 --oplogSize 64 --replSet meteor --nojournal --wiredTigerCacheSizeGB 4 --bind_ip_all
220292 220020   Jul 17 python -c import cryosparc_command.command_core as serv; serv.start(port=39002)
220347 220020   Jul 17 python -c import cryosparc_command.command_vis as serv; serv.start(port=39003)
220388 220020   Jul 17 python -c import cryosparc_command.command_rtp as serv; serv.start(port=39005)
220477 220020   Jul 17 /home/cryosparc/cryosparc_master/cryosparc_app/custom-server/nodejs/bin/node dist/server/index.js
220497 220020   Jul 17 /home/cryosparc/cryosparc_master/cryosparc_app/api/nodejs/bin/node ./bundle/main.js

Please let me know if I should proceed with killing the processes. Thanks!

Best,
Kyle

I’d add process 97833 to the list and go ahead, even though I cannot quite explain process 118382, which suggests that CryoSPARC may have run in a degraded state since at least August 8.

Hi wtempel,

I went ahead and killed the processes and manually updated my cryosparc_worker version as suggested. This successfully allowed me to re-enter the online GUI and launch new jobs to run in v.4.3.1. Thanks for your help!

However, I am running into a different issue now. When I try to run a job, it is launched but never actually begins running. Any ideas for why this would be?

Best,
Kyle

Please can you provide the following details:

  1. The final lines of the launched job’s Event Log
  2. On the CryoSPARC master host, output of the commands
    grep HOSTNAME /home/cryosparc/cryosparc_master/config.sh
    id cryosparc
    /home/cryosparc/cryosparc_master/bin/cryosparcm cli "get_scheduler_targets()"
    
  3. On the worker node, please can you run and post the output of the ls command:
    cd /path/to/job/directory
    ls -al
    
  4. Does /path/to/job/directory match exactly between the CryoSPARC master and worker (in case master and worker are not on the same computer)?

Hi wtempel,

To answer your questions:

  1. The final line is Running job on master node hostname localhost
  2. Output for first command:
export CRYOSPARC_MASTER_HOSTNAME="localhost"
export CRYOSPARC_FORCE_HOSTNAME=true

Second command:

uid=1001(cryosparc) gid=1001(cryosparc) groups=1001(cryosparc),10(wheel)

Third command:

[{'cache_path': '/data', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11554717696, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11554717696, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11554717696, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11554324480, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'localhost', 'lane': 'default', 'monitor_port': None, 'name': 'localhost', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]}, 'ssh_str': 'cryosparc@localhost', 'title': 'Worker node localhost', 'type': 'node', 'worker_bin_path': '/home/cryosparc/cryosparc_worker/bin/cryosparcw'}]
  3. List of the directory contents for the job I’m trying to run:
total 120
drwxrwxr-x.   3 cryosparc cryosparc  4096 Sep 12 10:29 .
drwxrwxr-x. 383 cryosparc cryosparc 12288 Sep 12 10:14 ..
-rw-rw-r--.   1 cryosparc cryosparc    18 Sep 12 10:29 events.bson
drwxrwxr-x.   2 cryosparc cryosparc  4096 Sep 12 10:29 gridfs_data
-rw-rw-r--.   1 cryosparc cryosparc 90521 Sep 12 10:29 job.json
-rw-rw-r--.   1 cryosparc cryosparc  2635 Sep 12 10:29 job.log
  4. Master and worker are on the same computer.

Thanks!
Kyle

Please can you
cat job.log
and post the output.

Hi wtempel,

Here is the output:

================= CRYOSPARCW =======  2023-09-12 10:29:39.843974  =========
Project P28 Job J381
Master localhost Port 39002
===========================================================================
========= monitor process now starting main process
MAINPROCESS PID 138187
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "cryosparc_worker/cryosparc_compute/run.py", line 161, in cryosparc_compute.run.run
  File "/home/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 94, in connect
    assert cli.test_connection(), "Job could not connect to master instance at %s:%s" % (master_hostname, str(master_command_core_port))
  File "/home/cryosparc/cryosparc_worker/cryosparc_compute/client.py", line 62, in func
    assert False, res['error']
AssertionError: {'code': 403, 'data': None, 'message': 'ServerError: Authentication failed - License-ID request header missing.\n   This may indicate that cryosparc_worker did not update,\n   cryosparc_worker/config.sh is missing a CRYOSPARC_LICENSE_ID entry,\n   or CRYOSPARC_LICENSE_ID is not present in your environment.\n   See https://guide.cryosparc.com/setup-configuration-and-management/hardware-and-system-requirements#command-api-security for more details.\n', 'name': 'ServerError'}
Process Process-1:
Traceback (most recent call last):
  File "/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "cryosparc_worker/cryosparc_compute/run.py", line 31, in cryosparc_compute.run.main
  File "/home/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 94, in connect
    assert cli.test_connection(), "Job could not connect to master instance at %s:%s" % (master_hostname, str(master_command_core_port))
  File "/home/cryosparc/cryosparc_worker/cryosparc_compute/client.py", line 62, in func
    assert False, res['error']
AssertionError: {'code': 403, 'data': None, 'message': 'ServerError: Authentication failed - License-ID request header missing.\n   This may indicate that cryosparc_worker did not update,\n   cryosparc_worker/config.sh is missing a CRYOSPARC_LICENSE_ID entry,\n   or CRYOSPARC_LICENSE_ID is not present in your environment.\n   See https://guide.cryosparc.com/setup-configuration-and-management/hardware-and-system-requirements#command-api-security for more details.\n', 'name': 'ServerError'}

Hi wtempel,

Just wanted to check in and ask if you have any troubleshooting suggestions following my prior post. Thanks again for your help!

Best,
Kyle

Please can you confirm that cryosparc_worker/config.sh contains a definition
export CRYOSPARC_LICENSE_ID=
that matches the definition in your instance’s
cryosparc_master/config.sh
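
One way to compare the two definitions side by side is a single grep across both files (paths taken from earlier in this thread; adjust if your installation directories differ):

    grep CRYOSPARC_LICENSE_ID /home/cryosparc/cryosparc_master/config.sh /home/cryosparc/cryosparc_worker/config.sh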

Hi wtempel,

Thanks for your message. The license IDs in config.sh match for the worker and master:

Master:

export CRYOSPARC_LICENSE_ID="redacted"

Worker:

export CRYOSPARC_LICENSE_ID="redacted"

Best,
Kyle

A member of our team found that the line numbers associated with the errors did not match those expected for CryoSPARC v4.3.1 worker files. Please can you confirm that

  1. cat /home/cryosparc/cryosparc_worker/version shows 4.3.1
  2. you followed this procedure, including running the command
    cryosparcw update, rather than merely editing the cryosparc_worker/version file (a sketch of the manual update follows below).
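
If the worker is in fact behind, the manual worker update would look roughly like the sketch below. The tarball name and location are assumptions based on this thread (a cryosparc_worker.tar.gz sitting in the cryosparc_master directory); please follow the guide’s manual worker update instructions for the authoritative steps.

    cat /home/cryosparc/cryosparc_worker/version
    # if older than 4.3.1, copy the worker tarball from the master directory and re-run the update
    cp /home/cryosparc/cryosparc_master/cryosparc_worker.tar.gz /home/cryosparc/cryosparc_worker/
    cd /home/cryosparc/cryosparc_worker
    bin/cryosparcw update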

Hi wtempel,

Thank you for your response. I do remember checking this, but I just checked again and you’re right, the version was 3.2.0 for some reason (I updated to 4.3.1 from 4.0.0, so maybe I was using an old version of the .tar file…). I just re-followed the manual update instructions with the .tar file in /home/cryosparc/cryosparc_master and everything is back up and running! Thanks so much for the help from you and the team!

Best,
Kyle
