How do I reinstall an update to the worker?

I updated the master to v4.2, and then the auto-update of the worker failed. I noticed the default CUDA was 12.0, so I changed it to 11.5. Now if I try to update the worker it says it is up to date, even though the update failed.

  error: command '/usr/bin/gcc' failed with exit code 1
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure on pycuda.

× Encountered error while trying to install package.
╰─> pycuda

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
check_install_deps.sh: 66: ERROR: installing python failed.
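For context, pycuda is compiled from source during worker dependency installation, so the build fails whenever CRYOSPARC_CUDA_PATH points at a toolkit the bundled pycuda release cannot compile against (CUDA 12 removed APIs that older pycuda releases still reference). After correcting the path in the worker's config.sh, the dependencies can sometimes be rebuilt in place; a sketch, assuming your cryosparcw build provides the forcedeps subcommand (verify with `bin/cryosparcw help`):

```shell
# Rebuild worker dependencies (including pycuda) after fixing CRYOSPARC_CUDA_PATH.
# Worker path taken from elsewhere in this thread.
cd /mnt/ssd/cryosparc_user/cryosparc2_worker
bin/cryosparcw forcedeps
```

If the in-place rebuild keeps failing, the fresh reinstall described below is the cleaner route.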

Can I delete the worker and install fresh?

[cryosparc_user@xingcryoem2 cryosparc_user]$ cd cryosparc2_worker/
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ bin/cryosparcw gpulist
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/connect.py", line 23, in print_gpu_list
    import pycuda.driver as cudrv
ModuleNotFoundError: No module named 'pycuda'

The latest driver install uses 12.0. How do I tell pycuda to use the 11.5 version that is installed?

NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0
drwxr-xr-x 16 root root 290 Dec 30 2019 cuda-10.0
drwxr-xr-x 15 root root 265 Dec 30 2019 cuda-10.1
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.3
drwxr-xr-x 16 root root 278 Dec 30 2021 cuda-10.2
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.1
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.2
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.0
lrwxrwxrwx 1 root root 25 Dec 30 2021 cuda-11 -> /etc/alternatives/cuda-11
drwxr-xr-x 3 root root 17 Jan 27 2022 cuda-11.4
drwxr-xr-x 2 root root 84 Feb 25 06:30 bin
drwxr-xr-x 16 root root 4.0K Feb 25 06:34 cuda-11.5
drwxr-xr-x 15 root root 4.0K Feb 25 06:34 cuda-12.0
lrwxrwxrwx 1 root root 22 Feb 25 06:35 cuda -> /etc/alternatives/cuda
lrwxrwxrwx 1 root root 25 Feb 25 06:35 cuda-12 -> /etc/alternatives/cuda-12
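The `cuda`, `cuda-11`, and `cuda-12` entries above are `update-alternatives`-style symlinks, which is why a driver or toolkit install can silently repoint `/usr/local/cuda` to 12.0. A minimal sketch of how such a link flips, using a throwaway scratch directory rather than the real `/usr/local`:

```shell
# Simulate the alternatives-style symlink layout in a scratch directory
# (throwaway paths; do NOT run this against the real /usr/local).
mkdir -p /tmp/cuda_demo/cuda-11.5 /tmp/cuda_demo/cuda-12.0
ln -sfn cuda-12.0 /tmp/cuda_demo/cuda   # what a CUDA 12 install does to the link
readlink /tmp/cuda_demo/cuda            # -> cuda-12.0
ln -sfn cuda-11.5 /tmp/cuda_demo/cuda   # the link can be repointed at any time...
readlink /tmp/cuda_demo/cuda            # -> cuda-11.5 ...which is the problem
```

Because the unversioned link can change underneath you, pointing CRYOSPARC_CUDA_PATH at a versioned directory such as `/usr/local/cuda-11.5` sidesteps the issue entirely.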

What is the output of

grep CRYOSPARC_CUDA_PATH /mnt/ssd/cryosparc_user/cryosparc2_worker/config.sh

[cryosparc_user@xingcryoem2 cryosparc2_worker]$ grep CRYOSPARC_CUDA_PATH /mnt/ssd/cryosparc_user/cryosparc2_worker/config.sh
export CRYOSPARC_CUDA_PATH="/usr/local/cuda"

If your CryoSPARC instance is currently not functional due to a broken worker configuration, I recommend:

  1. cryosparcm update (should update the master to CryoSPARC v4.2, released yesterday).
  2. identify or create a CUDA 11 toolkit installation that is independent of system updates, redefined symbolic links, or alternatives. For the following steps, let’s assume such an installation exists at
    /usr/local/cuda-11.5.
  3. move your existing cryosparc_worker installation “out of the way”:
    mv cryosparc_worker cryosparc_worker_obsolete_20230228
  4. if a new cryosparc_worker package wasn’t downloaded as part of cryosparcm update, download it.
  5. unpack the cryosparc_worker archive such that it occupies the original path of the cryosparc_worker directory that you just moved.
  6. run
    cd cryosparc_worker
    ./install.sh --cudapath /usr/local/cuda-11.5 --license "<your-license-id>"
    

Did this work?

yes.

******* CRYOSPARC WORKER INSTALLATION COMPLETE *******************

In order to run processing jobs, you will need to connect this
worker to a cryoSPARC master.


But how do I connect the worker to the master now?
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ ./bin/cryosparcw connect --worker xingcryoem2.oncology.wisc.edu --master xingcryoem2.oncology.wisc.edu --port 39000

CRYOSPARC CONNECT --------------------------------------------

Attempting to register worker xingcryoem2.oncology.wisc.edu to command xingcryoem2.oncology.wisc.edu:39002
Connecting as unix user cryosparc_user
Will register using ssh string: cryosparc_user@xingcryoem2.oncology.wisc.edu
If this is incorrect, you should re-run this command with the flag --sshstr

Traceback (most recent call last):
  File "bin/connect.py", line 76, in <module>
    assert cli.test_connection(), "Could not connect to cryoSPARC master at %s:%d" % (master_hostname, command_core_port)
  File "/mnt/ssd/cryosparc_user/cryosparc2_worker/cryosparc_tools/cryosparc/command.py", line 112, in func
    assert "error" not in res, f'Error for "{key}" with params {params}:\n' + format_server_error(res["error"])
AssertionError: Error for "test_connection" with params ():
ServerError: Authentication failed - License-ID mismatch.
Please ensure cryosparc_master/config.sh and cryosparc_worker/config.sh have the same CRYOSPARC_LICENSE_ID entry
or CRYOSPARC_LICENSE_ID is set correctly in the current environment.
See CryoSPARC Architecture and System Requirements - CryoSPARC Guide for more details.
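A quick way to confirm a mismatch like this is to compare the CRYOSPARC_LICENSE_ID lines of the two config files. A hedged sketch — the files below are stand-ins created for illustration, not your real config.sh locations:

```shell
# Stand-in config files with placeholder IDs; in practice compare the real
# cryosparc_master/config.sh and cryosparc_worker/config.sh.
printf 'export CRYOSPARC_LICENSE_ID="aaaa-1111"\n' > /tmp/master_config_demo.sh
printf 'export CRYOSPARC_LICENSE_ID="bbbb-2222"\n' > /tmp/worker_config_demo.sh

m=$(grep CRYOSPARC_LICENSE_ID /tmp/master_config_demo.sh)
w=$(grep CRYOSPARC_LICENSE_ID /tmp/worker_config_demo.sh)
if [ "$m" = "$w" ]; then echo "license IDs match"; else echo "license IDs differ"; fi
```

If they differ, copy the master's CRYOSPARC_LICENSE_ID value into the worker's config.sh and re-run the connect command.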

And I still get no GPU when running CryoSPARC.

sorry this is the correct one

./bin/cryosparcw connect --worker xingcryoem2.oncology.wisc.edu --master xingcryoem2.oncology.wisc.edu --port 39000

CRYOSPARC CONNECT --------------------------------------------

Attempting to register worker xingcryoem2.oncology.wisc.edu to command xingcryoem2.oncology.wisc.edu:39002
Connecting as unix user cryosparc_user
Will register using ssh string: cryosparc_user@xingcryoem2.oncology.wisc.edu
If this is incorrect, you should re-run this command with the flag --sshstr

Connected to master.

Current connected workers:
xingcryoem2.oncology.wisc.edu

Autodetecting available GPUs…
Detected 4 CUDA devices.

id pci-bus name

   0      0000:1B:00.0  NVIDIA GeForce RTX 2080 Ti
   1      0000:3E:00.0  NVIDIA GeForce RTX 2080 Ti
   2      0000:88:00.0  NVIDIA GeForce RTX 2080 Ti
   3      0000:B1:00.0  NVIDIA GeForce RTX 2080 Ti

All devices will be enabled now.
This can be changed later using --update

Traceback (most recent call last):
  File "bin/connect.py", line 225, in <module>
    assert args.ssdpath is not None or args.nossd, "Either provide --ssdpath or --nossd"
AssertionError: Either provide --ssdpath or --nossd
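This assertion just means the connect command needs one more flag: either `--ssdpath` (pointing at the particle cache device) or `--nossd`. A sketch of the corrected invocation, assuming the cache location is the `/mnt/ssd` mount that appears elsewhere in this thread:

```shell
# Re-run connect with a cache flag; /mnt/ssd is assumed as the SSD cache path.
./bin/cryosparcw connect --worker xingcryoem2.oncology.wisc.edu \
                         --master xingcryoem2.oncology.wisc.edu \
                         --port 39000 \
                         --ssdpath /mnt/ssd
# For a worker without a cache SSD, pass --nossd instead of --ssdpath.
```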

When I run patched motion core I get

No GPU available.

A few questions:

  1. Are master and worker supposed to run on the same computer?
  2. Do you intend to use more than one worker computer?
  3. What is the output of
    cryosparcm cli "get_scheduler_targets()"?
  4. Do you have an ssd device that you wish to use for particle caching?
  1. Yes, the master is the worker.
  2. No, only one worker and it is the master with 4 GPUs.
  3. [cryosparc_user@xingcryoem2 cryosparc2_master]$ cryosparcm cli "get_scheduler_targets()"
     [{'cache_path': '/mnt/ssd', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'xingcryoem2.oncology.wisc.edu', 'lane': 'default', 'monitor_port': None, 'name': 'xingcryoem2.oncology.wisc.edu', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]}, 'ssh_str': 'cryosparc_user@xingcryoem2.oncology.wisc.edu', 'title': 'Worker node xingcryoem2.oncology.wisc.edu', 'type': 'node', 'worker_bin_path': '/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw'}]
  4. Yes, /mnt/ssd/

What are the outputs of

stat /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw
cat /mnt/ssd/cryosparc_user/cryosparc2_worker/version

[cryosparc_user@xingcryoem2 cryosparc2_worker]$ stat /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw
  File: '/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw'
  Size: 17575      Blocks: 40         IO Block: 4096   regular file
Device: fd02h/64770d    Inode: 10742089454  Links: 1
Access: (0775/-rwxrwxr-x)  Uid: ( 1001/cryosparc_user)   Gid: ( 1001/cryosparc_user)
Access: 2023-02-28 17:10:55.063198015 -0600
Modify: 2023-02-27 09:17:33.000000000 -0600
Change: 2023-02-27 17:08:52.609999192 -0600
 Birth: -
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ cat /mnt/ssd/cryosparc_user/cryosparc2_worker/version
v4.2.0


[cryosparc_user@xingcryoem2 cryosparc2_master]$ cryosparcm status

CryoSPARC System master node installed at
/mnt/ssd/cryosparc_user/cryosparc2_master
Current cryoSPARC version: v4.2.0

CryoSPARC process status:

app RUNNING pid 107728, uptime 1:18:22
app_api RUNNING pid 107751, uptime 1:18:20
app_api_dev STOPPED Not started
app_legacy STOPPED Not started
app_legacy_dev STOPPED Not started
command_core RUNNING pid 107627, uptime 1:18:36
command_rtp RUNNING pid 107690, uptime 1:18:27
command_vis RUNNING pid 107656, uptime 1:18:29
database RUNNING pid 107520, uptime 1:18:40


License is valid

global config variables:
export CRYOSPARC_LICENSE_ID="XXXXXXXXXXXX"
export CRYOSPARC_MASTER_HOSTNAME="xingcryoem2.oncology.wisc.edu"
export CRYOSPARC_DB_PATH="/mnt/ssd/cryosparc_user/cryosparc2_database"
export CRYOSPARC_BASE_PORT=39000
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false
export CRYOSPARC_HEARTBEAT_SECONDS=180
export CRYOSPARC_DISABLE_IMPORT_ON_MASTER=false

Did you try any GPU jobs with the current state of the CryoSPARC instance?

We tried Patch Motion Correction and that is where we get 'GPU not available'.

Please can you post the output of these commands:

nvidia-smi
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call which nvcc
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call nvcc --version
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call python -c "import pycuda.driver; print(pycuda.driver.get_version())"
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw gpulist
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ pwd
/mnt/ssd/cryosparc_user/cryosparc2_worker
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call which nvcc
/usr/local/cuda-11/bin/nvcc
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call python -c "import pycuda.driver; print(pycuda.driver.get_version())"
(11, 5, 0)
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw gpulist
  Detected 4 CUDA devices.

   id           pci-bus  name
   ---------------------------------------------------------------
       0      0000:1B:00.0  NVIDIA GeForce RTX 2080 Ti
       1      0000:3E:00.0  NVIDIA GeForce RTX 2080 Ti
       2      0000:88:00.0  NVIDIA GeForce RTX 2080 Ti
       3      0000:B1:00.0  NVIDIA GeForce RTX 2080 Ti
   ---------------------------------------------------------------
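Those outputs are consistent: `nvcc` reports release 11.5 and pycuda reports `(11, 5, 0)`, so the worker appears to have been rebuilt against the intended toolkit. For scripting such checks, the release number can be pulled out of the `nvcc` banner; a small sketch using the banner line copied from the output above:

```shell
# nvcc banner line copied from the output above; extract the toolkit release
banner='Cuda compilation tools, release 11.5, V11.5.119'
release=$(printf '%s\n' "$banner" | sed -n 's/.*release \([0-9.]*\),.*/\1/p')
echo "$release"   # 11.5
```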
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ nvidia-smi