How do I reinstall an update to the worker?

I updated the master to v4.2, and then the auto-update of the worker failed. I noticed the default CUDA was 12.0, so I changed it to 11.5. Now if I try to update the worker it says it is up to date, even though the update failed.

  error: command '/usr/bin/gcc' failed with exit code 1
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure on pycuda.

× Encountered error while trying to install package.
╰─> pycuda

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
check_install_deps.sh: 66: ERROR: installing python failed.
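For context, pycuda is compiled from source during worker dependency installation, so the build fails whenever CRYOSPARC_CUDA_PATH points at a toolkit the bundled pycuda release cannot compile against (CUDA 12 removed APIs that older pycuda releases still reference). After correcting the path in the worker's config.sh, the dependencies can sometimes be rebuilt in place; a sketch, assuming your cryosparcw build provides the forcedeps subcommand (verify with `bin/cryosparcw help`):

```shell
# Rebuild worker dependencies (including pycuda) after fixing CRYOSPARC_CUDA_PATH.
# Worker path taken from elsewhere in this thread.
cd /mnt/ssd/cryosparc_user/cryosparc2_worker
bin/cryosparcw forcedeps
```

If the in-place rebuild keeps failing, the fresh reinstall described below is the cleaner route.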

Can I delete the worker and install fresh?

[cryosparc_user@xingcryoem2 cryosparc_user]$ cd cryosparc2_worker/
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ bin/cryosparcw gpulist
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/connect.py", line 23, in print_gpu_list
    import pycuda.driver as cudrv
ModuleNotFoundError: No module named 'pycuda'

The latest driver install uses 12.0. How do I tell pycuda to use the 11.5 version that is installed?

NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0
drwxr-xr-x 16 root root 290 Dec 30 2019 cuda-10.0
drwxr-xr-x 15 root root 265 Dec 30 2019 cuda-10.1
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.3
drwxr-xr-x 16 root root 278 Dec 30 2021 cuda-10.2
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.1
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.2
drwxr-xr-x 3 root root 17 Dec 30 2021 cuda-11.0
lrwxrwxrwx 1 root root 25 Dec 30 2021 cuda-11 -> /etc/alternatives/cuda-11
drwxr-xr-x 3 root root 17 Jan 27 2022 cuda-11.4
drwxr-xr-x 2 root root 84 Feb 25 06:30 bin
drwxr-xr-x 16 root root 4.0K Feb 25 06:34 cuda-11.5
drwxr-xr-x 15 root root 4.0K Feb 25 06:34 cuda-12.0
lrwxrwxrwx 1 root root 22 Feb 25 06:35 cuda -> /etc/alternatives/cuda
lrwxrwxrwx 1 root root 25 Feb 25 06:35 cuda-12 -> /etc/alternatives/cuda-12
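The `cuda`, `cuda-11`, and `cuda-12` entries above are `update-alternatives`-style symlinks, which is why a driver or toolkit install can silently repoint `/usr/local/cuda` to 12.0. A minimal sketch of how such a link flips, using a throwaway scratch directory rather than the real `/usr/local`:

```shell
# Simulate the alternatives-style symlink layout in a scratch directory
# (throwaway paths; do NOT run this against the real /usr/local).
mkdir -p /tmp/cuda_demo/cuda-11.5 /tmp/cuda_demo/cuda-12.0
ln -sfn cuda-12.0 /tmp/cuda_demo/cuda   # what a CUDA 12 install does to the link
readlink /tmp/cuda_demo/cuda            # -> cuda-12.0
ln -sfn cuda-11.5 /tmp/cuda_demo/cuda   # the link can be repointed at any time...
readlink /tmp/cuda_demo/cuda            # -> cuda-11.5 ...which is the problem
```

Because the unversioned link can change underneath you, pointing CRYOSPARC_CUDA_PATH at a versioned directory such as `/usr/local/cuda-11.5` sidesteps the issue entirely.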

What is the output of

grep CRYOSPARC_CUDA_PATH /mnt/ssd/cryosparc_user/cryosparc2_worker/config.sh

[cryosparc_user@xingcryoem2 cryosparc2_worker]$ grep CRYOSPARC_CUDA_PATH /mnt/ssd/cryosparc_user/cryosparc2_worker/config.sh
export CRYOSPARC_CUDA_PATH="/usr/local/cuda"

If your CryoSPARC instance is currently not functional due to a broken worker configuration, I recommend:

  1. cryosparcm update (should update the master to CryoSPARC v4.2, released yesterday).
  2. identify or create a CUDA 11 toolkit installation that is independent of system updates, redefined symbolic links, or alternatives. For the following steps, let’s assume such an installation exists at
    /usr/local/cuda-11.5.
  3. move your existing cryosparc_worker installation “out of the way”:
    mv cryosparc_worker cryosparc_worker_obsolete_20230228
  4. if a new cryosparc_worker package wasn’t downloaded as part of cryosparcm update, download it.
  5. unpack the cryosparc_worker archive such that it occupies the original path of the cryosparc_worker directory that you just moved.
  6. run
    cd cryosparc_worker
    ./install.sh --cudapath /usr/local/cuda-11.5 --license "<your-license-id>"
    

Did this work?

yes.

******* CRYOSPARC WORKER INSTALLATION COMPLETE *******************

In order to run processing jobs, you will need to connect this
worker to a cryoSPARC master.


But how do I connect the worker to the master now?
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ ./bin/cryosparcw connect --worker xingcryoem2.oncology.wisc.edu --master xingcryoem2.oncology.wisc.edu --port 39000

CRYOSPARC CONNECT --------------------------------------------

Attempting to register worker xingcryoem2.oncology.wisc.edu to command xingcryoem2.oncology.wisc.edu:39002
Connecting as unix user cryosparc_user
Will register using ssh string: cryosparc_user@xingcryoem2.oncology.wisc.edu
If this is incorrect, you should re-run this command with the flag --sshstr

Traceback (most recent call last):
  File "bin/connect.py", line 76, in <module>
    assert cli.test_connection(), "Could not connect to cryoSPARC master at %s:%d" % (master_hostname, command_core_port)
  File "/mnt/ssd/cryosparc_user/cryosparc2_worker/cryosparc_tools/cryosparc/command.py", line 112, in func
    assert "error" not in res, f'Error for "{key}" with params {params}:\n' + format_server_error(res["error"])
AssertionError: Error for "test_connection" with params ():
ServerError: Authentication failed - License-ID mismatch.
Please ensure cryosparc_master/config.sh and cryosparc_worker/config.sh have the same CRYOSPARC_LICENSE_ID entry
or CRYOSPARC_LICENSE_ID is set correctly in the current environment.
See CryoSPARC Architecture and System Requirements - CryoSPARC Guide for more details.
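A quick way to confirm a mismatch like this is to compare the CRYOSPARC_LICENSE_ID lines of the two config files. A hedged sketch — the files below are stand-ins created for illustration, not your real config.sh locations:

```shell
# Stand-in config files with placeholder IDs; in practice compare the real
# cryosparc_master/config.sh and cryosparc_worker/config.sh.
printf 'export CRYOSPARC_LICENSE_ID="aaaa-1111"\n' > /tmp/master_config_demo.sh
printf 'export CRYOSPARC_LICENSE_ID="bbbb-2222"\n' > /tmp/worker_config_demo.sh

m=$(grep CRYOSPARC_LICENSE_ID /tmp/master_config_demo.sh)
w=$(grep CRYOSPARC_LICENSE_ID /tmp/worker_config_demo.sh)
if [ "$m" = "$w" ]; then echo "license IDs match"; else echo "license IDs differ"; fi
```

If they differ, copy the master's CRYOSPARC_LICENSE_ID value into the worker's config.sh and re-run the connect command.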

And I still get no GPU when running CryoSPARC.

sorry this is the correct one

./bin/cryosparcw connect --worker xingcryoem2.oncology.wisc.edu --master xingcryoem2.oncology.wisc.edu --port 39000

CRYOSPARC CONNECT --------------------------------------------

Attempting to register worker xingcryoem2.oncology.wisc.edu to command xingcryoem2.oncology.wisc.edu:39002
Connecting as unix user cryosparc_user
Will register using ssh string: cryosparc_user@xingcryoem2.oncology.wisc.edu
If this is incorrect, you should re-run this command with the flag --sshstr

Connected to master.

Current connected workers:
xingcryoem2.oncology.wisc.edu

Autodetecting available GPUs…
Detected 4 CUDA devices.

id pci-bus name

   0      0000:1B:00.0  NVIDIA GeForce RTX 2080 Ti
   1      0000:3E:00.0  NVIDIA GeForce RTX 2080 Ti
   2      0000:88:00.0  NVIDIA GeForce RTX 2080 Ti
   3      0000:B1:00.0  NVIDIA GeForce RTX 2080 Ti

All devices will be enabled now.
This can be changed later using --update

Traceback (most recent call last):
  File "bin/connect.py", line 225, in <module>
    assert args.ssdpath is not None or args.nossd, "Either provide --ssdpath or --nossd"
AssertionError: Either provide --ssdpath or --nossd
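This assertion just means the connect command needs one more flag: either `--ssdpath` (pointing at the particle cache device) or `--nossd`. A sketch of the corrected invocation, assuming the cache location is the `/mnt/ssd` mount that appears elsewhere in this thread:

```shell
# Re-run connect with a cache flag; /mnt/ssd is assumed as the SSD cache path.
./bin/cryosparcw connect --worker xingcryoem2.oncology.wisc.edu \
                         --master xingcryoem2.oncology.wisc.edu \
                         --port 39000 \
                         --ssdpath /mnt/ssd
# For a worker without a cache SSD, pass --nossd instead of --ssdpath.
```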

When I run patched motion core I get

No GPU available.

A few questions:

  1. Are master and worker supposed to run on the same computer?
  2. Do you intend to use more than one worker computer?
  3. What is the output of
    cryosparcm cli "get_scheduler_targets()"?
  4. Do you have an ssd device that you wish to use for particle caching?
  1. Yes, the master is the worker.
  2. No, only one worker and it is the master with 4 GPUs.
  3. [cryosparc_user@xingcryoem2 cryosparc2_master]$ cryosparcm cli "get_scheduler_targets()"
     [{'cache_path': '/mnt/ssd', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11546394624, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'xingcryoem2.oncology.wisc.edu', 'lane': 'default', 'monitor_port': None, 'name': 'xingcryoem2.oncology.wisc.edu', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]}, 'ssh_str': 'cryosparc_user@xingcryoem2.oncology.wisc.edu', 'title': 'Worker node xingcryoem2.oncology.wisc.edu', 'type': 'node', 'worker_bin_path': '/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw'}]
  4. Yes, /mnt/ssd/

What are the outputs of

stat /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw
cat /mnt/ssd/cryosparc_user/cryosparc2_worker/version

[cryosparc_user@xingcryoem2 cryosparc2_worker]$ stat /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw
  File: '/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw'
  Size: 17575      Blocks: 40         IO Block: 4096   regular file
Device: fd02h/64770d    Inode: 10742089454  Links: 1
Access: (0775/-rwxrwxr-x)  Uid: ( 1001/cryosparc_user)   Gid: ( 1001/cryosparc_user)
Access: 2023-02-28 17:10:55.063198015 -0600
Modify: 2023-02-27 09:17:33.000000000 -0600
Change: 2023-02-27 17:08:52.609999192 -0600
 Birth: -
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ cat /mnt/ssd/cryosparc_user/cryosparc2_worker/version
v4.2.0


[cryosparc_user@xingcryoem2 cryosparc2_master]$ cryosparcm status

CryoSPARC System master node installed at
/mnt/ssd/cryosparc_user/cryosparc2_master
Current cryoSPARC version: v4.2.0

CryoSPARC process status:

app RUNNING pid 107728, uptime 1:18:22
app_api RUNNING pid 107751, uptime 1:18:20
app_api_dev STOPPED Not started
app_legacy STOPPED Not started
app_legacy_dev STOPPED Not started
command_core RUNNING pid 107627, uptime 1:18:36
command_rtp RUNNING pid 107690, uptime 1:18:27
command_vis RUNNING pid 107656, uptime 1:18:29
database RUNNING pid 107520, uptime 1:18:40


License is valid

global config variables:
export CRYOSPARC_LICENSE_ID="XXXXXXXXXXXX"
export CRYOSPARC_MASTER_HOSTNAME="xingcryoem2.oncology.wisc.edu"
export CRYOSPARC_DB_PATH="/mnt/ssd/cryosparc_user/cryosparc2_database"
export CRYOSPARC_BASE_PORT=39000
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false
export CRYOSPARC_HEARTBEAT_SECONDS=180
export CRYOSPARC_DISABLE_IMPORT_ON_MASTER=false

Did you try any GPU jobs with the current state of the CryoSPARC instance?

We tried Patch Motion Correction and that is where we get 'GPU not available'.

Please can you post the output of these commands:

nvidia-smi
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call which nvcc
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call nvcc --version
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call python -c "import pycuda.driver; print(pycuda.driver.get_version())"
/mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw gpulist
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ pwd
/mnt/ssd/cryosparc_user/cryosparc2_worker
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call which nvcc
/usr/local/cuda-11/bin/nvcc
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw call python -c "import pycuda.driver; print(pycuda.driver.get_version())"
(11, 5, 0)
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ /mnt/ssd/cryosparc_user/cryosparc2_worker/bin/cryosparcw gpulist
  Detected 4 CUDA devices.

   id           pci-bus  name
   ---------------------------------------------------------------
       0      0000:1B:00.0  NVIDIA GeForce RTX 2080 Ti
       1      0000:3E:00.0  NVIDIA GeForce RTX 2080 Ti
       2      0000:88:00.0  NVIDIA GeForce RTX 2080 Ti
       3      0000:B1:00.0  NVIDIA GeForce RTX 2080 Ti
   ---------------------------------------------------------------
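Those outputs are consistent: `nvcc` reports release 11.5 and pycuda reports `(11, 5, 0)`, so the worker appears to have been rebuilt against the intended toolkit. For scripting such checks, the release number can be pulled out of the `nvcc` banner; a small sketch using the banner line copied from the output above:

```shell
# nvcc banner line copied from the output above; extract the toolkit release
banner='Cuda compilation tools, release 11.5, V11.5.119'
release=$(printf '%s\n' "$banner" | sed -n 's/.*release \([0-9.]*\),.*/\1/p')
echo "$release"   # 11.5
```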
[cryosparc_user@xingcryoem2 cryosparc2_worker]$ nvidia-smi