New CryoSPARC Install and Administration

CLUSTER

Hello everyone! I am new to CryoSPARC as a Linux systems administrator (primarily on an HPC cluster). I have installed and configured the cryosparc_master and cryosparc_worker packages in my development environment, and everything seems to be working correctly. However, I want to make some configuration changes and check that I am doing this correctly (specifically for HPC).

  1. Where should master be running?
    1. Should my “master” process be running on a head node for the cluster?
    2. Or on a standalone server that only runs the “master” process?
  2. Multi-user HPC environment?
    1. I installed and configured everything as cryosparcuser; however, when CryoSPARC launches jobs, we would like each job to be submitted as the user who queued it (given that the web UI username == server username).
    2. This is so we can track usage and have our scheduler (SLURM) apply fairshare, QOS, and per-user tracking.
    3. I have attempted this by updating the cluster_info.json file so the submission command is su <username> -c 'sbatch {{ script_path_abs }}' (see the sketch after this list).
    4. This does not seem to play nicely (probably due to permissions).
    5. What is the recommended practice here?
  3. How do we open CryoSPARC at the user’s working directory?
    1. Should users launch cryosparcm themselves (each with their own LICENSE_ID)?
    2. Or should the local cryosparcuser keep the master process running at all times?
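
For reference, the attempt described in 2.3 was roughly the following change to the submission command template in cluster_info.json (I believe qsub_cmd_tmpl is the relevant key; <username> stands in for the submitting user and the rest of the file is unchanged):

    "qsub_cmd_tmpl": "su <username> -c 'sbatch {{ script_path_abs }}'",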

Forgive me for any stupid questions; I only started this project a few days ago and may have missed this in the documentation.

See below for template information:

$ uname -a
Linux wi-hpc-hn-dev01 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ free -g
              total        used        free      shared  buff/cache   available
Mem:             15           2           3           0           9          11
Swap:            15           0          15

$ eval $(/applications/cryosparc/cryosparc_worker/bin/cryosparcw env)
$ env | grep PATH
LD_LIBRARY_PATH=/cm/shared/apps/slurm/current/lib64/slurm:/cm/shared/apps/slurm/current/lib64:/cm/local/apps/gcc/11.2.0/lib:/cm/local/apps/gcc/11.2.0/lib64:/cm/local/apps/gcc/11.2.0/lib32
__LMOD_REF_COUNT_PATH=/cm/shared/apps/slurm/current/sbin:1;/cm/shared/apps/slurm/current/bin:1;/cm/local/apps/gcc/11.2.0/bin:1;/applications/cryosparc/cryosparc_master/bin:1;/home/cryosparcuser/.local/bin:1;/home/cryosparcuser/bin:1;/usr/local/bin:1;/usr/bin:1;/usr/local/sbin:1;/usr/sbin:1;/opt/dell/srvadmin/bin:1
__LMOD_SET_FPATH=1
FPATH=/usr/share/lmod/lmod/init/ksh_funcs
__LMOD_REF_COUNT_LD_LIBRARY_PATH=/cm/shared/apps/slurm/current/lib64/slurm:1;/cm/shared/apps/slurm/current/lib64:1;/cm/local/apps/gcc/11.2.0/lib:1;/cm/local/apps/gcc/11.2.0/lib64:1;/cm/local/apps/gcc/11.2.0/lib32:1
__LMOD_REF_COUNT_MODULEPATH=/cm/local/modulefiles:1;/etc/modulefiles:1;/usr/share/modulefiles:1;/usr/share/Modules/modulefiles:1;/cm/shared/modulefiles:2
CRYOSPARC_PATH=/applications/cryosparc/cryosparc_worker/bin
CPATH=/cm/shared/apps/slurm/current/include
__LMOD_REF_COUNT_LIBRARY_PATH=/cm/shared/apps/slurm/current/lib64/slurm:1;/cm/shared/apps/slurm/current/lib64:1
LIBRARY_PATH=/cm/shared/apps/slurm/current/lib64/slurm:/cm/shared/apps/slurm/current/lib64
__LMOD_REF_COUNT_MANPATH=/cm/shared/apps/slurm/current/man:1;/usr/share/lmod/lmod/share/man:1;/usr/local/share/man:1;/usr/share/man:1;/cm/local/apps/environment-modules/current/share/man:1
__LMOD_REF_COUNT_CPATH=/cm/shared/apps/slurm/current/include:1
PYTHONPATH=/applications/cryosparc/cryosparc_worker
MANPATH=/cm/shared/apps/slurm/current/man:/usr/share/lmod/lmod/share/man:/usr/local/share/man:/usr/share/man:/cm/local/apps/environment-modules/current/share/man
MODULEPATH=/cm/local/modulefiles:/etc/modulefiles:/usr/share/modulefiles:/usr/share/Modules/modulefiles:/cm/shared/modulefiles
MODULEPATH_ROOT=/usr/share/modulefiles
NUMBA_CUDA_INCLUDE_PATH=/applications/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/include
PATH=/applications/cryosparc/cryosparc_worker/bin:/applications/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/applications/cryosparc/cryosparc_worker/deps/anaconda/condabin:/cm/shared/apps/slurm/current/sbin:/cm/shared/apps/slurm/current/bin:/cm/local/apps/gcc/11.2.0/bin:/applications/cryosparc/cryosparc_master/bin:/home/cryosparcuser/.local/bin:/home/cryosparcuser/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/dell/srvadmin/bin

$ uname -a
Linux <hostname> 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ free -g
              total        used        free      shared  buff/cache   available
Mem:            503           2         416           0          83         497
Swap:            15           0          15
$ nvidia-smi
Mon Sep  9 15:47:03 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     On  |   00000000:17:00.0 Off |                    0 |
|  0%   23C    P8             23W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A40                     On  |   00000000:65:00.0 Off |                    0 |
|  0%   23C    P8             23W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A40                     On  |   00000000:CA:00.0 Off |                    0 |
|  0%   23C    P8             24W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A40                     On  |   00000000:E3:00.0 Off |                    0 |
|  0%   23C    P8             23W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Welcome to the forum @aharral.

Not necessarily. CryoSPARC master processes should run either on a submission host for the cluster or on a host that can run cluster job control commands (job submission, termination, queries) remotely via ssh.
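
If the master host is not itself a submission host, the command templates in cluster_info.json can wrap each cluster command in ssh. A minimal sketch, assuming a submission host named submit01, passwordless ssh to it as the account that owns the master processes, and a placeholder cache_path (check the cluster integration guide for the exact keys your CryoSPARC version expects):

{
    "name": "slurm-ssh",
    "worker_bin_path": "/applications/cryosparc/cryosparc_worker/bin/cryosparcw",
    "cache_path": "/scratch/cryosparc_cache",
    "send_cmd_tmpl": "ssh submit01 {{ command }}",
    "qsub_cmd_tmpl": "ssh submit01 sbatch {{ script_path_abs }}",
    "qstat_cmd_tmpl": "ssh submit01 squeue -j {{ cluster_job_id }}",
    "qdel_cmd_tmpl": "ssh submit01 scancel {{ cluster_job_id }}",
    "qinfo_cmd_tmpl": "ssh submit01 sinfo"
}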

CryoSPARC jobs should run under the Linux account that “owns” the CryoSPARC master processes. To run CryoSPARC jobs under different Linux accounts, you may want to set up multiple CryoSPARC master instances. Each CryoSPARC instance would require a unique license ID. Multiple instances can run on the same host provided that

  • their port ranges do not overlap
  • they have distinct database paths
  • no project directories are shared between instances
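
As a sketch of how those conditions translate into configuration, each instance’s cryosparc_master/config.sh would carry its own license ID, database path, and base port. The install prefixes, license IDs, and port numbers below are placeholders; CryoSPARC occupies a block of ports above the base port, so leave a generous gap between instances:

# instance A: /applications/cryosparc_a/cryosparc_master/config.sh
export CRYOSPARC_LICENSE_ID="<license-id-A>"
export CRYOSPARC_DB_PATH="/applications/cryosparc_a/cryosparc_database"
export CRYOSPARC_BASE_PORT=39000

# instance B: /applications/cryosparc_b/cryosparc_master/config.sh
export CRYOSPARC_LICENSE_ID="<license-id-B>"
export CRYOSPARC_DB_PATH="/applications/cryosparc_b/cryosparc_database"
export CRYOSPARC_BASE_PORT=39100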

You can track usage independently of the Linux account by using the CryoSPARC-provided cryosparc_username cluster template variable.
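
For example, the cluster submission script template (cluster_script.sh) can record the submitting CryoSPARC user in a field SLURM accounting already keeps, such as the job name or comment. A minimal sketch; verify the template variables against the example template shipped with your CryoSPARC version, and note that the job-name and comment conventions here are only suggestions:

#!/usr/bin/env bash
## Tag the job so squeue/sacct can attribute usage to the CryoSPARC web user,
## even though every job runs as the cryosparcuser Linux account.
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --comment=cryosparc_user={{ cryosparc_username }}
#SBATCH --cpus-per-task={{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --mem={{ (ram_gb*1000)|int }}MB
#SBATCH --output={{ job_dir_abs }}/slurm-%j.out

{{ run_cmd }}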

cryosparcm commands should be run under the Linux account that “owns” the respective CryoSPARC instance. You may also want to ensure that CryoSPARC master processes are shut down in an orderly fashion with the cryosparcm stop command when needed, rather than terminated abruptly.
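
For example, before maintenance or a reboot of the master host:

$ # run as the Linux account that owns the instance (cryosparcuser here)
$ cryosparcm status   # shows which master processes are running
$ cryosparcm stop     # orderly shutdown of the database, command, and web services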